retain unicode characters such as &#x002d; in xml to xml tra

sderrick
Posts: 264
Joined: Sat Jul 10, 2010 4:03 pm

retain unicode characters such as &#x002d; in xml to xml tra

Post by sderrick »

I have some XML files that I need to run an XSL script on. The XML files contain numerous (thousands of) Unicode characters specified as character references. Some of them are in the standard ASCII range and some are not.

I added <xsl:output encoding="US-ASCII"/>

to my script, which kept all of them except x002d, since that one is in the US-ASCII set.

How can I retain the thousand or so &#x002d; declarations when I run the script?

Also, when the script runs, the retained Unicode characters are output as decimal instead of hex. Is there any way to stop it from doing that too? Otherwise I have to go in and search and replace the 15-odd numbers it is munging. This is not as important as the first request to retain the &#x002d; character, but it would be nice.

Scott

Re: retain unicode characters such as &#x002d; in xml to xml tra

Post by sderrick »

I hacked it by changing all x002d chars to x202d, and then the parser didn't replace them with a keyboard minus sign...

Scott
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: retain unicode characters such as &#x002d; in xml to xml tra

Post by george »

With XSLT 2.0 you can use character maps:

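<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <!-- Map the hyphen to the literal text of its character reference.
         The &amp; is required: without it the stylesheet parser resolves the
         reference and the map would just replace "-" with "-". -->
    <xsl:character-map name="minus">
        <xsl:output-character character="&#x002d;" string="&amp;#x002d;"/>
    </xsl:character-map>
    <xsl:output use-character-maps="minus"/>
    <xsl:template match="/">
        <result>&#x002d;</result>
    </xsl:template>
</xsl:stylesheet>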
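
The same idea can probably be extended to the non-ASCII characters that come out as decimal references. Below is a minimal sketch, not a tested drop-in: the map name keep-hex and the entries for x2013 and x00e9 are hypothetical placeholders for whichever characters the documents actually contain, and the identity template is assumed only so that the sketch copies its input unchanged. Note that the XML parser resolves &#x002d; to a plain hyphen before the stylesheet sees it, so the map cannot tell the two apart, and every hyphen in text and attribute values will be written as &#x002d;.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

    <!-- Each string value is written to the output verbatim, bypassing the
         serializer's own escaping. The &amp; keeps the stylesheet parser from
         resolving the reference before the processor sees it. -->
    <xsl:character-map name="keep-hex">
        <!-- The hyphen from the original question -->
        <xsl:output-character character="&#x002d;" string="&amp;#x002d;"/>
        <!-- Hypothetical entries: replace with the characters actually used -->
        <xsl:output-character character="&#x2013;" string="&amp;#x2013;"/>
        <xsl:output-character character="&#x00e9;" string="&amp;#x00e9;"/>
    </xsl:character-map>

    <!-- US-ASCII output as in the original script, plus the character map -->
    <xsl:output method="xml" encoding="US-ASCII" use-character-maps="keep-hex"/>

    <!-- Identity template (an assumption): copy the input unchanged -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Characters without an entry in the map still fall back to the serializer's default, which for US-ASCII output is typically a decimal reference, so the map needs one entry per character that should stay in hex.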
Best Regards,
George
George Cristian Bina

Re: retain unicode characters such as &#x002d; in xml to xml tra

Post by sderrick »

That's a much better solution, George. I will use it! Thanks.

Scott