[xsl] Combining use-character-maps and normalization-form="NFC" attributes produce unwanted output


From: "lancelot.meurillon@xxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 12 Feb 2016 14:28:55 -0000

XSL processor : Saxon-EE 9.5.1.8J from Saxonica
XSL version : 2.0

Dear all,

For various reasons, I need to escape specific characters in the output and
also need to produce Unicode normalised to NFC.
Here is my input:
<inputText>”; ;</inputText>  => which is \u201D + \u003B + \u0020 + \u003B
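
For anyone reproducing this, the same input can be written with a numeric character reference, so that the right double quotation mark (\u201D) does not depend on how the mail or the file is encoded. This is only a sketch of an equivalent input document, assuming inputText is the document element:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- &#8221; is U+201D, followed by U+003B, U+0020, U+003B -->
<inputText>&#8221;; ;</inputText>
```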

Here are the output properties of my stylesheet:
<xsl:output method="xml" version="1.0" encoding="UTF-8"
    indent="yes" omit-xml-declaration="no"
    use-character-maps="unsupported_characters"
    normalization-form="NFC"/>

The character-map definition:
<xsl:character-map name="unsupported_characters">
    <xsl:output-character character="&#8220;" string="&quot;"/>
    <xsl:output-character character="&#8221;" string="&quot;"/>
</xsl:character-map>

With this template:
<xsl:template match="/">
    <shortDescription><xsl:value-of select="inputText"/></shortDescription>
</xsl:template>

Now the output:
<shortDescription>"; ;</shortDescription> => which is \u0022 + \u037E +
\u0020 + \u003B

Why is the semicolon (\u003B) immediately after the escaped quote turned into
a Greek question mark (\u037E), while the next semicolon is left unchanged?
More fundamentally, why is a semicolon turned into a Greek question mark at
all?

To narrow it down:
1- If I do not use the character map, the result is:
<shortDescription>”; ;</shortDescription> => which is \u201D + \u003B +
\u0020 + \u003B

2- If I do not normalise the Unicode (no normalization-form="NFC"
attribute), the result is:
<shortDescription>"; ;</shortDescription> => which is \u0022 + \u003B + \u0020
+ \u003B
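
One workaround that might be worth trying is to replace the quotation marks inside the stylesheet with translate() and drop use-character-maps, so that the serializer only has to normalise. The sketch below is untested against Saxon-EE 9.5.1.8J and does not explain the behaviour above. It assumes that a plain " in the text content is acceptable, which is what the character map produces anyway:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- Same output properties as before, minus use-character-maps -->
    <xsl:output method="xml" version="1.0" encoding="UTF-8"
        indent="yes" omit-xml-declaration="no"
        normalization-form="NFC"/>

    <xsl:template match="/">
        <shortDescription>
            <!-- Map U+201C and U+201D to U+0022 before serialization -->
            <xsl:value-of
                select="translate(inputText, '&#8220;&#8221;', '&quot;&quot;')"/>
        </shortDescription>
    </xsl:template>

</xsl:stylesheet>
```

The trade-off is that the replacement has to be applied in every template that emits such text, whereas a character map applies to the whole serialized result.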

Thanks for the help
Lancelot

