Re: AW: [xsl] output encoding character map
Subject: Re: AW: [xsl] output encoding character map
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Fri, 03 Aug 2012 10:59:06 -0400
Patrick,
On 8/3/2012 4:40 AM, Szabo, Patrick (LNG-VIE) wrote:
Thank you for your very detailed answer. I have now found the problem. My entries looked like this:
<xsl:output-character character="ä" string="ä"/>
When I changed them to this:
<xsl:output-character character="ä" string="&#228;"/>
it worked. You are right: I am writing CDATA sections and using disable-output-escaping, but the text that is output there doesn't contain any characters outside my chosen encoding.
Now, since you said Saxon should replace the characters even without a character map, I ran the transformation without the map and it works as well... now I'm confused.
It's confusing until you set aside the circumstances that aren't actually relevant.
The rule is simply this: if, when writing a particular encoding, the serializer encounters a character that cannot be represented in that encoding, XML provides a handy way for it to represent the character anyway: the numeric character reference.
This happens to be the same thing your character map says to use, which is why the output looks the same.
But note that while Saxon will replace characters (outside the target encoding) with numeric character references, it won't replace them with just anything. If, for example, your character map said
<xsl:output-character character="ä" string="a-with-umlaut"/>
... you'll find that Saxon won't do that by itself (just as you'd expect). But your character map will.
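To see the character map doing something the serializer would never do on its own, here is a minimal sketch of a complete stylesheet, assuming an XSLT 2.0 processor. The map name "umlauts" and the identity template are illustrative choices, not taken from your stylesheet; the character is written as &#228; only to keep the file encoding-neutral, since the XML parser turns it into ä before the processor sees it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- UTF-8 can represent a-umlaut, so the serializer alone would leave it untouched.
       "umlauts" is a hypothetical map name. -->
  <xsl:output method="xml" encoding="UTF-8" use-character-maps="umlauts"/>

  <xsl:character-map name="umlauts">
    <!-- Substitute an arbitrary string for a-umlaut at serialization time -->
    <xsl:output-character character="&#228;" string="a-with-umlaut"/>
  </xsl:character-map>

  <!-- Identity template: copy the input through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against an input such as <p>Bär</p>, the text in the result should come out as Ba-with-umlautr, even though the target encoding could have carried the ä directly.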
One consequence is that a convenient way to get everything outside ASCII represented by numeric character references is to ask your processor to serialize with encoding="us-ascii". No character map is then necessary.
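As a sketch of that approach, assuming nothing beyond standard xsl:output (the identity template is again only an illustrative placeholder for your own templates):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No character map: with us-ascii as the target encoding, every
       non-ASCII character in text or attribute values has to be written
       as a numeric character reference. -->
  <xsl:output method="xml" encoding="us-ascii"/>

  <!-- Identity template: copy the input through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For an input such as <p>Bär</p>, the ä should be written as a numeric character reference. Whether it appears in decimal (&#228;) or hexadecimal (&#xE4;) form is up to the processor; the two are equivalent to any XML parser.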
I hope this helps.
Cheers, Wendell
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================