Re: [xsl] How can I preserve ASCII Encoding Character Sets?


Subject: Re: [xsl] How can I preserve ASCII Encoding Character Sets?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 06 Nov 2012 22:21:29 +0000

On 06/11/2012 21:54, Philip Vallone wrote:
> Thank You Michael and Ken,
>
> Please excuse my lack of knowledge on numerical character references. I want to preserve &#160; or &#xA0; as &#160; or &#xA0;. Currently, as an example, &#160; will output into my resulting file as a space, but when the resulting xsl file is used to transform the xml file to FO it prints out a bad character "E". I have nailed down the issue to when I convert my stylesheet into one.


Encoding non-ASCII characters as character references can help to prevent transcoding errors like this, but it's better to find out why they are happening and to eliminate the root cause. What has happened is that your "resulting XSL" file contains the character coded in one encoding (say Windows-1252) but whoever is reading it thinks it is in a different encoding (say UTF-8). Often this happens because an XML file is written without an XML declaration to specify its encoding, and the recipient assumes UTF-8.
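
As a rough illustration of the mismatch described above, here is a small Python sketch. It is not taken from the poster's pipeline: the choice of Python, the variable names, and the pairing of UTF-8 with Windows-1252 are all assumptions made for the example. It only shows how one non-breaking space turns into different bytes, and what a reader sees when it guesses the wrong encoding.

```python
# U+00A0 NO-BREAK SPACE is the character that &#160; and &#xA0; refer to.
nbsp = "\u00a0"

# The same character is stored as different bytes in different encodings.
utf8_bytes = nbsp.encode("utf-8")           # two bytes: C2 A0
cp1252_bytes = nbsp.encode("windows-1252")  # one byte: A0

# Case 1: written as UTF-8, read as Windows-1252.
# The lead byte C2 shows up as an extra accented capital A before the space.
misread = utf8_bytes.decode("windows-1252")

# Case 2: written as Windows-1252, read as UTF-8.
# A lone A0 byte is not valid UTF-8, so a strict reader rejects it.
try:
    cp1252_bytes.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

print(repr(misread), strict_decode_failed)
```

Which of the two cases applies depends on how each stage of the pipeline writes and reads its files, which is why the details of the pipeline matter.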

Whether the fix is to use only ASCII characters in your file (by using character references) or to add an XML declaration to the file that specifies its correct encoding, we need to know a lot more detail about the precise processing pipeline in order to establish what you are doing wrong and how to correct it. Unfortunately this is an area where there are a zillion ways of configuring things wrong, and only one way of configuring them right.
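
The two candidate fixes can be sketched in Python with the standard library's ElementTree serializer. This is only an illustration under assumptions: the element name `p`, the sample text, and the use of ElementTree in place of an XSLT processor are all made up for the example and say nothing about how the real pipeline is configured.

```python
import xml.etree.ElementTree as ET

# A hypothetical one-element document containing a non-breaking space.
doc = ET.Element("p")
doc.text = "a\u00a0b"

# Option 1: serialize as ASCII only. Characters outside ASCII are written
# as numeric character references, so no reader can mis-transcode them.
ascii_bytes = ET.tostring(doc, encoding="us-ascii")

# Option 2: keep the raw character, but let the file declare its encoding.
# For an encoding other than UTF-8 or ASCII, ElementTree emits an XML
# declaration naming that encoding.
declared_bytes = ET.tostring(doc, encoding="windows-1252")

# Either way, a conforming XML parser recovers the same character.
from_ascii = ET.fromstring(ascii_bytes).text
from_declared = ET.fromstring(declared_bytes).text

print(ascii_bytes)
print(declared_bytes)
```

In both cases the parsed text should come back as the original string. The difference is only in what the bytes on disk look like and how much they depend on the reader honouring the declaration.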

Michael Kay
Saxonica