Re: [xsl] 16-bit chars rendered as "?" in UTF-8?

Subject: Re: [xsl] 16-bit chars rendered as "?" in UTF-8?
From: John English <john.foreign@xxxxxxxxx>
Date: Tue, 14 Aug 2012 14:24:57 +0300

On 13/08/2012 14:19, David Carlisle wrote:
Most likely reason is that either your input document or your result
document are being served with the wrong encoding. (ie the encoding in
the http header does not match the encoding in the file)

Many thanks for this tip. The input was indeed ISO-8859-1 while the output was UTF-8. Changing the input encoding to UTF-8 fixed the problem. However, I still don't quite understand why this caused a problem, and if you have the time I'd be grateful for a brief explanation suited to a bear of very little brain...

A single piece of code loads a single stylesheet which is used to
transform the input. In both cases the input was encoded as ISO-8859-1,
with entities "&#nnnn;" representing the 16-bit characters using 8-bit
characters only. In both cases the output is UTF-8 (as defined in the
stylesheet) but in one case the entities are transformed into the
corresponding 16-bit characters "W" and so on, while in the other case
they are transformed into question marks "?", character 0x3F. What I
don't understand is why this should happen when both cases are dealt
with by the same code and stylesheet?
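
For illustration only, here is a minimal Python sketch of one way a "?" (0x3F) can arise. It is not a diagnosis of my setup. It assumes a hypothetical one-element input and an arbitrarily chosen character, U+0141. The parser turns the "&#nnnn;" reference into a real character whatever the declared input encoding is. That character survives being written out as UTF-8, but a step that encodes it as ISO-8859-1 with replacement has nothing to map it to and substitutes "?".

```python
import xml.etree.ElementTree as ET

# Hypothetical input: declared as ISO-8859-1, with U+0141 written as a
# numeric character reference so the file contains only 8-bit bytes.
source = b'<?xml version="1.0" encoding="ISO-8859-1"?><p>&#321;</p>'

# The parser resolves the reference to the actual character.
text = ET.fromstring(source).text

# Serialising as UTF-8 keeps the character (two bytes).
utf8_bytes = text.encode("utf-8")

# ISO-8859-1 has no code for U+0141, so encoding with replacement
# substitutes a question mark (0x3F).
latin1_bytes = text.encode("iso-8859-1", errors="replace")

print(repr(text), utf8_bytes, latin1_bytes)
```

If something between the transformation and the browser re-encodes or relabels the result as ISO-8859-1, the same substitution could happen even though the stylesheet asks for UTF-8. Whether that is what happened here is an assumption on my part.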

Again, many thanks,
John English
