Re: [xsl] 16-bit chars rendered as "?" in UTF-8?


Subject: Re: [xsl] 16-bit chars rendered as "?" in UTF-8?
From: John English <john.foreign@xxxxxxxxx>
Date: Sat, 01 Sep 2012 11:49:01 +0300

On 15/08/2012 12:03, John English wrote:
> I hate it when something fixes a problem and I don't understand
> what the original problem was or why the fix worked!

So in my spare time I've been nagging away at this problem even though
I've "fixed" it...

To summarise: I have a single XML document being served up via two
different routes, both of which use the same stylesheet and filter
code to transform it to HTML. The XML contains only 7-bit characters,
with 16-bit characters represented as entities. If the XML is sent as
ISO-8859-1, I get "?" for all 16-bit characters via one path but it
is rendered correctly via the other. Changing the XML encoding to UTF-8
"fixes" the problem (as in "makes it go away").
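
As an illustration of the symptom only (not a diagnosis of either route),
here is a minimal sketch of one way a "?" can appear: once a parser has
expanded an entity into a real character, writing that character out as
ISO-8859-1 without re-escaping it loses it. Both helper names are
hypothetical, and substituting "?" is an assumption about how the encoder
behaves (many do this, but not all).

```javascript
// Hypothetical helper: encode text as ISO-8859-1 byte values, substituting
// "?" (0x3F) for any character outside U+0000..U+00FF.
function encodeLatin1Lossy(text) {
  const bytes = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    bytes.push(cp <= 0xff ? cp : 0x3f);
  }
  return bytes;
}

// Hypothetical helper: same target encoding, but unmappable characters are
// written as numeric character references so that they survive.
function escapeForLatin1(text) {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    out += cp <= 0xff ? ch : "&#" + cp + ";";
  }
  return out;
}

// What a parser yields after expanding "&#x3A9; = ohm".
const parsed = "\u03A9 = ohm";

console.log(String.fromCharCode(...encodeLatin1Lossy(parsed))); // "? = ohm"
console.log(escapeForLatin1(parsed));                           // "&#937; = ohm"
```

With UTF-8 as the output encoding every character is representable, so
neither substitution nor re-escaping is needed, which would be consistent
with the "fix" described above.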

Now I have discovered something I'd overlooked before, but I still
don't see what's going on. The mechanism that fails involves loading
an HTML page that looks like this:

  <html>
  <head>
    <script language='JavaScript1.1' src='/scripts/base.js'/>
    <title>Please wait</title>
  </head>
  <body onLoad='setTimeout("checkLoad()",65000)'>
    <h1>Loading, please wait...</h1>
  </body>
  </html>

In due course the JavaScript loads the XML document by setting
location.href. This loading mechanism is the only significant
difference I can see between the two. I see that there is no
character encoding specified in the HTML, but if I stop the JS
function in the debugger and look at the HTML encoding, it
is UTF-8. Can the root of the problem be connected to replacing
a UTF-8 page with an ISO-8859-1 page via JavaScript?
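
One way to narrow this down might be to compare what each route actually
declares. The sketch below is only a starting point: the two helper names
are hypothetical, and it assumes the Content-Type header value and the
start of the XML text have been captured by some other means (for example
from the browser's network panel).

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value, or return null if none is given.
function charsetFromContentType(value) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(value || "");
  return match ? match[1].toUpperCase() : null;
}

// Hypothetical helper: extract the encoding from an XML declaration at the
// very start of the text, or return null if there is none.
function encodingFromXmlDecl(xmlText) {
  const match = /^<\?xml[^>]*\bencoding\s*=\s*["']([^"']+)["']/i.exec(xmlText);
  return match ? match[1].toUpperCase() : null;
}

// In a browser, the encoding the current page was decoded with can be
// logged just before location.href is changed.
if (typeof document !== "undefined") {
  console.log("page encoding:", document.characterSet);
}
```

If the header charset and the XML declaration disagree on one route but
not the other, that difference would be worth looking at first.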

Any advice gratefully accepted!

--
John English

