Re: [xsl] Missing byte-order mark problem
Subject: Re: [xsl] Missing byte-order mark problem
From: Mike Brown <mike@xxxxxxxx>
Date: Sun, 3 Aug 2003 16:30:58 -0600 (MDT)
Vivek Shinde wrote:
> For last two days I was struggling with a problem of applying a
> XSL stylesheet to XML that had Danish characters (using entities
> like ø etc.). The output=HTML was working fine but when I
> tried to get text output I kept getting "Missing byte-order mark".
> I tried it with encoding of UTF-8 as well as UTF-16, it did not work.
> Finally I found a listing on google from this group from way back
> in 2002 http://www.xslt.com/xsl-list/2002-02/msg00675.html and it
> suggested to use encoding="iso-8859-1" and walla...it worked.

Trial and error is not a very good way to go about document authoring or XSLT programming.

In the prolog of an XML document, encoding="iso-8859-1" is an assertion that the document's bytes map to Unicode characters according to the iso-8859-1 encoding. This declaration may be entirely false, as you may have saved the document in UTF-8 or UTF-16 or some other format. The XML spec requires it to be a truthful statement, though, so that an XML parser will know how to interpret the bytes.

In the xsl:output instruction element, encoding="iso-8859-1" is there to notify the XSLT processor that after it is done building the result tree, you would like it to be serialized as bytes according to the iso-8859-1 encoding.

"Missing byte-order mark" indicates that your XML parser is trying to read a document under the assumption that it is utf-16 encoded (1 or 2 pairs of bytes per character, plus a 2-byte sequence at the beginning of the document to indicate whether the low or high byte comes first in each pair), but you are in fact feeding it a document that is iso-8859-1 or windows-1252 (or any other non-BOM-using encoding) encoded.

Most likely the cause of this is that your XML prolog contains an encoding="utf-16" declaration (or you've somehow told the XML parser externally that the document is utf-16), when in fact the document is actually iso-8859-1 or windows-1252 encoded.
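To make the diagnosis above concrete, here is a minimal Python sketch that looks at the raw bytes of a document and reports whether an encoding="utf-16" declaration is backed by a byte-order mark. The function names declared_encoding and diagnose are hypothetical, chosen only for this illustration, and the check is a rough heuristic rather than what any particular XML parser does internally.

```python
import codecs
import re

# The two byte sequences a UTF-16 document is expected to start with.
UTF16_BOMS = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None.

    Hypothetical helper: it only works when the declaration itself is
    readable as ASCII, which holds for iso-8859-1, windows-1252 and UTF-8
    documents, but not for documents that really are UTF-16.
    """
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    match = re.match(
        rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']', data
    )
    return match.group(1).decode("ascii").lower() if match else None


def diagnose(data: bytes) -> str:
    """Describe whether the declaration and the leading bytes agree."""
    has_bom = data.startswith(UTF16_BOMS)
    declared = declared_encoding(data)
    if declared == "utf-16" and not has_bom:
        # The declaration was readable as ASCII, so the bytes are not UTF-16.
        return "declared utf-16 but no byte-order mark: declaration is probably false"
    if has_bom:
        return "starts with a UTF-16 byte-order mark"
    return "no UTF-16 BOM; declared encoding: " + (declared or "none")


if __name__ == "__main__":
    # A document saved as iso-8859-1 whose prolog falsely claims utf-16.
    doc = '<?xml version="1.0" encoding="utf-16"?><a>\xf8</a>'.encode("iso-8859-1")
    print(diagnose(doc))
```

Running a check like this against the file that triggers the error should show whether the prolog is telling the truth before any stylesheet settings are changed.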
-Mike

PS- It's "voilà" -- http://www.bartleby.com/61/81/V0138100.html

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list