Re: Special entity characters in Shift-JIS XSL.
Subject: Re: Special entity characters in Shift-JIS XSL.
From: "Nikolai Grigoriev" <grig@xxxxxxx>
Date: Fri, 17 Dec 1999 04:39:41 +0300
David Carlisle wrote:
>which spec? there is nothing that could be put into the xsl spec, as
>what you are asking for is a change in XML 1.0, this is why your
>suggested markup of using &# syntax will always be fragile and flaky.
>As soon as your documents are touched by any xml parser the characters
>may (or may not) be written out as character data in the document
>encoding rather than as character references, since the xml spec makes
>it explicit that these are equivalent when used in element character
>data.

I have much the same problems with Russian texts as Sean O'Dell has with Shift-JIS. For Russian, there exist two major 8-bit encoding schemes plus two minor ones; UTF-8 is scarcely used because it doubles the length of the text. Sure enough, none of the 8-bit Russian encodings is supported by currently available XSLT processors.

Well, I can change the encoding declaration to ISO-8859-1 and let the whole text be parsed without errors. But outputting the processing results as UTF-8 is disastrous: what I get is "KOI8-R converted to UTF-8 as if it were Latin-1", too strong for poor me. I admire James Clark's XT, but I can hardly use it for Russian, because there's no way to make it output anything but UTF-8. Fortunately, there is SAXON, which supports Latin-1 in the output and lets me pass my weird letters through ;-); thanks to Mike Kay!

I think a universal solution would be proper support for US-ASCII as an output encoding. This would write every character outside the 7-bit range as a numeric character reference, which is exactly what Sean needs. This is often the preferred solution for non-Latin-1 encodings that are unlikely to be supported by common tools in the near future. It's a pity that the XML spec does not enforce this as a conformance criterion.

SAXON kinda does it: it issues a message that the US-ASCII encoding is not supported and threatens to switch to UTF-8, but still prints all special characters as numeric character references. Thanks again Mike ;-).
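
To make the two effects described above concrete, here is a minimal Python sketch. It is not how XT or SAXON work internally; it only imitates the byte-level behaviour with Python's standard codecs. The function names and the sample word are made up for illustration, and `html.unescape` merely stands in for an XML parser resolving the character references.

```python
import html


def misdeclare_as_latin1(text, real_encoding="koi8_r"):
    """Imitate parsing 8-bit Russian text that is declared as ISO-8859-1.

    Returns the original bytes and the string a parser would build from them.
    """
    raw = text.encode(real_encoding)   # what is actually stored in the file
    misread = raw.decode("latin-1")    # each byte becomes one Latin-1 character
    return raw, misread


def to_ascii_with_char_refs(text):
    """Serialize as US-ASCII, writing every non-ASCII character as &#NNNN;."""
    return text.encode("ascii", errors="xmlcharrefreplace")


sample = "Привет"  # hypothetical sample text
raw, misread = misdeclare_as_latin1(sample)

# UTF-8 output of the misread text is not UTF-8 Russian: it is
# "KOI8-R converted to UTF-8 as if it were Latin-1".
assert misread.encode("utf-8") != sample.encode("utf-8")

# Latin-1 output reproduces the original KOI8-R bytes, so the letters pass through.
assert misread.encode("latin-1") == raw

# US-ASCII output with numeric character references survives any 8-bit
# encoding confusion and resolves back to the original text.
ascii_out = to_ascii_with_char_refs(sample)
print(ascii_out.decode("ascii"))
assert html.unescape(ascii_out.decode("ascii")) == sample
```

The Latin-1 round trip works only because every byte value maps to exactly one Latin-1 character, which is why passing the text through a Latin-1 output leaves the original bytes intact, while any other output encoding changes them.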

Regards,
Nikolai

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list