RE: [xsl] encoding woes: ISO-8859-1 vs. UTF-8
Subject: RE: [xsl] encoding woes: ISO-8859-1 vs. UTF-8
From: Xiaocun Xu <xiaocunxu@xxxxxxxxx>
Date: Tue, 23 Jul 2002 06:42:14 -0700 (PDT)
Hi, Michael:

Thanks for your reply.

> Perhaps you were using some proprietary Microsoft 8-bit encoding that
> includes these two characters?
>
> Rather than showing us what the CSV file looks like on your screen
> (which depends entirely on the software used to display it) it might
> help to show us what it looks like in hex.

Good idea. I opened it in TextPad in hex mode; those two characters in the CSV are hex 93 and 94, respectively. I checked the code page: both are in the C1 Controls code page, where hex 93 is SET TRANSMIT STATE and hex 94 is CANCEL CHARACTER.

> You can't be using ISO-8859-1 to encode the characters “ and ”
>
> ISO-8859-1 can only encode the characters in the range 0-255.

That's what I thought as well. How did Saxon convert those two control characters into the proper encoding for “ and ”, even though the input XML was marked as encoded in ISO-8859-1? I was fully expecting the import to fail, but somehow it was successful.

> > B. Export into CSV
> > 1. pull from MSSQL7 to proprietary XML: "oLogo"
> > 2. saxon convert proprietary XML to CSV: exception
> > org.xml.sax.SAXException: Output character not
> > available in this encoding (decimal 8220)
> > Why going one way it works and not the other?
>
> When you use
>
> <xsl:output method="text" encoding="iso-8859-1"/>
>
> you can only output the characters available in iso-8859-1, namely the
> XML characters in the range 0-255.

Good point. For the export output, I changed the encoding to UTF-8; that seems to have resolved the problem, and the export is now successful. Opening the exported CSV in a hex editor, those two characters are shown as hex 93/94, respectively.

Thanks,
Xiaocun

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
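
The byte-level behaviour discussed in this message can be reproduced outside Saxon. The following is only a minimal Python sketch, not what Saxon does internally. It assumes, following Michael's suggestion of a proprietary Microsoft 8-bit encoding, that the CSV was actually written in Windows-1252 (Python codec name `cp1252`); that is a guess, not something confirmed in the thread. The helper `can_encode` is a hypothetical name introduced for illustration.

```python
# The two bytes seen in the hex editor.
raw = bytes([0x93, 0x94])

# Read as ISO-8859-1, each byte maps to the code point of the same value,
# which here lands in the C1 control range (U+0080 to U+009F).
as_latin1 = raw.decode("iso-8859-1")
print([f"U+{ord(c):04X}" for c in as_latin1])

# Read as Windows-1252 (an assumption about the real source encoding),
# the same bytes map to the curly double quotes U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")
print([f"U+{ord(c):04X}" for c in as_cp1252])


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# U+201C is decimal 8220, the value named in the SAXException message.
print(ord(as_cp1252[0]))

# The curly quotes lie outside 0-255, so ISO-8859-1 output cannot hold them,
# while UTF-8 can represent any Unicode character.
print(can_encode(as_cp1252, "iso-8859-1"))
print(can_encode(as_cp1252, "utf-8"))

# Byte sequence that UTF-8 uses for the two curly quotes.
print(as_cp1252.encode("utf-8").hex(" "))
```

Under that assumption, the sketch shows why output declared as iso-8859-1 rejects character 8220 while output declared as UTF-8 accepts it: the curly quotes simply have no slot in the 0-255 range.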