Re: [xsl] MSXML - Processing non standard characters


Subject: Re: [xsl] MSXML - Processing non standard characters
From: "Michael Beddow" <mbnospam@xxxxxxxxxxx>
Date: Thu, 2 Aug 2001 10:35:53 +0100

On Thursday, August 02, 2001 5:41 AM
Warren Keane wrote:

[..]

> the offending character is the "1/2" at column 386. Is
> "ISO-8859-1" the proper encoding to remedy this situation?

Declaring the encoding of that and similar documents to be ISO-8859-1 is
indeed what you need. But note that all you are doing there is telling
the processor what the encoding already is; you aren't altering the
encoding itself. Without that encoding declaration, the processor
assumes UTF-8 and so cannot parse the file, because a stand-alone
decimal 189 (the representation of VULGAR FRACTION ONE HALF in
ISO-8859-1) is an illegal byte in a UTF-8 encoded document. The
encoding of that same abstract character in UTF-8 would be the two-byte
sequence (decimal) 194, 189. This bewilders people who read UTF-8 as
ISO-8859-1: their "half" apparently shows up OK, but they wonder where
the "garbage" capital A circumflex in front of it has come from (and
generally ask here about it). I mention this in case you accidentally
end up with UTF-8 output and get bitten at the other end, as it were.
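
For anyone who wants to see the byte values for themselves, here is a
minimal sketch in Python. It is only an illustration: it uses Python's
built-in codecs and its bundled expat-based parser, not MSXML, and the
<price> element and all variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

half = "\u00bd"  # VULGAR FRACTION ONE HALF

# The same abstract character in the two encodings discussed above.
latin1_bytes = half.encode("iso-8859-1")
utf8_bytes = half.encode("utf-8")
print(list(latin1_bytes))  # [189]
print(list(utf8_bytes))    # [194, 189]

# A stand-alone byte 189 is not legal UTF-8.
try:
    latin1_bytes.decode("utf-8")
    lone_byte_is_valid_utf8 = True
except UnicodeDecodeError:
    lone_byte_is_valid_utf8 = False

# Reading the UTF-8 bytes as ISO-8859-1 yields A circumflex plus "half".
misread = utf8_bytes.decode("iso-8859-1")

# Hypothetical ISO-8859-1 document, without and with an encoding declaration.
doc_without_decl = b"<price>1\xbd</price>"
doc_with_decl = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?>'
    b"<price>1\xbd</price>"
)

# Without the declaration the parser assumes UTF-8 and rejects the byte.
try:
    ET.fromstring(doc_without_decl)
    without_decl_parses = True
except ET.ParseError:
    without_decl_parses = False

# With the declaration the same bytes parse and give the expected text.
with_decl_text = ET.fromstring(doc_with_decl).text
```

Note that the bytes of the document are identical in both parse
attempts; only the declaration differs, which is the point made above
about telling the processor what the encoding already is.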

Michael
---------------------------------------------------------
Michael Beddow   http://www.mbeddow.net/
XML and the Humanities page:  http://xml.lexilog.org.uk/
---------------------------------------------------------



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


