
RE: ISO-8859-1 encoding and XmlDecl omission (was Re: [xsl] Looking up keys in a separate xml file)


Subject: RE: ISO-8859-1 encoding and XmlDecl omission (was Re: [xsl] Looking up keys in a separate xml file)
From: Rowland Shaw <Rowland.Shaw@xxxxxxxxxxxxxxxxxxx>
Date: Tue, 6 Jan 2004 16:55:51 -0000

Actually, you're both wrong: neither is a subset of the other.

They do, however, share a common subset (the 7-bit ASCII characters, which
are encoded as identical bytes in both), and every character representable
in ISO-8859-1 has a representation in UTF-8 (which is technically a
transformation format of UCS, a.k.a. Unicode), but not vice versa.
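
To make the "neither is a subset" point concrete, here is a minimal Python
sketch. The helper names (is_valid_utf8, fits_latin1) are made up for this
illustration; only the standard str.encode / bytes.decode calls are real.

```python
# Compare how the same characters are encoded in ISO-8859-1 and UTF-8.
ascii_text = "key"
accented = "é"  # U+00E9, present in ISO-8859-1

# 7-bit (ASCII) characters produce identical bytes in both encodings.
same_ascii_bytes = ascii_text.encode("iso-8859-1") == ascii_text.encode("utf-8")

# Above 0x7F the two encodings diverge: one byte versus two.
latin1_bytes = accented.encode("iso-8859-1")
utf8_bytes = accented.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def fits_latin1(text: str) -> bool:
    """Hypothetical helper: True if every character exists in ISO-8859-1."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


# Latin-1 bytes for "é" are not valid UTF-8, so Latin-1 is not a byte-level
# subset of UTF-8; "€" has no Latin-1 form, so the reverse fails as well.
latin1_is_valid_utf8 = is_valid_utf8(latin1_bytes)
euro_fits_latin1 = fits_latin1("€")
```

The two failures at the end are the whole argument: bytes that are fine in
one encoding are garbage in the other, in both directions, once you leave
the 7-bit range.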

Problems come about when the encoding information is misreported or lost. One
way to lose it is to use MSXML's "transformNode" method, which returns a
string, and then response.write() the result: the transform returns a
UCS-2 (UTF-16) string, including a <meta> declaring itself as UTF-16, and
response.write outputs it using the session's codepage, which usually
doesn't match. In this case you should use transformNodeToObject() instead,
and make sure your session's codepage matches the encoding in <xsl:output>
and other friends.
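
The mismatch can be imitated without MSXML or ASP at all. The Python sketch
below is only a stand-in: "doc" plays the string a transform-to-string call
hands back, encoding it as cp1252 plays a write under a Latin-1 style session
codepage, and declared_encoding is a hypothetical helper, not part of any
real API.

```python
import re


def declared_encoding(xml_text: str) -> str:
    """Hypothetical helper: encoding named in the XML declaration, else utf-8."""
    match = re.match(r'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', xml_text)
    return match.group(1) if match else "utf-8"


# Stand-in for the transform result: the text labels itself as UTF-16.
doc = '<?xml version="1.0" encoding="UTF-16"?><name>José</name>'

# Stand-in for writing the string out under a single-byte codepage.
wire_bytes = doc.encode("cp1252")

# A consumer that trusts the label decodes the bytes with the wrong codec.
label = declared_encoding(doc)
garbled = wire_bytes.decode(label, errors="replace")

# A consumer that somehow knows the real codepage recovers the text.
honest = wire_bytes.decode("cp1252")
```

The bytes on the wire are perfectly good cp1252; it is only the label
travelling with them that is wrong, which is why keeping the declared and
actual encodings in step matters more than which encoding you pick.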


-----Original Message-----
From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
[mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of Sergio Rodriguez
Sent: 06 January 2004 16:15
To: 'John Meyer '
Cc: 'xsl-list@xxxxxxxxxxxxxxxxxxxxxx '
Subject: RE: ISO-8859-1 encoding and XmlDecl omission (was Re: [xsl] Looking
up keys in a separate xml file)

Hi.

> John Meyer wrote at 06/01/2004 09:36 a.m.:
>I'm not sure I understand why 
>
><xsl:output method='xml' version='1.0' encoding='iso-8859-1'
>	omit-xml-declaration="yes" indent="yes"/>
>
>is invalid. ISO-8859-1 is a subset of UTF-8 and should cause no problems
>since most parsers default to UTF-8 if the XML declaration is omitted.

To my knowledge, that is incorrect.  Latin1 isn't a subset of UTF-8; on
the contrary, the opposite is true:  UTF-8 is a subset of Latin1.  If you
want to generate an XML document (result tree) that has *support* for Latin1
characters such as á, é, í, ü, ë, etc., you have to generate the XML
declaration with Latin1 encoding (ISO-8859-1); otherwise the result would not
be well-formed XML, just as David said:

...

>Note however that:
> 
> <xsl:output method='xml' version='1.0' encoding='iso-8859-1'
> omit-xml-declaration="yes" indent="yes"/>

>has incompatible options: you can't have latin1 output and omit the xml
>declaration (otherwise the result would not be well-formed XML).
>The XSLT engine will be forced to ignore omit-xml-declaration="yes"
>here.
>
>David
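
David's point can be checked with any conforming parser. A minimal Python
sketch, using the standard xml.etree.ElementTree module; the sample element
and the "parses" helper are made up for the illustration:

```python
import xml.etree.ElementTree as ET

# The same Latin-1 encoded body, with and without an XML declaration.
body = "<name>José</name>".encode("iso-8859-1")
declaration = b'<?xml version="1.0" encoding="iso-8859-1"?>'


def parses(data: bytes) -> bool:
    """Hypothetical helper: True if the parser accepts the bytes as XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# With the declaration the parser knows the bytes are Latin-1.
with_declaration = parses(declaration + body)

# Without it the parser assumes UTF-8 and rejects the lone 0xE9 byte.
without_declaration = parses(body)

parsed_name = ET.fromstring(declaration + body).text
```

Dropping the declaration does not change a single byte of the content; it
only removes the information the parser needs to read those bytes.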

Cheers,

Sergio Rodríguez.

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
