
[xsl] [OT] charset (was: how to get an NCR in the output?)


Subject: [xsl] [OT] charset (was: how to get an NCR in the output?)
From: Tobias Reif <tobiasreif@xxxxxxxxxxxxx>
Date: Sun, 05 Jan 2003 15:17:30 +0100

Julian Reschke wrote:

> Tobias,
>
> AFAIK, the default for content type "text/html" *is* ISO-8859-1.

I don't think that it's as simple as that: one IETF spec says "The default character set, which must be assumed in the absence of a charset parameter, is US-ASCII."

As I said, I'm sending XHTML as text/html (since it's "HTML compatible").
In this case, the IETF says the following about the charset parameter:

http://ietf.org/rfc/rfc2854.txt
The 'text/html' Media Type
"
 charset
         The optional parameter "charset" refers to the character
         encoding used to represent the HTML document as a sequence of
         bytes. Any registered IANA charset may be used, but UTF-8 is
         preferred.  Although this parameter is optional, it is strongly
         recommended that it always be present. See Section 6 below for
         a discussion of charset default rules.
[...]
6. Charset default rules

   The use of an explicit charset parameter is strongly recommended.
   While [MIME] specifies "The default character set, which must be
   assumed in the absence of a charset parameter, is US-ASCII."  [HTTP]
   Section 3.7.1, defines that "media subtypes of the 'text' type are
   defined to have a default charset value of 'ISO-8859-1'".  Section
   19.3 of [HTTP] gives additional guidelines.  Using an explicit
   charset parameter will help avoid confusion.

   Using an explicit charset parameter also takes into account that the
   overwhelming majority of deployed browsers are set to use something
   else than 'ISO-8859-1' as the default; the actual default is either a
   corporate character encoding or character encodings widely deployed
   in a certain national or regional community. For further
   considerations, please also see Section 5.2 of [HTML40].
"
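To make the conflict between the two defaults concrete, here is a minimal sketch of how a recipient might pick a charset from a Content-Type value. The function name effective_charset and the "mime"/"http" rule labels are made up for this illustration; the two default values are the ones quoted above from [MIME] and [HTTP]. As the RFC itself points out, deployed browsers often use neither.

```python
from email.message import Message

# Defaults as quoted in RFC 2854, Section 6 (labels are hypothetical).
DEFAULTS = {"mime": "us-ascii", "http": "iso-8859-1"}


def effective_charset(content_type, rules="http"):
    """Return the charset a recipient would assume for a Content-Type value.

    An explicit charset parameter always wins; otherwise 'text' types
    fall back to the default of the chosen rule set.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_content_charset()  # lower-cased, or None if absent
    if explicit is not None:
        return explicit
    if msg.get_content_maintype() == "text":
        return DEFAULTS[rules]
    return None  # no default defined here for non-text types


print(effective_charset("text/html; charset=UTF-8"))
print(effective_charset("text/html"))
print(effective_charset("text/html", rules="mime"))
```

The same header without a charset parameter yields two different answers depending on which spec the recipient follows, which is why sending the parameter explicitly is the safe choice.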

Personally, for XML sent as XML (e.g. SVG or XHTML), I think I'd prefer that the encoding declaration in the XML prolog, if present, always override the charset param, and that the charset param never be required; what should be required instead is the encoding="" in the XML prolog.
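A rough sketch of that preference, only to illustrate the precedence I have in mind; it is not a description of what any spec or browser actually does. The name preferred_encoding is hypothetical, the check only works for ASCII-compatible encodings (no BOM or UTF-16 detection), and falling back to UTF-8 when nothing is declared is an assumption of this sketch.

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the
# very start of the document (ASCII-compatible encodings only).
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?\sencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def preferred_encoding(document, charset_param=None):
    """Pick an encoding, letting the XML prolog override the charset param."""
    match = _XML_DECL.match(document)
    if match:
        # The prolog wins whenever it declares an encoding.
        return match.group(1).decode("ascii").lower()
    if charset_param:
        # Otherwise fall back to the transport-level parameter.
        return charset_param.lower()
    # Assumed fallback for this sketch when nothing is declared.
    return "utf-8"


doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><svg/>'
print(preferred_encoding(doc, charset_param="UTF-8"))
```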

Tobi

--

Vim users               donate.
http://iccf-holland.org/donate.html

Web developers           check.
http://www.pinkjuice.com/check/


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list



