Re: [xsl] CJK UTF-16 test
Subject: Re: [xsl] CJK UTF-16 test
From: David_N_Bertoni@xxxxxxxxx
Date: Thu, 29 Mar 2001 11:29:48 -0500
> On Wed, 28 Mar 2001, David Carlisle wrote:
>
> > > as I don't have any parser that will swallow UTF-16.
> >
> > utf-16 support is _mandated_ by the XML spec. If you have anything that
> > calls itself an XML parser it must be able to read utf-16.
>
> XML does NOT support UTF-16 since UTF-16 includes the surrogates - that is
> in fact what *distinguishes* it from UCS-2. That the XML 1.0 spec ('scuse
> me, 'Recommendation') *says* that it requires support for UTF-16 is in
> fact an error in the text since it explicitly forbids surrogates (aka
> UTF-16) in the allowed char range spec. It is like saying 'We require
> Japanese support, except you can't use *any* Japanese.' It's a nonsense
> statement.
>
> "Character Range
>
> [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
>              [#x10000-#x10FFFF]
>
> /* any Unicode character, excluding the surrogate blocks, FFFE,
> and FFFF. */
>
> The mechanism for encoding character code points into bit
> patterns may vary from entity to entity. All XML processors must accept
> the UTF-8 and UTF-16 encodings of 10646;
>                  ^
>                  |
>                  The Error. What it actually requires is a
>                  specified subset of UTF-8 and UCS-2 encodings.
>

You're confusing Unicode characters with how those characters are encoded.
UTF-16 uses surrogates, which are introduced by a value that is not a valid
Unicode character. However, taken together, the pair represents a valid
Unicode character. The character range refers to Unicode characters, not
their values in any given encoding.

> --
> Benjamin Franz
>
> "Real programmers can write assembly code in any language."
>                                                 -- Larry Wall

Dave

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
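
To make the distinction in the reply concrete, here is a minimal Python sketch. It is not taken from the thread or from any parser; the function names `is_xml_char` and `to_utf16_units` are made up for illustration, and U+20000 is just an arbitrarily chosen CJK code point outside the 16-bit range. It checks a code point against the Char production [2] quoted above, then shows the UTF-16 code units used to encode it.

```python
from typing import List


def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production [2]."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def to_utf16_units(cp: int) -> List[int]:
    """Encode one code point as a list of 16-bit UTF-16 code units."""
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate values are not characters")
    if cp < 0x10000:
        return [cp]  # fits in a single code unit
    v = cp - 0x10000  # 20-bit value split across the pair
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


cp = 0x20000  # a CJK ideograph outside the 16-bit range (illustrative choice)
units = to_utf16_units(cp)

# The character itself is in the allowed range...
print(is_xml_char(cp))
# ...while the individual code units that encode it are not characters at all.
print([f"{u:04X}" for u in units], [is_xml_char(u) for u in units])
```

The Char production is applied to `cp`, the character; the surrogate values only ever appear in `units`, the encoded form, so the two requirements in the Recommendation do not collide.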