
Re: [xsl] Sorting Upper-Case first. Microsoft bug?

Subject: Re: [xsl] Sorting Upper-Case first. Microsoft bug?
From: "W. Eliot Kimber" <eliot@xxxxxxxxxx>
Date: Sat, 09 Aug 2003 20:38:47 -0400

David Carlisle wrote:

>> Dr. Johnson and every lexicographer since has used case as the least significant, most rapidly varying element in ordering. The example I have in front of me from the Concise Oxford Dictionary lists daily - Dalmatian - dalmatic and I would not expect it to do anything else.
>
> Dictionaries are not really a good example to follow here as they don't
> have to deal with all strings, it probably doesn't list
> DAILY or dalmatioN at all, but xsl:sort has to deal with these things.

I haven't seen anyone mention that, in the general case, no XSLT implementation can define the appropriate collation rules for all possible uses of sort--the variance even within a single language is too great, as evidenced by, for example, the discussion of back-of-the-book index sorting in the _Chicago Manual of Style_. In addition, the Unicode standard is very clear that the ordering of characters in the Unicode character set does not define the collation sequence for any language or writing system. While most alphabetic languages have a natural or default collation order, syllabic and ideographic languages mostly do not.
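A minimal Java sketch of the difference between raw character order and a language-aware collation, using the dictionary words quoted above. The class and method names are made up for illustration; `java.text.Collator` is the standard JDK class. Java's natural `String` order compares UTF-16 code units, which for these ASCII words is the same as Unicode code point order.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class CodePointVsCollator {

    // Sort by raw character value: every upper-case ASCII letter
    // comes before every lower-case one.
    static List<String> byCodePoint(List<String> words) {
        List<String> out = new ArrayList<>(words);
        out.sort(Comparator.naturalOrder());
        return out;
    }

    // Sort with the JDK's English collator: letters are compared first,
    // and case only matters as a later tie-breaker.
    static List<String> byEnglishCollator(List<String> words) {
        List<String> out = new ArrayList<>(words);
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        out.sort(collator);
        return out;
    }

    public static void main(String[] args) {
        List<String> words = List.of("daily", "Dalmatian", "dalmatic");
        System.out.println(byCodePoint(words));       // [Dalmatian, daily, dalmatic]
        System.out.println(byEnglishCollator(words)); // [daily, Dalmatian, dalmatic]
    }
}
```

The second result matches the dictionary order cited in the quoted message; the first is what falls out of character codes alone.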

For example, Simplified Chinese is collated in terms of its pinyin transliteration. That is, a character transliterated as "pi" would sort under "p". But there is no universal agreement about what the transliteration of every character is--some authorities might transliterate "pi" as "bi", for example.
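A rough Java sketch of why the choice of transliteration authority changes the sort order. The two lookup tables below are hypothetical and contain only three characters; they stand in for whatever reference a real application would use, and tone marks are ignored for simplicity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class TransliterationSort {

    // Hypothetical authority A: U+6BD4 is keyed as "bi".
    static final Map<String, String> AUTHORITY_A = Map.of(
            "\u6BD4", "bi",
            "\u9A6C", "ma",
            "\u5357", "nan");

    // Hypothetical authority B: the same character is keyed as "pi".
    static final Map<String, String> AUTHORITY_B = Map.of(
            "\u6BD4", "pi",
            "\u9A6C", "ma",
            "\u5357", "nan");

    // Sort characters by their transliteration in the given table.
    // Characters missing from the table fall back to themselves as the key.
    static List<String> sortBy(Map<String, String> table, List<String> chars) {
        List<String> out = new ArrayList<>(chars);
        out.sort(Comparator.comparing((String c) -> table.getOrDefault(c, c)));
        return out;
    }

    public static void main(String[] args) {
        List<String> chars = List.of("\u9A6C", "\u6BD4", "\u5357");
        // Authority A puts U+6BD4 first (bi, ma, nan).
        System.out.println(sortBy(AUTHORITY_A, chars));
        // Authority B puts U+6BD4 last (ma, nan, pi).
        System.out.println(sortBy(AUTHORITY_B, chars));
    }
}
```

Neither order is wrong; each is correct relative to its table, which is why the table has to be something the user can supply.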

Not to mention that collation rules could vary within a single document. For example, the index might use one set of rules (for example, ignoring punctuation and spaces) while a generated glossary or parts list respects them.
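As an illustration, here is a small Java sketch with two orderings applied to the same entries, one that respects spaces and punctuation and one that ignores them. The comparator names and the sample entries are invented for the example; real index and glossary rules would be more involved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

public class TwoOrderings {

    // "Glossary" rule (assumed): compare the strings as written,
    // ignoring case only, so a space sorts before any letter.
    static final Comparator<String> GLOSSARY_ORDER = String.CASE_INSENSITIVE_ORDER;

    // "Index" rule (assumed): drop punctuation and whitespace first,
    // then compare what is left, ignoring case.
    static final Comparator<String> INDEX_ORDER = Comparator.comparing(
            (String s) -> s.replaceAll("[\\p{Punct}\\s]", "").toLowerCase(Locale.ROOT));

    static List<String> sorted(List<String> entries, Comparator<String> order) {
        List<String> out = new ArrayList<>(entries);
        out.sort(order);
        return out;
    }

    public static void main(String[] args) {
        List<String> entries = List.of("New York", "Newark", "New Zealand");
        System.out.println(sorted(entries, GLOSSARY_ORDER)); // [New York, New Zealand, Newark]
        System.out.println(sorted(entries, INDEX_ORDER));    // [Newark, New York, New Zealand]
    }
}
```

Both lists could legitimately appear in the same generated document, so a single processor-wide collation setting is not enough.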

Any XSLT implementation that does not provide a way for users to easily integrate custom collators will not be useful for a number of important use cases, including producing back-of-the-book indexes. In particular, any application that needs to do culturally- and editorially-appropriate collation in non-Western languages (essentially the languages and locales for which Java does not currently provide appropriate Collator implementations) will only be able to use XSLT processors that provide a way to specify custom collators.
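To give a sense of what a user-supplied collator might look like on the Java side, here is a minimal sketch that subclasses `java.text.Collator` and applies one editorial rule before delegating to the JDK's English collator. The rule (skip a leading "A", "An", or "The") and the class name are assumptions chosen for illustration. How a given XSLT processor would load such a class is processor-specific and is not shown here.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ArticleSkippingCollator extends Collator {

    // Delegate for the actual letter-by-letter comparison.
    private final Collator base = Collator.getInstance(Locale.ENGLISH);

    // Editorial rule (assumed): a leading article does not count for sorting.
    private static String sortForm(String s) {
        return s.replaceFirst("(?i)^(the|an|a)\\s+", "");
    }

    @Override
    public int compare(String source, String target) {
        return base.compare(sortForm(source), sortForm(target));
    }

    @Override
    public CollationKey getCollationKey(String source) {
        return base.getCollationKey(sortForm(source));
    }

    @Override
    public int hashCode() {
        return base.hashCode();
    }

    static List<String> sortTitles(List<String> titles) {
        List<String> out = new ArrayList<>(titles);
        out.sort(new ArticleSkippingCollator());
        return out;
    }

    public static void main(String[] args) {
        List<String> titles = List.of("The Hobbit", "Dune", "A Passage to India", "Emma");
        // [Dune, Emma, The Hobbit, A Passage to India]
        System.out.println(sortTitles(titles));
    }
}
```

Because the result is an ordinary `Collator`, any Java code that accepts a `Collator` or a `Comparator` can use it without knowing about the editorial rule inside.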

As far as I know, only Saxon provides this facility today (although I haven't looked into MS-XSL's extension facilities since all my work is done in Java).

Cheers,

Eliot
--
W. Eliot Kimber, eliot@xxxxxxxxxx
Consultant, ISOGEN International

1016 La Posada Dr., Suite 240
Austin, TX  78752 Phone: 512.656.4139

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
