Re: [xsl] for-each-group grouping accented versions of letters together


Subject: Re: [xsl] for-each-group grouping accented versions of letters together
From: Graydon <graydon@xxxxxxxxx>
Date: Sat, 21 Apr 2012 10:36:11 -0400

On Sat, Apr 21, 2012 at 03:02:22AM +0200, Imsieke, Gerrit, le-tex scripsit:
> You can strip the accents by unicode decomposition and then removing
> the diacritical marks:
> 
> <xsl:for-each-group select="index-0"
>   group-by="substring(
>               upper-case(
>                 replace(
>                   normalize-unicode(heading, 'NFKD'),
>                   '[&#x300;-&#x36f;]',
>                   ''
>                 )
>               ), 1, 1
>             )">
>   <xsl:sort select="current-grouping-key()"/>

Thank you!

I had tried decomposing, using replace with \p{Lm} and then recomposing
with NFKC, and that didn't work, but it was also fairly late on Friday
afternoon.
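
A possible explanation, offered as a guess rather than a diagnosis: \p{Lm} is the "Letter, modifier" category, while the combining accents left behind by decomposition fall in \p{Mn} ("Mark, nonspacing"). A replace on \p{Lm} would then leave the accents in place, and NFKC would recompose them. A minimal sketch of the category-based variant follows; the function name my:strip-marks and the my namespace URI are hypothetical, and \p{Mn} covers more than the U+0300 to U+036F range used in the quoted code.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: decompose, then drop every nonspacing mark.
       \p{Mn} is broader than [&#x300;-&#x36f;]; it also matches
       nonspacing marks used by other scripts. -->
  <xsl:function name="my:strip-marks" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <xsl:sequence
        select="replace(normalize-unicode($s, 'NFKD'), '\p{Mn}', '')"/>
  </xsl:function>

</xsl:stylesheet>
```

No recomposition step should be needed afterwards if the result is only used as a grouping or sort key.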

> When writing the group (= starting letter) to an output file further
> down in your template, you should sort it according to the
> upper-case(...) part as first sort key, then according to the actual
> heading as a second (tie-breaker) sort key.
> 
> So it's best to make a function (call it, e.g., my:sortkey) out of
> upper-case(...).

Yes.
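
A rough sketch of how that suggestion might look, assuming each index-0 element has exactly one heading child. The my namespace URI and the index, group and entry output elements are made up for illustration; only my:sortkey, index-0 and heading come from the thread.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <xsl:output method="xml" indent="yes"/>

  <!-- Sort key: decompose, strip combining diacritics, upper-case. -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <xsl:sequence select="upper-case(
                            replace(
                              normalize-unicode($s, 'NFKD'),
                              '[&#x300;-&#x36f;]',
                              ''))"/>
  </xsl:function>

  <xsl:template match="/*">
    <index>
      <!-- Group on the first character of the accent-free key. -->
      <xsl:for-each-group select="index-0"
          group-by="substring(my:sortkey(heading), 1, 1)">
        <xsl:sort select="current-grouping-key()"/>
        <group letter="{current-grouping-key()}">
          <xsl:for-each select="current-group()">
            <!-- First key: accent-free form; second key: the actual
                 heading, as a tie-breaker. -->
            <xsl:sort select="my:sortkey(heading)"/>
            <xsl:sort select="heading"/>
            <entry><xsl:value-of select="heading"/></entry>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </index>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the key in one function means the grouping expression and the first sort key cannot drift apart.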

> In that function, you can also do other useful stuff, such as
> eliminating stop words or replacing all numbers with a zero, so that
> everything that starts with a number will be in the same group.

Fortunately these are very uncomplicated headings, so no stop words, but
the point about numbers is very well taken.
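
One way the number handling could be folded into the my:sortkey function from the quoted suggestion, as a sketch only: the choice to collapse a leading run of digits (rather than every number in the heading) is an assumption, and the my namespace URI is hypothetical.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Sort key with number folding: strip accents, upper-case, then
       replace a leading run of digits with a single '0' so that every
       heading starting with a number shares the first character '0'. -->
  <xsl:function name="my:sortkey" as="xs:string">
    <xsl:param name="s" as="xs:string?"/>
    <xsl:variable name="plain" as="xs:string"
        select="upper-case(
                  replace(
                    normalize-unicode($s, 'NFKD'),
                    '[&#x300;-&#x36f;]',
                    ''))"/>
    <xsl:sequence select="replace($plain, '^\d+', '0')"/>
  </xsl:function>

</xsl:stylesheet>
```

With this key, numeric headings all land in the same group, and their order inside that group is decided by whatever tie-breaker sort key follows.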

Thanks!
Graydon

