
Re: [xsl] Testing for upper and lower case


Subject: Re: [xsl] Testing for upper and lower case
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Thu, 03 Nov 2011 23:09:50 +0000

On 03/11/2011 16:23, Houghton,Andrew wrote:
Your string-to-codepoints example only works for ASCII upper/lower case letters. It fails to recognize composed and decomposed diacritical characters such as an uppercase A with a grave (U+00C0), with an acute (U+00C1), with a circumflex (U+00C2), etc. Yes, you could detect these too with additional logic, but matches() with a character class of \p{Ll}, \p{Lu}, \p{Lt} handles all the messy details of Unicode.

Andy.

If you want to handle both composed and decomposed characters then it's probably safest to use normalize-unicode() before using matches().
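
As a rough illustration of that suggestion, here is a minimal XSLT 2.0 sketch. The function name my:is-upper, the my namespace URI, and the named template main are hypothetical and chosen only for this example; normalize-unicode() and matches() are the standard XPath 2.0 functions. It normalizes to NFC first, so a decomposed "A" followed by U+0300 becomes the single code point U+00C0 before the \p{Lu} test is applied.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper (not from the thread): true when $s is non-empty
       and, after NFC normalization, consists only of uppercase letters. -->
  <xsl:function name="my:is-upper" as="xs:boolean">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence
        select="matches(normalize-unicode($s, 'NFC'), '^\p{Lu}+$')"/>
  </xsl:function>

  <!-- Assumed entry point; with Saxon this could be run as -it:main -->
  <xsl:template name="main">
    <!-- Precomposed A with grave: one code point, U+00C0 -->
    <xsl:variable name="composed" select="'&#xC0;'"/>
    <!-- Decomposed form: "A" followed by combining grave, U+0300 -->
    <xsl:variable name="decomposed" select="'A&#x300;'"/>
    <result>
      <composed>
        <xsl:value-of select="my:is-upper($composed)"/>
      </composed>
      <decomposed>
        <xsl:value-of select="my:is-upper($decomposed)"/>
      </decomposed>
      <!-- Same test without normalization, for comparison: the combining
           mark is category Mn, so it is not matched by \p{Lu} -->
      <decomposed-unnormalized>
        <xsl:value-of select="matches($decomposed, '^\p{Lu}+$')"/>
      </decomposed-unnormalized>
    </result>
  </xsl:template>

</xsl:stylesheet>
```

Note that some letter and combining-mark sequences have no precomposed form, so NFC can still leave combining marks in the string. If that matters, one option is to allow marks after each letter, for example '^(\p{Lu}\p{M}*)+$'.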

Michael Kay
Saxonica

