Re: [xsl] re: Generate identifier
Subject: Re: [xsl] re: Generate identifier
From: Liam R E Quin <liam@xxxxxx>
Date: Thu, 07 Jan 2010 15:17:26 -0500
On Thu, 2010-01-07 at 05:27 -0800, Vladimir Nesterovsky wrote:
> Is there a way to decompose characters like:
> æ 'LATIN SMALL LETTER AE' (U+00E6)
>
> into a separate letters?
> Are there many such characters derived from Latin (I'll be calling
> replace() if it's only one or two)?

The primary ones are OE and AE (and oe and ae) and I usually special-case them, as you can turn them into either the two letters or just an e, depending on whether you favour "mediaeval" or "medieval", "foetus" or "fetus" and so on.

There are quite a few others, though, e.g. D2 E R$ R4 S(cyrillic) V (armenian) W0 (Hebrew), [ (Arabic), o, o, o, o, o,

A search for "ligature" in the Unicode database - or, e.g. in Linux/Gnome, the character map utility, gucharmap - will find them.

For my purposes (e.g. making filenames and URIs from dictionary headwords) I turn sequences of one or more non-letters into "-", after handling accents and ligatures.

Liam

--
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org
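
The approach described in the message (special-case the ligatures, handle accents, then collapse runs of non-letters into "-") could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the code used on the list: the `LIGATURES` table and the `make_identifier` name are hypothetical, using NFKD normalization to strip accents is one possible way of "handling accents", and trimming leading and trailing hyphens is an extra step the message does not mention.

```python
import re
import unicodedata

# Hypothetical special-case table: Latin ligatures that Unicode
# normalization does not split, mapped here to two letters
# (mapping them to a plain "e" instead would be the other option).
LIGATURES = {
    "\u00e6": "ae",  # LATIN SMALL LETTER AE
    "\u00c6": "AE",  # LATIN CAPITAL LETTER AE
    "\u0153": "oe",  # LATIN SMALL LIGATURE OE
    "\u0152": "OE",  # LATIN CAPITAL LIGATURE OE
}


def make_identifier(headword: str) -> str:
    """Turn a headword into a filename/URI-friendly identifier."""
    # 1. Replace the special-cased ligatures.
    text = "".join(LIGATURES.get(ch, ch) for ch in headword)
    # 2. Decompose accented letters (NFKD) and drop the combining marks.
    #    NFKD also splits compatibility ligatures such as U+FB01 (fi).
    decomposed = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 3. Mark every non-letter, then collapse each run into a single "-".
    marked = "".join(ch if ch.isalpha() else "-" for ch in text)
    collapsed = re.sub(r"-+", "-", marked)
    # Assumption: leading/trailing hyphens are unwanted in an identifier.
    return collapsed.strip("-")


if __name__ == "__main__":
    for word in ("medi\u00e6val", "caf\u00e9 au lait", "\u0152uvre compl\u00e8te!"):
        print(word, "->", make_identifier(word))
```

In XSLT 2.0 the same steps would presumably map onto `normalize-unicode()` and `replace()`, subject to which normalization forms the processor supports.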