
RE: [xsl] Re: How to match a element + part of an immediate text sibling?


Subject: RE: [xsl] Re: How to match a element + part of an immediate text sibling?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 14 Jan 2004 11:23:45 -0500

At 04:59 AM 1/14/2004, Mike wrote:
...
> Secondly, there is nothing in XSLT 1.0 that allows you to split a string
> into its component words. You can do it yourself using a recursive
> template (there are examples in my book XSLT Programmer's Reference), or
> you can use a vendor- or third-party extension function xx:tokenize().
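
To make the recursive-template option concrete, here is a minimal sketch in XSLT 1.0. It is not the example from the book; the template name split-words and the output elements <words> and <w> are made up for illustration, and it assumes that words are separated only by whitespace.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- For each p, emit its whitespace-delimited words as <w> elements.
       normalize-space() trims the string and collapses runs of
       whitespace to single spaces, so no empty tokens are produced. -->
  <xsl:template match="p">
    <words>
      <xsl:call-template name="split-words">
        <xsl:with-param name="text" select="normalize-space(.)"/>
      </xsl:call-template>
    </words>
  </xsl:template>

  <!-- Hypothetical helper: peel off the text before the first space,
       then recurse on the remainder. -->
  <xsl:template name="split-words">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, ' ')">
        <w><xsl:value-of select="substring-before($text, ' ')"/></w>
        <xsl:call-template name="split-words">
          <xsl:with-param name="text" select="substring-after($text, ' ')"/>
        </xsl:call-template>
      </xsl:when>
      <!-- Last word: nothing left to split. -->
      <xsl:when test="$text != ''">
        <w><xsl:value-of select="$text"/></w>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Given <p>ears up, jumps</p>, this yields three <w> elements containing "ears", "up," (comma included) and "jumps".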

This may seem obvious and gratuitous (for which I apologize), but I hasten to add that this is only one *particular* notion of what a "word" is (a substring delimited by whitespace), and it may not be robust enough for all purposes. For example, if your input reads


<p> the quick <em>brown</em> fox, ears up, jumps </p>

you may want your output to read not

<p> the <em>quick brown fox,</em> ears up, jumps </p>

but

<p> the <em>quick brown fox</em>, ears up, jumps </p>

which will require a more sophisticated definition of the concept of a "word", and which will not be so tractable using basic substring operations around whitespace (or a simple tokenize function either, for that matter).
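
One way to refine the notion of a "word" is to peel trailing punctuation off each whitespace-delimited token, so that markup can end before the comma rather than after it. A rough sketch in XSLT 1.0 follows. The template name emit-token, the <w> and <pc> output elements, and the punctuation list are all assumptions made for illustration, and the sketch ignores leading punctuation, quotation marks, and apostrophes.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Demonstration driver: runs against any input document and
       processes the single hard-coded token "fox,". -->
  <xsl:template match="/">
    <tokens>
      <xsl:call-template name="emit-token">
        <xsl:with-param name="token" select="'fox,'"/>
      </xsl:call-template>
    </tokens>
  </xsl:template>

  <!-- Hypothetical helper: emit the word part of a token as <w>,
       followed by each trailing punctuation character as <pc>. -->
  <xsl:template name="emit-token">
    <xsl:param name="token"/>
    <!-- Assumed punctuation set; extend as the data requires. -->
    <xsl:variable name="punct" select="',.;:!?'"/>
    <xsl:variable name="last"
                  select="substring($token, string-length($token))"/>
    <xsl:choose>
      <!-- Token ends in punctuation: handle the rest first, then the
           punctuation mark. The emptiness test matters because
           contains() is true for an empty second argument. -->
      <xsl:when test="$token != '' and contains($punct, $last)">
        <xsl:call-template name="emit-token">
          <xsl:with-param name="token"
              select="substring($token, 1, string-length($token) - 1)"/>
        </xsl:call-template>
        <pc><xsl:value-of select="$last"/></pc>
      </xsl:when>
      <!-- No trailing punctuation left: what remains is the word. -->
      <xsl:when test="$token != ''">
        <w><xsl:value-of select="$token"/></w>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

For the token "fox," this produces <w>fox</w> followed by <pc>,</pc> inside <tokens>, which gives a later step a boundary between the word and the comma.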

This kind of thing is not impossible to work around in most real-world cases, but since XSLT 1.0 is not designed for up-conversion, it can get pretty hairy.

But it all depends on the actual processing requirements for the data. Caveat lector.

Cheers,
Wendell


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

