
Re: [xsl] When to use text()


Subject: Re: [xsl] When to use text()
From: Graydon <graydon@xxxxxxxxx>
Date: Fri, 21 Mar 2014 14:36:41 -0400

On Fri, Mar 21, 2014 at 05:44:26PM +0000, Ihe Onwuka scripsit:
> On Fri, Mar 21, 2014 at 5:40 PM, Graydon <graydon@xxxxxxxxx> wrote:
> > But it doesn't.  The parent element does.  The issue is not that there
> > might be a comment node in the text node, but that there might be a
> > comment node child of the parent element node that separates the string
> > contents of the element into two or more text node children.
> 
> If it's a guaranteed leaf you shouldn't get the contents of anything
> else when you ask for it.

You don't.

The problem is when you get *less* than you expect.

Consider the fragment

<species>white-winged <i>scoter</i></species>

("The scoters are stocky seaducks in the genus Melanitta.")

The result of species/text() is "white-winged " which is very probably not
what you want.

species/string() gives "white-winged scoter" which probably is what you
want.  (Though experience suggests species/normalize-space() is more
likely to be what you really want in that case.)
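For anyone who wants to poke at this outside an XSLT processor, here is
a minimal sketch using Python's standard xml.dom.minidom.  It only
approximates the XPath semantics: child_text and string_value are
hypothetical helper names made up for this illustration, and the
whitespace handling in the last line is a rough stand-in for
normalize-space(), not an exact equivalent.

```python
from xml.dom import minidom

doc = minidom.parseString("<species>white-winged <i>scoter</i></species>")
species = doc.documentElement

# Rough analogue of species/text(): only text nodes that are direct children.
child_text = [n.data for n in species.childNodes if n.nodeType == n.TEXT_NODE]


def string_value(node):
    """Rough analogue of string(): concatenate all descendant text nodes."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(string_value(child) for child in node.childNodes)


full_text = string_value(species)

# Approximation of normalize-space(): trim and collapse runs of whitespace.
normalized = " ".join(full_text.split())

print(child_text)   # ['white-winged ']
print(full_text)    # white-winged scoter
print(normalized)   # white-winged scoter
```

The text inside <i> is a grandchild of species, so it shows up in the
string value but not among the text node children.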

Then consider

<species>white-winged<!-- check that hyphen --> scoter</species>

In XSLT 2.0, species/text() returns two text nodes, "white-winged" and
" scoter", because a comment node can't be a child of a text node, so
the comment compels the string contents of the species element to be
expressed as two distinct text nodes.

So species/text() returns the sequence ("white-winged", " scoter"),
that is, species/text()[1] and species/text()[2], from an element that
looks like

       species
 /        |         \
|         |          |
text()  comment()   text()

(pardon the ascii art) in the actual tree.
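The same three-child shape can be seen in any DOM-style parser that
keeps comments.  A minimal sketch with Python's standard
xml.dom.minidom, offered as an illustration of the tree rather than of
what an XSLT processor does internally (the kinds lookup table and the
text_children name are made up for this example):

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<species>white-winged<!-- check that hyphen --> scoter</species>"
)
species = doc.documentElement

# Label each child of <species> by node kind, in document order.
kinds = {species.TEXT_NODE: "text()", species.COMMENT_NODE: "comment()"}
children = [(kinds[c.nodeType], c.data) for c in species.childNodes]
for kind, data in children:
    print(kind, repr(data))

# Rough analogue of species/text(): the comment splits the text in two.
text_children = [c.data for c in species.childNodes
                 if c.nodeType == c.TEXT_NODE]
print(text_children)  # ['white-winged', ' scoter']
```

Anything expecting exactly one text node here gets two.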

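If you're trying to, for instance, set an attribute node in the result
tree from species/text(), you don't get the single string you expected.
In XSLT 2.0 an attribute value template or the select attribute of
xsl:attribute joins the two items with a space (giving a doubled space
here), and anything that requires a single item, such as a function
argument or a variable declared as="xs:string", raises a type error
because a sequence of two items is not allowed there.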

string() and text() are different; string() returns a single string,
the concatenation of all the descendant text nodes of the context node
in document order, and text() selects all the text node children of the
context node in document order.  This is not an obvious distinction to
make when you're looking at things like <term>word</term>, but it
matters a lot more when you're looking at paragraphs with inline
markup.

-- Graydon

