
Re: [xsl] When to use text()

Subject: Re: [xsl] When to use text()
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxx>
Date: Sun, 23 Mar 2014 08:30:13 +0000

On Sat, Mar 22, 2014 at 2:48 PM, Abel Braaksma (Exselt) <abel@xxxxxxxxxx> wrote:
> Interesting thoughts.
> When designing a language, there will always be a lot of discussion
> about the choice of words for keywords, terminology, language
> constructs. Take C#, they used the word "assembly" for physically
> separated packages, and the word "namespace" for logical separations. To
> this day, many (starting) programmers have a hard time understanding
> those concepts, not least because "assembly" reminds them of
> assembly language and "namespace" of XML namespaces. Similarly, why
> did they choose the keyword "fixed" when the meaning is to "pin" a variable?
> Those discussions will never end, and should never end. It will always
> remind language designers to think carefully about the words they choose.


Sometimes, though, bad habits are subliminally copied.

You shut down a Windows machine by clicking the Start button.
You can shut down an eXist database by running start.jar

For me the most succinct and resonant commentary in the whole thread
has been largely ignored.

<quote author="David Sewell">(Applying the Sapir-Whorf hypothesis to
programming languages, i.e. the way a language encodes things
influences the way we think about them.)</quote>

> In this particular case, the working group at the time had a conflict of
> interest. There was XML, which was already defined, which had text
> nodes. And there was XPath (not XSLT) that required a method for
> selecting those text nodes. Since they were already called text nodes in
> DOM [1], it made sense to follow this nomenclature. Note that, in the
> XML Infoset, they did not exist, nor in the original XML specifications.
> Instead, they were called character information items[2], which referred
> to the individual characters, not the whole node.
> On the other hand they had a requirement to be able to atomize nodes, in
> other words, to turn them into what is commonly known in computing as a
> "string". There are languages that use the keyword TEXT when referring
> to strings, but many common languages use the keyword string.
> What were they to do? Are there other alternatives? Text nodes needed a
> name and atomized text nodes too. Both were an important requirement,
> because if you would always atomize, then how can you query mixed content?
> An important distinction is that text() is a KindTest (it tests
> whether a given node is a text node, as such, it in fact returns a
> boolean), and string() and string(x) are functions that take an implicit
> or explicit argument and turn it into a string.
> One might argue that you could use is-text() and is-comment(), and
> conversely convert-to-string and the like. But that doesn't work well in
> an expression as para/em/is-text() or even para/em[is-text()], because
> the semantics here are not "is" but "has" (select all the nodes that
> have an "em" parent, or select all the em-nodes that have one or more
> text children). And my argument against convert-to-string would be that
> it is annoyingly long, but that's just me. My argument against string()
> itself is that it looks too much like a constructor function, which it
> is not.
> I'm not saying that the choice of words is perfect, but I wanted to
> point out that the choice of words is never an easy one. W3C standards
> are created by consensus of all the members and it is an open process
> where non-members can submit bug reports to draft standards and the
> working group is required to look into them. If you have a strong
> argument, they are likely to take your argument seriously.

Yes, this is the sort of process that one would have imagined. I would
describe it as compromise rather than consensus. My interpretation:
compromise means nobody gets all they want and you end up with something
that all parties agree to live with. Consensus means the parties agree
upon what the best thing is in the circumstances. We are going out for
a meal. I like Chinese, you like Indian, so we compromise and settle on
Italian. Is that the best for all concerned? Probably not, so it's
not a consensus.

Back from digression.

IMHO there is an overarching viewpoint and here is how I would present it.

Extracting text from an XML document is the hello world of XSLT.
text() would appear to be an obvious way of doing that and it's really
important that it entails no surprises. If I were an XSLT antagonist
that is exactly the sort of thing I would home in on to portray the
language as arcane, difficult to use and not suitable for my project.
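
To make the potential surprise concrete, here is a minimal sketch in
Python rather than XSLT, using only the standard library. The sample
document and the helper names text_children and string_value are
hypothetical, not from this thread. They only approximate the XPath
semantics: text() selects the text nodes that are direct children of
the context element, while string() gives the concatenated text of all
descendants.

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed-content sample, not taken from the thread.
para = ET.fromstring("<para>Hello <em>big</em> world</para>")


def text_children(elem):
    """Approximate the XPath step text(): direct text-node children only."""
    # ElementTree stores the text before the first child in .text and the
    # text following each child element in that child's .tail.
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def string_value(elem):
    """Approximate XPath string(): all descendant text in document order."""
    return "".join(elem.itertext())


print(text_children(para))  # ['Hello ', ' world']
print(string_value(para))   # Hello big world
```

The word inside <em> is missing from the first result, which is the
kind of thing a newcomer extracting "the text" of <para> may not
expect.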

> That said, I invite you and everyone on this list or elsewhere to look
> at the current XSLT 3.0 Last Call Working Draft[3]. Even now there are
> still some open bugs on choices of terms and keywords. It is still open
> for bug-reports from anyone, which you can file into W3C's bugzilla[4]
> (signing up is easy).

I guess I have to try and make time.

> Small disclaimer: I was not a member of the WG at the time they needed
> to make a choice for the string() function and text() kindtest, so the
> road to consensus I laid out above may not be the actual road that led
> to consensus.
> Cheers,
> Abel Braaksma
> Exselt XSLT 3.0 processor
> http://exselt.net
> PS: you don't need to look up the spec to remind you of text() vs
> string(),

Before this thread I'd never given it a moment's thought. Confession:
after this thread that situation is unlikely to have changed.

Why? Because I satisfice, and it's very tempting to preface that with
"like most other programmers in the world". It's really instructive to
look at the first two sentences of the Wikipedia definition, as it
exposes the contrasting viewpoints in this thread.

I wonder what people's response was, and what they did, back in the
pre-internet days when there wasn't a hyperlinked language
specification available at the click of a button. Programming still got
done then.

Personally I blame Michael Kay. The man is his own worst enemy. If he
wants people to read the spec then he should stop producing such
assiduous and fantastically written textbooks.

> in fact, just about any book on XSLT clearly explains their
> semantics and pitfalls. And you are right, people starting out with a
> language will start with a tutorial book, and that is exactly where they
> learn this distinction.

Never read XSLT for Dummies then, did you?

Yeah, haha, explains a lot, but I wouldn't knock it.

It was only after reading it that I was able to grok anything in the
other books I had tried to learn XSLT from.
