
RE: [xsl] Apply-templates - how to omit top level element tags?


Subject: RE: [xsl] Apply-templates - how to omit top level element tags?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 08 Sep 2005 14:13:16 -0400

Hi Mike,

At 08:23 PM 9/7/2005, you wrote:
>> the canonical approach

> I've never understood the word "canonical" even though I've looked up
> the definition many times.  How do you mean it?

Ah, in this case I mean it in the sense of "approved by authorities", although it has to be admitted that there are no (official) authorities (we are not a church, though some might feel sometimes we have "religion"). Yet it's more than just "popular" since this is the way XSLT was designed to be used. Evidence for that assertion can be found both by reading the XSLT 1.0 Recommendation (which actually is not as tough as one might imagine), and by looking at simple and clean examples of stylesheets written to achieve the kinds of processing envisioned by that spec (which is to say, "terminal" or "final" transformations, not medial ones, and down-conversions especially into presentational formats such as HTML or XSL-FO, not up-conversions or cross-conversions into other forms of XML or other formats altogether).


>> As for the strengths and weaknesses of the paradigm: not only is
>> processing with templates not fraught with side-effects -- actually
>> side-effects (in the strict technical sense of having the state of the
>> processor being altered by the processing as it goes) are precisely what
>> the XSLT paradigm is designed to *avoid*.

> I understand, and I've read that. But I believe you are referring to the
> strict technical use of the term and I am referring to the practical
> casual use of the term.

Yes.


> So for example, I can have working code in XSLT
> and in my XML I add or change an element, or reorder elements, and the
> XSLT no longer outputs that content correctly. To me that's a side
> effect. That's okay if the output is a few lines long where I can see the
> error, but if my document is 50 pages long, I won't notice it without
> re-proofreading every time I touch the XSL, which is not practical.

Ah, I see.


As Jon has mentioned in this thread, a key thing to understand here is the operation of built-in templates and the expectation that the organization of the output will mirror, at least in significant respects, the organization of the input. When this is the case, the built-in templates provide for a "default traversal" of the input tree, in which the nodes of the tree are processed without rearrangement. Obviously it's not the case that

<a>
  <b>text here</b>
   more text here
  <c>more 'c' text here</c>
</a>

will *always* map to

<X>
  <Y>text here</Y>
   more text here
  <Z>more 'c' text here</Z>
</X>

but the assumption of XSLT is that most of the time, it will -- that a "plain vanilla" transformation needs to do little or nothing more than map 'a' to 'X', 'b' to 'Y' and 'c' to 'Z'. When this is the case, a stylesheet as simple as

<xsl:template match="a">
  <X>
    <xsl:apply-templates/>
  </X>
</xsl:template>

<xsl:template match="b">
  <Y>
    <xsl:apply-templates/>
  </Y>
</xsl:template>

<xsl:template match="c">
  <Z>
    <xsl:apply-templates/>
  </Z>
</xsl:template>

will do the job -- and likewise, *the same stylesheet* will gracefully convert

<a>
  <c>some 'c' text here</c>
  <b>some 'b' text here</b>
   more text here
  <c>more 'c' text here</c>
   and yet more text
  <b>and more 'b' text</b>
  <c>and finishing with some 'c'</c>
</a>

into

<X>
  <Z>some 'c' text here</Z>
  <Y>some 'b' text here</Y>
   more text here
  <Z>more 'c' text here</Z>
   and yet more text
  <Y>and more 'b' text</Y>
  <Z>and finishing with some 'c'</Z>
</X>

... which is just what you want.

This is because the default traversal of the input node tree (or any traversal that simply uses xsl:apply-templates, without selecting nodes specifically or performing a sort) constructs the output to reflect the document order of the input.

Of course if you write a template that says

<xsl:template match="a">
  <xsl:apply-templates select="text()"/>
  <xsl:apply-templates select="b"/>
  <xsl:apply-templates select="c"/>
</xsl:template>

the order of your output will no longer reflect the order of your input. Nor should it, since you have explicitly requested that the text nodes be processed first, and that 'b' nodes be processed and added to the result tree before 'c' nodes are.

And this kind of thing can indeed have "side effects" in the sense that you mean it, if it is written to handle one order of nodes in the input but then encounters another order.
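
To make that concrete, here is a sketch of a complete stylesheet built around the reordering template above. The xsl:stylesheet wrapper is an addition, there only to make the example runnable; the templates are the ones already shown.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Explicit selection: text children first, then every 'b', then every 'c'.
       Document order is kept only within each selected group. -->
  <xsl:template match="a">
    <xsl:apply-templates select="text()"/>
    <xsl:apply-templates select="b"/>
    <xsl:apply-templates select="c"/>
  </xsl:template>

  <xsl:template match="b">
    <Y>
      <xsl:apply-templates/>
    </Y>
  </xsl:template>

  <xsl:template match="c">
    <Z>
      <xsl:apply-templates/>
    </Z>
  </xsl:template>

</xsl:stylesheet>
```

Run against the second sample input, this should emit the text-node children of 'a' first, then both Y elements, then the three Z elements, whatever order they had in the source. Note also that this version of the 'a' template writes no X wrapper at all.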

Accordingly, getting XSLT stylesheets to behave "correctly" is often (even usually) a matter of having them do as little as possible.

> I understand that getting a validating schema can help, but those are
> hard to author and don't solve the problem entirely (according to an
> email reply to me from Michael Kay on this list.)

But for the basic problem you are describing here -- a problem avoided by not "over-writing" your stylesheets -- validation won't even necessarily help -- nor should it. Rather, what helps is merely the assurance of the processing model itself that *unless you do something to change this*, the order of the output will reflect the order of the input.


An exercise I often recommend for beginners:

1. Compose and try a "null" stylesheet (one with no templates at all) on a moderately complex input document. Consider the results.
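
For step 1, a "null" stylesheet can be as small as this (a minimal sketch; only the wrapper element is required):

```xml
<?xml version="1.0"?>
<!-- No templates at all: every node is handled by the built-in rules. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>
```

With most processors you should see the text content of the input, in document order, with all the element markup gone and attribute values dropped.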

2. Add a single template to this stylesheet:

<xsl:template match="*">
  <BOO>
    <xsl:apply-templates/>
  </BOO>
</xsl:template>

Run this on the same input and consider the results.

3. Add another template to your one-template stylesheet, this time matching an element in your input. (Ideally it would be an element that appears multiple times, sprinkled throughout your document possibly at many levels.)

<xsl:template match="your-element">
  <YAY>
    <xsl:apply-templates/>
  </YAY>
</xsl:template>

Again, run and consider.

If your theory of how the stylesheet processor works with its inputs (source document and stylesheet) does not account gracefully for all three of these cases, you know you have a little research to do to understand how it works. You might research:

1. Built-in default templates
2. Document order and element hierarchy (parents, child nodes and attributes)
3. apply-templates "select" vs template "match"
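
On the first of these: the built-in rules, as I read them in the XSLT 1.0 Recommendation (section 5.8, if memory serves), behave as if the following templates were present in every stylesheet. Writing them out explicitly, as in this sketch, is only for study; the real built-ins have the lowest import precedence, so any template of your own overrides them.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root node and elements: write nothing, just keep walking the children. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the result. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

The attribute rule fires only if something selects attributes explicitly (for example select="@*"), which is why a null stylesheet shows element text but no attribute values.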

In view of these, understand that if an apply-templates instruction does not say to select anything in particular, select="child::node()" is assumed. So the instruction

<xsl:apply-templates/>

tells the processor it should select and process any child nodes (i.e. any elements, text nodes, comments or processing instructions inside this one), and to add the results of this processing (application of templates) to the result in the order reflecting the input.

> Is there a concise set of rules for "properly constructed" stylesheets?
> One that doesn't require someone to already understand how to "think" in
> XSLT?

Well of course "properly constructed" has to be considered in view of the requirements ... which means that unfortunately the answer is "no". (We've worked on the rules, and several accounts have been given, but it is hard to be concise and the *application* of the rules is highly contingent on your particular problem, so over-generalizing is easy.)


Rather, I think learning to think in XSLT is a matter of:

a. Understanding and respecting the kinds of things XSLT 1.0 was designed for and does well
b. Understanding, preferably through practice, how it is designed to do those things
c. Understanding how close to that scenario is the problem you actually have at hand
d. Adapting the "canonical" (the approved, tried and true) methods to achieve your local goals


> That said, by not addressing the specific question, are you implying
> that what I asked about can't be done in XSLT, i.e. to have
> <Element>Text<SubElement/> More Text</Element> and, with an inline
> statement apply templates to <SubElement/> and output its results along
> with "Text .. More Text" but not outputting the tags for <Element>?

Not in the least. In fact I suspect that your problem is actually quite straightforward, even that you've solved it (a later post suggested that, didn't it?), and moreover solved it by *simplifying* your code, which further suggests to me that you're on the right track even if certain things about the built-in processing model are still unclear.
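
For the record, one way the case you describe might be handled is sketched below. I haven't seen your real element names or desired output, so the result element "sub" is purely hypothetical and stands in for whatever <SubElement/> should become.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No literal result element here, so no tags are written for Element;
       its text and child elements are still processed in document order. -->
  <xsl:template match="Element">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Hypothetical mapping: "sub" is an invented name for illustration. -->
  <xsl:template match="SubElement">
    <sub>
      <xsl:apply-templates/>
    </sub>
  </xsl:template>

</xsl:stylesheet>
```

Given <Element>Text<SubElement/> More Text</Element>, this should produce "Text", then the result for SubElement, then " More Text", with no Element tags around them. Since the first template does exactly what the built-in rule for elements does, it could be left out altogether, which is the sense in which doing less gets you the right answer.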


> BTW, it seems like it should be such a simple thing. I've been a
> programmer for decades, but have also been a business man 1/2 that long,
> and I can really see where the programmer's pedantic-ness can cause
> business people to totally disrespect programmers. Business people just
> need to get jobs done and can't afford the luxury of jumping thru ten
> hoops for what appear to be purist reasons. I have been a pedantic
> programmer many times in my past, especially when I trained programmers
> and when I wrote a book on programming, but at this point I'm really the
> business man who just needs to get this done so I can generate enough
> revenue to pay my bills!

Understood. Stick with it -- bring both your programmer's mentality to bear, and your business mentality. But combine the virtues of both. As a programmer, you need to be curious about a powerful set of methods and techniques that may be new to you, but that have been demonstrated to work well on a given set of problems. Also, as a programmer you need to understand which tools are good for what, and learn to recognize the problems that are best handled with one set of tools, vs those best handled with another. (FWIW, it does seem to me that your transformation problems will be very well handled by XSLT once you get the hang of it.) As a businessman you can be pragmatic -- having assessed the risk, you can decide whether the benefits will be worth it. Here, the benefits will be that your code will be simple, easy to maintain and effective, even verifiably so. The risk is that your problems, despite appearances, are actually not XSLTish in nature and that those benefits will be harder to realize than early impressions suggest -- or that even if they are suited to XSLT, you, as a programmer, do not really have the time and labor to invest in mastering this new paradigm and learning to write that graceful and powerful code. (If you're going to end up writing Perl, then it's best done in Perl, etc.)


> Thanks again for all your help.

Based on subsequent posts I think your problem may be solved, and that further researches will make it clearer to you how and why. Try the exercise outlined above; read Evan's chapter at


http://www.oreilly.com/catalog/xsltpr/chapter/ch03.pdf

plus anything else you can find on the XSLT processing model.

If it's not solved, please consider reposting the problem, with the usual samples of input/output/XSLT demonstrating the problem in microcosm. Sometimes in these threads we end up talking theory so much that we fail to define the actual problem at hand sufficiently well to really solve it....

Cheers,
Wendell


======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================

