

Subject: Re: [xsl] XSLT basics
From: Mike Brown <mike@xxxxxxxx>
Date: Sat, 20 Oct 2001 00:16:58 -0600 (MDT)

Jonathan Yue wrote:
> I am a beginner on XSLT. I read some documents and am not
> very clear about these concepts. Please correct me if I
> understand it wrong.
> 
> 1) Element -- The whole thing from start tag and end tag
> 2) Node 

Jonathan,

Stop thinking about tags for a moment.

There are different ways of looking at XML. Yes, if you take the XML spec
at its word, an element "is" the element start tag to the element end tag,
and the element's content is everything in between.

However, you will have an easier time dealing with XPath and XSLT if you
try not to relate these terms directly to the text of the raw document and
its syntax. Instead, think of the text of the XML document, tags and all,
as being instructions on how to build a hierarchical data structure -- a
tree of "nodes".

Nodes exist in an abstract (imaginary, implied) universe. Just think of
them as "things": little containers of information.

In the world of XPath/XSLT, the nodes are given relatively simple
relationships to each other to form a hierarchical tree. Consider this 
tree:

               root node
                   |
             element node named 'stuff' (the document element)
                   |
           +-------+--------+
           |                |
       text node       element node named 'crap'
  with value 'hello'             \__attribute node named 'foo1'
                                  \__attribute node named 'foo2'

The document element is a "child" of the root node. The text node and the
'crap' element node are children of the document element node. The root
node is an "ancestor" of all the other nodes. The 'crap' element node is
the "parent" of the attribute nodes but, in a twist of XPath and DOM
weirdness, the attributes are not children of the element they apply to.
You can see that an element is just one of several types of node.
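
If it helps to poke at these relationships in code, here is a minimal
sketch. It assumes Python and its standard xml.dom.minidom module, which
are my choice for illustration and not something XSLT requires. minidom
follows the DOM rather than the XPath data model, so the names differ a
little (DOM says "ownerElement" where XPath says "parent"), but the
attribute twist shows up the same way. The XML string is the tree above
written out as text.

```python
from xml.dom.minidom import parseString

# The tree from the diagram, written out as XML text.
doc = parseString('<stuff>hello<crap foo1="bar1" foo2="bar2"/></stuff>')

stuff = doc.documentElement      # the document element
text, crap = stuff.childNodes    # its two children, in document order

print(stuff.parentNode is doc)                  # root is the parent of 'stuff'
print(text.nodeType == text.TEXT_NODE, text.data)
print(crap.nodeName)

# The attributes hang off the element but are not among its children.
print(len(crap.childNodes))
print(sorted(crap.attributes.keys()))
print(crap.getAttributeNode('foo1').ownerElement is crap)
```

The point to notice is that crap.childNodes is empty even though the
element carries two attributes; you reach them through a separate route.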

Now consider that the node tree can be serialized into a linear syntax
consisting of certain sequences of Unicode characters:

<stuff>hello<crap foo1="bar1" foo2="bar2"/></stuff>
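
To make "serialized" concrete, here is a rough sketch that goes in the
tree-to-text direction. Again it assumes Python's xml.dom.minidom, which
is only a stand-in for whatever tree an XSLT processor keeps internally:
the nodes are created one at a time, with no tags in sight, and only the
last step turns them into a string of characters.

```python
from xml.dom.minidom import Document

doc = Document()                        # plays the part of the root node

stuff = doc.createElement('stuff')      # the document element
doc.appendChild(stuff)
stuff.appendChild(doc.createTextNode('hello'))

crap = doc.createElement('crap')        # second child of 'stuff'
crap.setAttribute('foo1', 'bar1')
crap.setAttribute('foo2', 'bar2')
stuff.appendChild(crap)

# Only now does the tree become a linear sequence of characters.
serialized = stuff.toxml()
print(serialized)
```

Parsing is the same trip in reverse: the characters are read and the
tree is rebuilt from them.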

These 'character' things are still a bit abstract, since computers need to
ultimately deal with them as bit patterns (0's and 1's). So the characters
are mapped ("encoded") into sequences of bits & bytes according to a
character-to-bit-pattern map that goes by a cryptic name like iso-8859-1
or utf-8 or one of a bazillion others. Once encoded as bits (or bytes or
whatever the most convenient level of abstraction is for you), your data
is ready for storage and/or transmission in copper and silicon.
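
A small sketch of that encoding step, assuming Python strings as the
"characters" and bytes objects as the "bit patterns". The accented
letter is my own addition to the sample text; with plain ASCII the two
encodings happen to produce identical bytes, which would hide the point.

```python
# A string of abstract characters; \u00e9 is e with an acute accent.
chars = '<stuff>h\u00e9llo</stuff>'

latin1_bytes = chars.encode('iso-8859-1')   # one byte per character here
utf8_bytes = chars.encode('utf-8')          # the accented e takes two bytes

print(len(chars), len(latin1_bytes), len(utf8_bytes))
print(latin1_bytes == utf8_bytes)           # same characters, different bytes

# Decoding with the matching map recovers the same characters either way.
print(latin1_bytes.decode('iso-8859-1') == chars)
print(utf8_bytes.decode('utf-8') == chars)
```

This is why a parser has to know which encoding was used before it can
turn the bytes back into characters and the characters into a tree.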

And that... is XML. It is up to you to figure out how to best arrange your
data. Typically you use elements as named containers for other elements
and/or runs of character data that become text nodes in the XPath/XSLT
tree model. Attributes are name-value pairs that are attached to elements.
When to use attributes and when to use elements is a matter for XML Zen
101.

You're off to a good start. You will also need to understand namespaces,
what an XML parser does, and the XSLT processing model. Buy Michael Kay's 
XSLT Reference tome and pore over the introductory chapters.

> 3) Element node 
> 
>    -- Just the <book></book> part, excluding attributes, text nodes ...
> 
>    It seems the element includes attributes, text, etc., but the
>    element node does not. right or wrong?

Not necessarily wrong, but you should be thinking about trees, not tags,
or you will be burned hard by XSLT. Check out what I just posted at
http://skew.org/xml/stylesheets/treeview/html/ ... compare the
sample_input.xml (view it in a text editor, not Internet Explorer, so you
see the line breaks) and the sample_output.html. Then download an XSLT
processor like Instant Saxon or msxml.exe, and start applying
tree-view.xsl to your own XML documents.

> 4) descendant::* 
>    
>    -- includes all element nodes, attribute nodes, text nodes, etc. down
>    from the current node (in the tree). right or wrong?

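The descendant and child axes do not include attribute nodes (or
namespace nodes). Also, on those axes the * node test matches only
element nodes, so descendant::* does not pick up text nodes either;
descendant::node() would.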
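
Here is a rough imitation of the descendant axis, assuming Python's
xml.dom.minidom once more. It is not an XPath engine, and the
descendants() helper is a hypothetical name of my own; it just follows
child links downward, which is all the axis does, and that is why
attributes never turn up in the result.

```python
from xml.dom.minidom import parseString

def descendants(node):
    """Yield every node below `node` in document order (hypothetical helper)."""
    for child in node.childNodes:
        yield child
        yield from descendants(child)

doc = parseString('<stuff>hello<crap foo1="bar1" foo2="bar2"/></stuff>')
stuff = doc.documentElement

# Roughly descendant::node() with 'stuff' as the context node.
all_nodes = list(descendants(stuff))

# Roughly descendant::* : keep only the element nodes.
elements = [n for n in all_nodes if n.nodeType == n.ELEMENT_NODE]

# Attribute nodes are never reached by following child links.
attrs = [n for n in all_nodes if n.nodeType == n.ATTRIBUTE_NODE]

print([n.nodeName for n in all_nodes])
print([n.nodeName for n in elements])
print(attrs)
```

To get at the attributes in XPath you step onto the attribute axis
explicitly, for example with descendant::*/@* from the same context node.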

   - Mike
____________________________________________________________________________
  mike j. brown, fourthought.com  |  xml/xslt: http://skew.org/xml/
  denver/boulder, colorado, usa   |  personal: http://hyperreal.org/~mike/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


