Re: [xsl] Selecting First Direct Sibling


Subject: Re: [xsl] Selecting First Direct Sibling
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Wed, 22 Aug 2007 10:03:09 +0100

On 8/21/07, David Carlisle <davidc@xxxxxxxxx> wrote:
>
>
>    You are using XSLT 2.0
>    You are using a schema-aware processor
>    You have a schema
>    The schema is parsed when the document tree is built
>
>
> No, it also applies to basic (not schema-aware) processors and to
> DTD-specified element content.
>
> saxon * B for example defaults to stripping white space in (dtd specified)
> element content but has command line options to not do this or to strip
> all white space (whether in element or mixed content)

This looked interesting so I did a quick example:

<root>
	<node/>
</root>

and:

<xsl:value-of select="count(/root/node())"/>

returns 3 as expected (the <node/> element plus the whitespace text
node on either side of it).
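
For anyone wanting to reproduce this, a minimal complete stylesheet might look something like the following. Only the select expression comes from the example above; the text output method and the match="/" template are scaffolding I've assumed.

```xml
<?xml version="1.0"?>
<!-- Sketch only: wraps the count() expression from the example in a
     runnable stylesheet. The output method and the root template are
     assumptions, not part of the original test. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain text output so the result is just the number -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- node() counts every child of <root>: elements, text nodes
         (including whitespace-only ones), comments and PIs -->
    <xsl:value-of select="count(/root/node())"/>
  </xsl:template>

</xsl:stylesheet>
```

The same stylesheet can be run unchanged against each of the sample documents in this message, so any difference in the count comes from how the source tree was built rather than from the stylesheet.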

Then if you add a DTD:

<!DOCTYPE root [
  <!ELEMENT root     (node)>
  <!ELEMENT node    (#PCDATA)>
]>
<root>
	<node/>
</root>

and count the nodes again:

<xsl:value-of select="count(/root/node())"/>

the result is 1, which demonstrates David's point.

If your DTD specifies mixed content then the whitespace is kept:

<!DOCTYPE root [
  <!ELEMENT root    (#PCDATA|node)*>
  <!ELEMENT node    (#PCDATA)>
]>
<root>
	<node/>
</root>

counting the nodes here:

<xsl:value-of select="count(/root/node())"/>

returns 3

...so it seems that when the DTD declares element-only content, the
processor can confidently drop the whitespace-only text nodes, because
it knows they are purely presentational.
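
If there is no DTD to lean on, the stylesheet can ask for the same stripping itself. A minimal sketch, assuming the DTD-less <root> document from earlier in this message; xsl:strip-space is standard XSLT, the rest of the stylesheet is scaffolding I've assumed.

```xml
<?xml version="1.0"?>
<!-- Sketch only: strips whitespace-only text nodes inside <root> from
     the stylesheet side, with no DTD involved. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Remove whitespace-only text node children of <root>.
       elements="*" would apply this to every element instead. -->
  <xsl:strip-space elements="root"/>

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:value-of select="count(/root/node())"/>
  </xsl:template>

</xsl:stylesheet>
```

With this in place the count for the DTD-less document should come out as 1 rather than 3. The difference from the DTD route is that here the stylesheet author has to know which elements hold only presentational whitespace.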

It all makes sense, and I suppose it must be worthwhile otherwise why bother?

The gotcha case would be something like:

<!DOCTYPE p [
  <!ELEMENT p    (b|i)*>
  <!ELEMENT b    (#PCDATA)>
  <!ELEMENT i    (#PCDATA)>
]>
<p>
	<b>hello</b> <i>world</i>
</p>

With the DTD the output is "helloworld", because the space between
</b> and <i> is whitespace in element content and so gets stripped.

Without the DTD the output is "hello world".
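
One way to see both results is a stylesheet along these lines. It is a guess at the kind of test involved, not the exact one used: the normalize-space() call is my assumption, added so that the leading and trailing newlines inside <p> don't get in the way of the comparison.

```xml
<?xml version="1.0"?>
<!-- Sketch only: prints the string value of <p> with leading/trailing
     whitespace trimmed. normalize-space() is an assumption made for
     this illustration. -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The string value of <p> is the concatenation of all its
         descendant text nodes, so a stripped inter-element space
         simply disappears from the result -->
    <xsl:value-of select="normalize-space(/p)"/>
  </xsl:template>

</xsl:stylesheet>
```

If the space between the two inline elements matters, the safer content model for <p> is a mixed one such as (#PCDATA|b|i)*, which stops the whitespace being treated as ignorable.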

DTDs canst both addeth, and taketh awayeth :)


cheers
andrew
-- 
http://andrewjwelch.com

