
Re: [xsl] XSLT streaming: the processor "remembers" things as it descends the XML tree?


Subject: Re: [xsl] XSLT streaming: the processor "remembers" things as it descends the XML tree?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Wed, 20 Nov 2013 21:58:26 +0000

Many non-streaming stylesheets will process the tree using a recursive walk,
typically using the Java stack, and this means they will tend to fail if the
depth is greater than 500 or so. I think we can treat anything above this kind
of level as pathological. A streaming processor may well have similar limits
imposed by the size of Java stack space. This would occur whether or not
information about ancestors is retained by a streaming processor.
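
A minimal sketch of the recursion-depth point, written in Python rather than Java for brevity, so the limit hit here is Python's interpreter recursion limit and not the Java stack. The `Node` class and the function names are hypothetical, made up for illustration; this is not how Saxon or any other XSLT processor is implemented. The recursive walk uses one call frame per tree level and fails on a very deep chain, while the same walk driven by an explicit stack does not:

```python
import sys


class Node:
    """Hypothetical minimal tree node, for illustration only."""
    __slots__ = ("children",)

    def __init__(self):
        self.children = []


def depth_recursive(node):
    # One interpreter call frame per tree level: fails on very deep trees.
    deepest = 0
    for child in node.children:
        deepest = max(deepest, depth_recursive(child))
    return 1 + deepest


def depth_iterative(root):
    # Explicit stack held on the heap: limited by memory, not by the call stack.
    deepest = 0
    stack = [(root, 1)]
    while stack:
        node, level = stack.pop()
        deepest = max(deepest, level)
        stack.extend((child, level + 1) for child in node.children)
    return deepest


def make_chain(levels):
    # Build a degenerate tree: a single branch that is `levels` nodes deep.
    root = node = Node()
    for _ in range(levels - 1):
        child = Node()
        node.children.append(child)
        node = child
    return root


shallow = make_chain(30)
deep = make_chain(sys.getrecursionlimit() * 10)

print(depth_recursive(shallow), depth_iterative(shallow))
try:
    depth_recursive(deep)
except RecursionError:
    print("recursive walk failed on the deep chain")
print(depth_iterative(deep))
```

The explicit-stack version trades call frames for heap memory, which is one of the "different techniques" a processor would need for extremely deep input.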

Handling extremely deep documents probably needs a different set of techniques
from handling extremely wide documents. The streaming facilities in XSLT 3.0
are focussed on the latter, because that's what our use cases tell us is
needed.

It doesn't make much difference whether it's 10 levels or 30 levels deep, but
once you get into the hundreds and thousands, recursion depth becomes
significant.
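
To find out which regime a given document falls into, the nesting depth can be measured without holding the whole tree. The following is a rough sketch in Python using the standard library's `xml.etree.ElementTree.iterparse`; the function name `max_element_depth` and the sample document are made up for illustration. It counts element nesting only, so its result is not directly comparable with the XPath expression quoted below, which also counts the document node and treats text nodes as leaves:

```python
import io
import xml.etree.ElementTree as ET


def max_element_depth(source):
    # Track nesting with a counter driven by start/end events.
    depth = deepest = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            deepest = max(deepest, depth)
        else:
            depth -= 1
            # Discard this element's text and children once it is closed.
            elem.clear()
    return deepest


sample = io.BytesIO(b"<doc><sec><p>text <b>bold</b></p></sec><sec/></doc>")
print(max_element_depth(sample))
```

Only the counter and the chain of currently open elements matter to the calculation, which is why width costs little here while depth still has to be tracked.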

Michael Kay
Saxonica

On 20 Nov 2013, at 20:09, Michael Müller-Hillebrand <mmh@xxxxxxxxx> wrote:

> On 20.11.2013 at 16:25, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
>
>> I think it would be very interesting to see a survey of how deep XML
>> documents go in the wild. Except for pathological cases, I think they
>> would rarely go beyond 20 deep.
>
> It really depends on the document type. I just looked at a document
> (Operating Manual) from our CMS and it gives me 27 for
> "max(//node()[not(node())]/count(ancestor::node()))":
>
> * root element
> * 14 levels for elements that control referencing modules from the CMS and
>   build hierarchy
> * 5 levels for table structure
> * 7 levels: module structure, block level and inline level elements.
>
> Maybe those 14 levels could be seen as pathological, but even after removing
> some of those, there will still be 7 levels building hierarchy, which results
> in a total of 20 levels. But I can easily see that some other customers are
> using an even more specialized DTD/XSD which, e.g., handles technical data at
> additional levels. Or, if you have tables in tables, it will give you another 5
> levels.
>
> So, from my point of view 20-30 levels seems like normal business.
>
> - Michael
>
> --
> Michael Müller-Hillebrand
> mmh@xxxxxxxxx

