[XSL-LIST Mailing List Archive Home] [By Thread] [By Date]

Re: [xsl] following-sibling is evil


Subject: Re: [xsl] following-sibling is evil
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 6 Jul 2014 10:28:55 -0000

Any reasonably intelligent processor will do an "early exit" on such an
expression, which means it will only scan the following-sibling axis until it
finds the first node that matches the predicates.

However, this kind of logic is still O(n^2), and the performance is especially
bad if there are few or no duplicates.

It's much better to use constructs such as key() or distinct-values() or
xsl:for-each-group for this kind of processing, though it may be difficult to
achieve that in Schematron.

Michael Kay
Saxonica
mike@xxxxxxxxxxxx
+44 (0118) 946 5893



On 6 Jul 2014, at 10:34, Costello, Roger L. costello@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Hi Folks,
>
> I have a list of <Book> elements. Each Book contains Title, Author, and
Genre.
>
> For each Book I wish to determine if there is a following Book with the same
Author and Genre.
>
> I am using Schematron. This XSLT is generated from my Schematron rule:
>
>    <xsl:template match="Book">
>        <xsl:variable name="currentAuthor" select="Author" />
>        <xsl:variable name="currentGenre" select="Genre" />
>        <xsl:if test="following-sibling::Book[Title eq $currentAuthor][Genre
eq $currentGenre][1]">
>            <xsl:text>There is another Book with the same Author and
Genre</xsl:text>
>        </xsl:if>
>    </xsl:template>
>
> Notice the XPath expression:
>
> following-sibling::Book[Title eq $currentAuthor][Genre eq $currentGenre][1]
>
> As I understand it, an XPath processor will evaluate that XPath expression
like so:
>
> Step 1. Gather up all the following sibling Book elements. Let's denote the
resulting set by S1.
>
> Step 2. Filter S1 by eliminating those Books that don't satisfy this
predicate: [Title eq $currentAuthor]. Let's denote the resulting set by S2.
>
> Step 3. Filter S2 by eliminating those Books that don't satisfy this
predicate: [Genre eq $currentGenre]. Let's denote the resulting set by S3.
>
> Step 4. Filter S3 by eliminating all Books except the first.
>
> Do I correctly understand how an XPath processor will evaluate the XPath
expression?
>
> Suppose the list contains 100,000 Books.
>
> Consider evaluating the XPath expression for the first Book in the list. The
XPath processor must collect 99,999 Books and apply the steps to them.
>
> For the second Book the XPath processor must collect 99,998 Books and apply
the steps to them.
>
> For the third Book the XPath processor must collect 99,997 Books and apply
the steps to them.
>
> And so forth.
>
> Yikes!
>
> That XSLT program will take a very long time to execute!
>
> Is there a way to modify my XPath so that the XPath processor stops as soon
as it gets to a Book that matches all the predicates?
>
> /Roger


Current Thread
Keywords