Oxygen XML Forum

Posted: **Tue Feb 20, 2024 12:48 pm**

Hi,

I am facing a performance issue with an XSLT test that uses “following-sibling”, and I would like to know if anybody has a suggestion for optimizing the test.

The problem is that the test time increases exponentially with the number of lines, and some of the XML files have 80,000 or more lines.

I have attached an XSLT file with a test using “following-sibling” (the original test) and also a “for-each-group” test for comparison, but its performance is as bad as that of “following-sibling”.

The test is used to make sure that each line identifier (cbc:ID) is unique.

Any comments and suggestions are appreciated.

Rgds,
Dan

LargeTestFile.zip

MediumTestFile.zip

XSLT.zip

HugeTestFIle.zip

Posted: **Fri Mar 08, 2024 6:26 pm**

Can you tell us which XSLT processor you are using to run the tests? Is it Saxon 12? If so, is it the HE edition, or can you also use the EE edition provided in oXygen?

I would certainly think that doing e.g.

<xsl:template match="doc:Catalogue" expand-text="yes">
  <!-- expand-text="yes" enables the {...} text value templates used below -->
  <xsl:for-each-group select="cac:CatalogueLine" group-by="cbc:ID">
    <!-- a second member in the group means this ID occurs more than once -->
    <xsl:if test="current-group()[2]">
      <Error context="{name(parent::*)}/{name()}">
        <Pattern>current-group()[2]</Pattern>
        <Description>[F-CAT248] CatalogueLine.ID must be unique within the document instance</Description>
        <Duplicated-ID>{current-grouping-key()}</Duplicated-ID>
        <XPath>{path()}</XPath>
      </Error>
    </xsl:if>
  </xsl:for-each-group>
</xsl:template>

should perform way better than your code using following-sibling and preceding-sibling repeatedly.
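Another variant that might be worth timing is a key lookup instead of grouping. This is only a sketch, not taken from the attached stylesheet: the key name `line-by-id` and the `Errors` wrapper element are made up for illustration, and the namespace URIs assume the documents are UBL 2.x catalogues.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    exclude-result-prefixes="#all"
    expand-text="yes">
  <!-- Namespace URIs above are an assumption (UBL 2.x); adjust to your files. -->

  <!-- Hypothetical key: index every CatalogueLine by its cbc:ID child. -->
  <xsl:key name="line-by-id" match="cac:CatalogueLine" use="cbc:ID"/>

  <xsl:template match="/">
    <Errors>
      <!-- Select each line that has an ID but is not the first line with that ID. -->
      <xsl:for-each select="*/cac:CatalogueLine[cbc:ID][not(. is key('line-by-id', cbc:ID)[1])]">
        <Error>
          <Description>[F-CAT248] CatalogueLine.ID must be unique within the document instance</Description>
          <Duplicated-ID>{cbc:ID}</Duplicated-ID>
          <XPath>{path()}</XPath>
        </Error>
      </xsl:for-each>
    </Errors>
  </xsl:template>

</xsl:stylesheet>
```

The idea is that the processor builds the index once and each lookup then avoids rescanning the siblings. Unlike the grouping version, this reports every repeated line rather than one error per duplicated ID.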

If you can use EE, I would also suggest testing whether XSLT 3 streaming (capturing the IDs in a map, for instance) is faster for large data sets than traditional tree-based processing that navigates the sibling axes.
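A rough sketch of what such a streaming check could look like, assuming Saxon EE and UBL 2.x namespace URIs (adjust them if your files differ). The `Errors` wrapper element and the parameter name `seen` are made up for illustration, and I have not run this against your test files:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
    xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
    exclude-result-prefixes="#all"
    expand-text="yes">
  <!-- Namespace URIs above are an assumption (UBL 2.x). -->

  <!-- Process the input as a stream instead of building a tree. -->
  <xsl:mode streamable="yes"/>

  <xsl:template match="/">
    <Errors>
      <!-- Visit the cbc:ID child of each CatalogueLine in document order. -->
      <xsl:iterate select="*/cac:CatalogueLine/cbc:ID">
        <!-- Map of the IDs encountered so far; starts empty. -->
        <xsl:param name="seen" as="map(xs:string, xs:boolean)" select="map{}"/>
        <!-- The only consuming operation in the body: read the ID value. -->
        <xsl:variable name="id" as="xs:string" select="string(.)"/>
        <xsl:if test="map:contains($seen, $id)">
          <Error>
            <Description>[F-CAT248] CatalogueLine.ID must be unique within the document instance</Description>
            <Duplicated-ID>{$id}</Duplicated-ID>
          </Error>
        </xsl:if>
        <!-- Carry the updated map over to the next ID. -->
        <xsl:next-iteration>
          <xsl:with-param name="seen" select="map:put($seen, $id, true())"/>
        </xsl:next-iteration>
      </xsl:iterate>
    </Errors>
  </xsl:template>

</xsl:stylesheet>
```

Only the map of IDs is held in memory, not the document tree. The sketch leaves out the XPath of the offending line to keep the streamability analysis simple.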

Performance issue when using “following-sibling”
