Sloooow XPath evaluation with large files

dsewell
Posts: 125
Joined: Mon Jun 09, 2003 6:02 pm
Location: Charlottesville, Virginia USA

Post by dsewell »

oXygen takes a very long time to evaluate an XPath expression and return the results for large files. For example, with a 1.3 MB TEI document, evaluating //body takes nearly a minute on a dual G4 Power Mac. The same XPath query takes under a second to process using the GNU libxslt2.

Is this just a function of the Java XML parser? Can anything be done to improve performance?
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Post by george »

Hi David,

Please give us more details.
Do you see the same delay when you run an XPath like /*?
How many results do you get?
What is the size of your document?

Also, please check that the TEI catalog is set.

Thanks,
George

More details

Post by dsewell »

There are delays with any XPath expression. Expressions that return many results are somewhat slower than ones that return a single node or only a few nodes. An expression that points to a nonexistent node, like /foo/bar/baz, also takes a very long time to return a null result. For example, it takes about 75 seconds to return results on an XML file of 1438094 bytes. For comparison, if I write an XSLT script to return each /foo/bar/baz, the transformation using Saxon takes about 6 seconds on my system.

I do have the TEI catalog set in my preferences. But there is no difference in performance if I remove the DOCTYPE declaration and do the same search on the file.
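For anyone who wants to reproduce this kind of measurement outside the editor, here is a minimal sketch that times the three expressions mentioned in this thread using the standard JAXP XPath API. It is only an illustration: the class name XPathTiming and the tiny inline document are made up, the XPath engine is whichever one the JDK bundles (not necessarily the one oXygen uses), and a real test would parse the large file instead of a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of oXygen: times XPath queries over a DOM.
final class XPathTiming {

    // Parse an XML string into a DOM tree (parsing is kept separate
    // so that it is not counted in the query time).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Evaluate the expression as a node-set and return the number of hits.
    static int countMatches(Document doc, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                    expression, doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in sample; replace with the real large file for a meaningful test.
        String xml = "<TEI><text><body><p>one</p><p>two</p></body></text></TEI>";
        Document doc = parse(xml);
        for (String expr : new String[] {"/*", "//body", "/foo/bar/baz"}) {
            long start = System.nanoTime();
            int count = countMatches(doc, expr);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println(expr + ": " + count + " node(s), " + micros + " us");
        }
    }
}
```

The first query usually includes one-time setup cost, so it is worth running each expression several times before comparing numbers.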

Post by george »

Hi David,

For a 1 MB document I get the result of /foo/bar/baz in about 2 seconds. Can you zip a document and send it to support@oxygenxml.com so we can see whether we get the same results here? If this is not possible, let us know and we will point you to a document so that we can run similar tests.

We are using the Xalan XPath API. What time do you get if you run the XSLT script with Xalan?

Best Regards,
George

Post by dsewell »

George -- I will email a file to support.

If I use the Xalan transformer in the XSLT configuration instead of Saxon, it processes the XSLT script to return /foo/bar/baz in about 4 seconds (compared to 6 for Saxon).

The choice of XSLT transformer should not affect the behavior of the XPath toolbar search, should it?

Post by george »

(Just to update the forum entry)

The XPath query takes longer than running a stylesheet because we set some properties on the Transformer when running the query, so that it reports the location information needed to locate the result hits in the document.

The fix is a medium-term one: it involves rewriting the XPath support to use Saxon instead of Xalan.

The choice of the XSLT transformer engine does not affect the XPath execution time.

Best Regards,
George
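To illustrate what "location information" means here, the sketch below shows one generic, JDK-only way to map XPath hits back to source lines: build the DOM from SAX events and store the parser's line number on each element. This is an assumption-laden illustration, not the mechanism oXygen or Xalan uses; the class name LocatedXPath and the "line" key are invented for the example, and comments and processing instructions are ignored for brevity.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: records a source line for every element it builds.
final class LocatedXPath {
    static final String LINE_KEY = "line"; // made-up user-data key

    // Build a DOM from SAX events, tagging each element with the line
    // on which its start tag ends (as reported by the SAX Locator).
    static Document parseWithLines(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Deque<Node> stack = new ArrayDeque<>();
            stack.push(doc);
            DefaultHandler handler = new DefaultHandler() {
                private Locator locator;

                @Override public void setDocumentLocator(Locator l) {
                    locator = l;
                }

                @Override public void startElement(String uri, String local,
                        String qName, Attributes atts) {
                    Element e = doc.createElement(qName);
                    for (int i = 0; i < atts.getLength(); i++) {
                        e.setAttribute(atts.getQName(i), atts.getValue(i));
                    }
                    e.setUserData(LINE_KEY, locator.getLineNumber(), null);
                    stack.peek().appendChild(e);
                    stack.push(e);
                }

                @Override public void endElement(String uri, String local,
                        String qName) {
                    stack.pop();
                }

                @Override public void characters(char[] ch, int start, int len) {
                    stack.peek().appendChild(
                            doc.createTextNode(new String(ch, start, len)));
                }
            };
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // Evaluate the expression and return the recorded line of each hit
    // (null for hits that are not elements).
    static List<Integer> linesOf(Document doc, String expression) {
        try {
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<Integer> lines = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                lines.add((Integer) hits.item(i).getUserData(LINE_KEY));
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<TEI>\n<text>\n<body>\n<p>one</p>\n</body>\n</text>\n</TEI>";
        Document doc = parseWithLines(xml);
        System.out.println("//body found on line(s) " + linesOf(doc, "//body"));
    }
}
```

Carrying this extra bookkeeping for every node is the sort of overhead that a plain transformation does not need, which is consistent with the explanation above, although the actual cost inside the editor depends on its own implementation.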