
RE: [xsl] use XSLT or XQuery in Saxon?

From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 6 Jan 2005 12:10:46 -0000

> > I have an extremely large (over 300 MB) XML file and tens
> > of thousands of small XML files generated after
> > applying various XSLT transformations to the one big XML file.
> I don't know whether Mr Kay has tested Saxon with 100+ MB files or not,
> but we did (6.5.?), and could not get a simple transform to complete
> within hours (I think we gave up after ~4 hours on an 80-100 MB file),
> on a machine with 1 GB of RAM.

I've only gone up to about 50 MB myself, but I know of users who've gone up
to 200 MB.

For one Saxonica client I managed to get the processing time for a 40 MB
transformation down from 90 minutes to 45 seconds. Once you've allocated
enough memory, if it still takes hours then it's because there's a
non-linearity in the stylesheet logic, and this can usually be eliminated by
careful use of keys, sorting, or grouping.
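
As a rough illustration of the point about keys (not the client's
stylesheet, which isn't shown here): a join written as a predicate such as
/data/customer[@id = current()/@cust] may rescan every customer for every
order, which is quadratic on a naive processor, whereas xsl:key lets the
processor build an index once. The sketch below runs a small XSLT 1.0
stylesheet through the standard JAXP API. The element names, the key name
cust-by-id, and the class name are all invented for the example, and it uses
whichever XSLT processor JAXP finds on the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class KeyedLookupDemo {
    // Hypothetical stylesheet: prints the customer name for each order.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      // Declare the key once; the processor can index customers by @id.
      + "<xsl:key name='cust-by-id' match='customer' use='@id'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='/data/order'>"
      // Keyed lookup replaces /data/customer[@id = current()/@cust]/@name
      + "<xsl:value-of select=\"key('cust-by-id', @cust)/@name\"/>"
      + "<xsl:text>;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String transform(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<data>"
            + "<customer id='c1' name='Ann'/><customer id='c2' name='Bob'/>"
            + "<order cust='c2'/><order cust='c1'/><order cust='c2'/>"
            + "</data>";
        System.out.println(transform(xml));
    }
}
```

Whether a given processor really builds an index for a key is an
implementation detail, so it is worth timing both forms on realistic data.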

But I do agree with you that there are some problems that are better tackled
with a SAX-based Java application, or sometimes a SAX filter as a precursor
to an XSLT transformation.
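
A minimal sketch of the "SAX filter as a precursor" idea, assuming the goal
is to drop unwanted subtrees before the XSLT processor ever builds its tree.
The class names and the element name passed in are hypothetical, and an
identity transform stands in for the real stylesheet; only standard SAX and
JAXP classes are used.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class PruningFilterDemo {
    /** Drops every element with the given local name, plus its whole subtree. */
    static class SkipElementFilter extends XMLFilterImpl {
        private final String skipName;
        private int skipDepth = 0; // > 0 while inside a skipped subtree

        SkipElementFilter(XMLReader parent, String skipName) {
            super(parent);
            this.skipName = skipName;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (skipDepth > 0 || local.equals(skipName)) { skipDepth++; return; }
            super.startElement(uri, local, qName, atts);
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (skipDepth > 0) { skipDepth--; return; }
            super.endElement(uri, local, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (skipDepth == 0) super.characters(ch, start, length);
        }
    }

    static String pruneAndTransform(String xml, String skipName) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XSLT processors expect namespace-aware events
        XMLReader filter = new SkipElementFilter(spf.newSAXParser().getXMLReader(), skipName);

        // Identity transform used as a placeholder for the real stylesheet.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r><keep>a</keep><noise><x>b</x></noise><keep>c</keep></r>";
        System.out.println(pruneAndTransform(xml, "noise"));
    }
}
```

Because the filter discards events as they stream past, the skipped
subtrees never reach the transformer's in-memory tree.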

Michael Kay
> I wrote a custom transformer in Java doing exactly what we needed, using:
>  *  SAX events
>  *  Only keeping one branch/leaf of the XML tree in memory at any time.
>  *  Aggregation of content into small mutable value objects, which were
>     output and discarded when completed.
> 1500 files, varying from 360 MB to ~10 MB, of a total of ~10 GB could be
> processed at a linear speed of ~2 MB per second, or close to the disk
> drive speed, on a dual CPU workstation.
> I suspect that you will end up in 'custom transformer' territory, but
> perhaps Saxon has improved and can deal with the transforms you give it.
> I suggest that you make some simple tests first, which somewhat resemble
> what you need to do later.
> Cheers
> Niclas
> -- 
> ---------------
> If at first you don't succeed, destroy all evidence that you tried.
>  -  Steven Wright
> +---------//-------------------+
> |   http://www.dpml.net        |
> |  http://niclas.hedhman.org   |
> +------//----------------------+
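
The custom-transformer approach described in the quoted message might look
roughly like the sketch below. It is not Niclas's code: the record and title
element names, the Record class, and the Consumer-based sink are assumptions
made for illustration. The handler holds a single small mutable object at a
time, hands it to the sink when its end tag arrives, and then drops the
reference.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class StreamingRecordsDemo {
    /** Hypothetical value object: one per <record>, discarded once emitted. */
    static class Record {
        String id;
        final StringBuilder title = new StringBuilder();
    }

    static class RecordHandler extends DefaultHandler {
        private final Consumer<Record> sink;
        private Record current;   // the only record held in memory
        private boolean inTitle;

        RecordHandler(Consumer<Record> sink) { this.sink = sink; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (qName.equals("record")) {
                current = new Record();
                current.id = atts.getValue("id");
            } else if (qName.equals("title") && current != null) {
                inTitle = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Text may arrive in several chunks, so append rather than assign.
            if (inTitle) current.title.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (qName.equals("title")) {
                inTitle = false;
            } else if (qName.equals("record") && current != null) {
                sink.accept(current); // output the completed record...
                current = null;       // ...and let it be garbage collected
            }
        }
    }

    static void process(InputSource in, Consumer<Record> sink) throws Exception {
        // Default factory is not namespace-aware, so qName carries the element name.
        SAXParserFactory.newInstance().newSAXParser().parse(in, new RecordHandler(sink));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<records>"
            + "<record id='1'><title>First</title></record>"
            + "<record id='2'><title>Second</title></record>"
            + "</records>";
        process(new InputSource(new StringReader(xml)),
                r -> System.out.println(r.id + ": " + r.title));
    }
}
```

Memory use stays proportional to the largest single record rather than to
the file, which is what makes roughly linear throughput on very large inputs
plausible.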
