Re: [xsl] Performance tips to speed up multiple transforms


Subject: Re: [xsl] Performance tips to speed up multiple transforms
From: Liam R E Quin <liam@xxxxxx>
Date: Fri, 05 Nov 2010 19:25:41 +0100

On Fri, 2010-11-05 at 18:12 +0000, Neil Owens wrote:
> Evening all
> 
> I'm transforming around 500 individual XML log files - ~250MB worth -
> into 4 o[...]  I guess it's slow 'cos I'm shelling out to a hidden
> command box every time I transform a log file (open a shell, start
> java, open the Saxon .jar file, run the transform, append the 4
> results to the 4 'master' output files, close everything, repeat).
> 
> What would the smart person be looking at to speed up the process?
> Write the whole script in Java?  .NET?  C++... Or just make a
> Transform 'do' the whole thing?

I'd probably consider a single transform.

You might run into memory problems -- as I recall, there's a way
to tell Saxon you've finished with a particular input document
(the saxon:discard-document() extension function, if I remember
right), which may help. Longer term, the new XSLT 3 streaming
support should help even more, but the design isn't final yet.

An alternative for you might be four separate passes, one for
each output file.

I admit sometimes I use a Perl or sed script (on Linux) to
combine lots of small XML files, then check the result (e.g.
against a DTD or schema, or just for well-formedness) and then
use XSLT to split it up in some different way.
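
A minimal sketch of that combine-then-check step, written in Python
rather than Perl or sed. The function names, the "logs" wrapper
element and the *.xml pattern are made up for illustration; it
assumes the input files are UTF-8, have no DOCTYPE declaration, and
that the output file is written somewhere outside the input
directory.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# A leading XML declaration, e.g. <?xml version="1.0" encoding="UTF-8"?>
XML_DECL = re.compile(r'^\s*<\?xml\s[^>]*\?>\s*')


def combine_xml_files(input_dir, output_file, wrapper="logs"):
    """Concatenate every *.xml file in input_dir inside one wrapper element.

    Returns the number of files combined. The wrapper name is hypothetical.
    """
    paths = sorted(Path(input_dir).glob("*.xml"))
    with open(output_file, "w", encoding="utf-8") as out:
        out.write(f"<{wrapper}>\n")
        for path in paths:
            # utf-8-sig drops a byte order mark if the file has one
            text = path.read_text(encoding="utf-8-sig")
            # An XML declaration is only legal at the very start of a
            # document, so strip it from each file before appending
            out.write(XML_DECL.sub("", text, count=1).rstrip() + "\n")
        out.write(f"</{wrapper}>\n")
    return len(paths)


def is_well_formed(path):
    """Return True if the file parses as XML, False otherwise."""
    try:
        ET.parse(str(path))
        return True
    except ET.ParseError:
        return False
```

If is_well_formed() returns True for the combined file, a single
XSLT run over it can then do the splitting. Validating against a
DTD or schema would need a separate tool; this only checks
well-formedness.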

Liam

-- 
Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
Pictures from old books: http://fromoldbooks.org/
Ankh: irc.sorcery.net irc.gnome.org www.advogato.org

