RE: [xsl] parsing large xml files using Saxon 6.5.2


Subject: RE: [xsl] parsing large xml files using Saxon 6.5.2
From: "Michael Kay" <mhk@xxxxxxxxx>
Date: Mon, 11 Aug 2003 17:00:57 +0100

So what's the difference between the 18.1Mb run that ran "for hours",
and the 19.2Mb run that ran in 26 seconds? Somewhere there is a
significant difference that explains the problem, and you haven't given
us enough information to find it.

Running with the -T option can be useful. It will produce far more
information than you can analyse, and will slow down processing
considerably, but it should give you some indication as to whether the
processing is hung, looping, or just doing a lot of work.

The evidence of your measurements is that the stylesheet's performance
is essentially linear.
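
As a rough check on that, here is a minimal sketch that divides each reported time by the file size. The figures are the ones quoted for little.xml further down; the variable and function names are purely illustrative.

```python
# Sizes (MB) and execution times (seconds) as reported for little.xml
# in the quoted message below.
runs = [(1.4, 1.2), (4.4, 3.3), (7.3, 6.0), (10.3, 9.8), (19.2, 26.1)]


def seconds_per_mb(measurements):
    """Return (size, seconds, seconds / size) for each measurement."""
    return [(size, secs, secs / size) for size, secs in measurements]


rates = seconds_per_mb(runs)
for size, secs, rate in rates:
    print(f"{size:5.1f} MB  {secs:5.1f} s  {rate:.2f} s/MB")

# Pessimistic estimate for the 18.1 MB file, using the slowest observed rate.
worst_rate = max(rate for _, _, rate in rates)
estimate_18_1 = worst_rate * 18.1
print(f"Estimated time for 18.1 MB: about {estimate_18_1:.0f} s")
```

The cost per megabyte drifts upwards with size but stays within a factor of two, so on size alone the 18.1Mb file should take well under a minute, not hours.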

I would advise, by the way, moving off Instant Saxon to full Saxon for
any serious work. The Microsoft Java VM is now a thing of the past, so
any benefits that Instant Saxon once offered have pretty well
disappeared.

Michael Kay

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx 
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of marina
> Sent: 11 August 2003 13:36
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] parsing large xml files using Saxon 6.5.2
> 
> 
> Hi,
> 
> I am having problems parsing some xml files. I have a
> 1GHz processor and 256MB of RAM.
> 
> The XSLT stylesheet "wordgroup.xsl" from Dimitri (thank you!) was tested
> and worked perfectly on smaller test files. When I run it on a larger
> file, "1cl.xml" (18.1Mb), it builds the tree for strSplit-to-Words.xsl
> and then sits there for hours.
> 
> See output below.
> 
> ----------------------------------------------------------------
> Microsoft Windows 2000 [Version 5.00.2195]
> (C) Copyright 1985-2000 Microsoft Corp.
> 
> h:\saxon\testbed>saxon -t -o output.txt 1cl.xml wordgroup.xsl
> SAXON 6.5.2 from Michael Kay
> Java version 1.1.4
> Preparation time: 371 milliseconds
> Processing file:/h:/saxon/testbed/1cl.xml
> Building tree for file:/h:/saxon/testbed/1cl.xml using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 7070 milliseconds
> Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 10 milliseconds
> 
> ----------------------------------------------------------------
> 
> 
> So I made another xml file, "little.xml", by pasting sections of 1cl.xml
> in different sizes to see where it was having problems processing.
> 
> little.xml =  1.4Mb  time =  1.2 sec
> little.xml =  4.4Mb  time =  3.3 sec
> little.xml =  7.3Mb  time =  6 sec
> little.xml = 10.3Mb  time =  9.8 sec
> little.xml = 19.2Mb  time = 26.1 sec! (bigger than the file I want to
> parse! See nice output below)
> 
> 
> h:\saxon\testbed>saxon -t -o output.txt little.xml wordgroup.xsl
> SAXON 6.5.2 from Michael Kay
> Java version 1.1.4
> Preparation time: 701 milliseconds
> Processing file:/h:/saxon/testbed/little.xml
> Building tree for file:/h:/saxon/testbed/little.xml using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 7912 milliseconds
> Building tree for file:/h:/saxon/testbed/strSplit-to-Words.xsl using class com.icl.saxon.tinytree.TinyBuilder
> Tree built in 20 milliseconds
> Execution time: 26178 milliseconds
> 
> Any ideas for me to try?
> 
> Thanks
> 
> Marina
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 




