Re: [xsl] Streaming with XSLT version 3.0


Subject: Re: [xsl] Streaming with XSLT version 3.0
From: Terry Badger <terry_badger@xxxxxxxxx>
Date: Sat, 8 Mar 2014 13:14:43 -0800 (PST)

Michael,
I ran the process successfully from the command line; my notes are
below. I have reported the problem to Oxygen.
Details for running a large file with XSLT 3.0 streaming
==========
The large source file is found here:
http://dumps.wikimedia.org/enwiki/20130403/enwiki-20130403-pages-articles-multistream.xml.bz2
==========
Here is the result of running Saxon from a DOS shell: a respectable
21 minutes and no out-of-memory report.
C:\Temp\wiki>C:\Progra~2\Java\jre7\bin\java -Xmx180m -Xss4096k -Xms48m -cp C:/saxon/saxon9ee.jar; net.sf.saxon.Transform -TJ -t -it:main -o:C:/Temp/wiki/out/wiki-03-output.xml C:/Temp/wiki/xsl/wiki-03.xsl
Saxon-EE 9.5.1.4J from Saxonica
Java version 1.7.0_45
Using license serial number V001638
Generating byte code...
Stylesheet compilation time: 476 milliseconds
Processing  (no source document) initial template = main
URIResolver.resolve href="../source/enwiki.xml" base="file:/C:/Temp/wiki/xsl/wiki-03.xsl"
Using parser com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Writing to file:/C:/Temp/wiki/out/output-wiki-03.xml
Execution time: 21m 24.612s (1284612ms)
Memory used: 25491272
NamePool contents: 28 entries in 27 chains. 7 URIs
==========
With this XSLT stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.mediawiki.org/xml/export-0.8/"
    xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
    exclude-result-prefixes="#all"
    version="3.0">
    <xsl:output method="xml"/>
    <xsl:variable name="root" select="/"/>
    <xsl:mode streamable="yes"/>
    <xsl:template name="main">
        <xsl:stream href="../source/enwiki.xml">
            <xsl:result-document href="../out/output-wiki-03.xml">
                <count>
                    <xsl:iterate select="mediawiki/page">
                        <xsl:param name="count" select="0" as="xs:decimal"/>
                        <xsl:next-iteration>
                            <xsl:with-param name="count" select="$count+1"/>
                        </xsl:next-iteration>
                        <xsl:on-completion>
                            <xsl:value-of select="$count"/>
                        </xsl:on-completion>
                    </xsl:iterate>
                </count>
            </xsl:result-document>
        </xsl:stream>
    </xsl:template>
</xsl:stylesheet>
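For comparison, a shorter variant that should produce the same count
without xsl:iterate. This is an untested sketch, not something that was
run against the dump: it assumes the processor accepts count() over a
single downward path as streamable inside xsl:stream, and it reuses the
same relative file locations as the stylesheet above. The unused "root"
variable and the xs namespace are left out. Note that xsl:stream is the
draft-era instruction name used by Saxon 9.5; the final XSLT 3.0
Recommendation spells it xsl:source-document with streamable="yes".

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.mediawiki.org/xml/export-0.8/"
    xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
    exclude-result-prefixes="#all"
    version="3.0">
    <xsl:output method="xml"/>
    <xsl:template name="main">
        <!-- Same source and result locations as the stylesheet above -->
        <xsl:stream href="../source/enwiki.xml">
            <xsl:result-document href="../out/output-wiki-03.xml">
                <count>
                    <!-- One downward selection over the streamed input;
                         assumed streamable, not verified on Saxon 9.5 -->
                    <xsl:value-of select="count(mediawiki/page)"/>
                </count>
            </xsl:result-document>
        </xsl:stream>
    </xsl:template>
</xsl:stylesheet>
```

It would be invoked the same way as above, with -it:main and no source
document on the command line.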
============
With this result file:
<?xml version="1.0" encoding="UTF-8"?>
<count xmlns="http://www.mediawiki.org/xml/export-0.8/">13355093</count>
============
Running the same source and stylesheet in Oxygen 15.2 with Saxon
9.5.1.3 ended in an out-of-memory error after about an hour. I have
reported it to Oxygen.

 

On Saturday, March 8, 2014 5:43 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
Could you try it outside oXygen? You can get a 30-day free Saxon-EE
evaluation license to enable this. That will establish whether the
problem is primarily a Saxon one or an oXygen one, which will make it
a lot easier to help you.

Michael Kay
Saxonica

On 7 Mar 2014, at 23:10, Terry Badger <terry_badger@xxxxxxxxx> wrote:

> David,
> Thank you. I tried your suggestion but it still failed with an
> out-of-memory report.
> Terry
>
> On Friday, March 7, 2014 9:10 AM, David Rudel <fwqhgads@xxxxxxxxx> wrote:
>
> Terry,
> You can address the possibility that oXygen is simply choking on the
> output by wrapping your output in <xsl:result-document> instructions.
>
> If you pipe output to a file, oXygen does not attempt to display it in
> the application when the scenario completes. This would eliminate at
> least one possible reason for the crash without requiring you to run
> from the command line.
>
> -David
>
> On Fri, Mar 7, 2014 at 1:09 AM, Abel Braaksma (Exselt) <abel@xxxxxxxxxx> wrote:
>
>> It is also important to try to find out what is actually causing the
>> memory exception. If you run it from oXygen like you say, it is very
>> well possible that the exception comes from oXygen itself, not capable
>> of handling the output file. This would explain the late memory
>> exception. To find this out, simply run it from the command line, and
>> watch what happens to memory in task manager.
>
>
> -- 
>
> "A false conclusion, once arrived at and widely accepted is not
> dislodged easily, and the less it is understood, the more tenaciously
> it is held." - Cantor's Law of Preservation of Ignorance.
>
> --~------------------------------------------------------------------
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
> or e-mail: <mailto:xsl-list-unsubscribe@xxxxxxxxxxxxxxxxxxxxxx>
> --~--