
Re: [xsl] Streaming with XSLT version 3.0

Subject: Re: [xsl] Streaming with XSLT version 3.0
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Thu, 6 Mar 2014 16:45:21 +0000

On 6 Mar 2014, at 12:58, Terry Badger <terry_badger@xxxxxxxxx> wrote:

> I have a 42 GB media and valid XML file as my source. I am using Oxygen 15.2
> with Saxon-EE. I am using this stylesheet, in which, as you can see, I have
> turned off what I wanted to do to see if I could get to the end of the file.
> After about 45 minutes it hits my memory limit and quits. Am I doing this
> right or will I need to cut this elephant into pieces?
> Terry

This should work. The most common reason for this kind of problem is that
people try to supply the input document as a conventional source document as
well as supplying it to xsl:stream, so it builds a tree in memory anyway.
However, that would normally fail much sooner than 45 minutes - though you
didn't say how much memory was available. Do you get a stack trace at the
point where it runs out of memory?

(I know you don't want to run for 45 minutes just to find out! I find it very
useful when developing streaming applications to have something smaller to
play with, preferably about 1 GB. Of course, creating such a file is much
easier once you have got streaming working!)

Michael Kay

> <xsl:stylesheet
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:xs="http://www.w3.org/2001/XMLSchema"
> xmlns="http://www.mediawiki.org/xml/export-0.8/"
> xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
> exclude-result-prefixes="#all"
> version="3.0">
> <xsl:output method="xml"/>
> <xsl:template name="main">
> <xsl:stream href="../source/enwiki.xml">
> <!--  <xsl:result-document href="../out/output-wiki-02.xml">
> <xsl:for-each select="mediawiki">
> <xsl:element name="mediawiki">
> <xsl:for-each select="page[position() &lt; 10]">
> <xsl:copy-of select="."/>
> </xsl:for-each>
> </xsl:element>
> </xsl:for-each>
> </xsl:result-document>-->
> </xsl:stream>
> </xsl:template>
> </xsl:stylesheet>
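
For the smaller test file suggested above, one possible approach is a streaming stylesheet that copies only the first few pages and then stops. The following is a minimal sketch, not a tested recipe: it keeps the xsl:stream instruction used in this thread (the final XSLT 3.0 Recommendation renamed it to xsl:source-document with streamable="yes"), and the sample-size parameter and the enwiki-sample.xml output name are hypothetical. It assumes the stylesheet is run from the named template main with no principal source document, so that the 42 GB file is only ever read through xsl:stream.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.mediawiki.org/xml/export-0.8/"
    xpath-default-namespace="http://www.mediawiki.org/xml/export-0.8/"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output method="xml"/>

  <!-- Hypothetical parameter: how many page elements to keep in the sample. -->
  <xsl:param name="sample-size" select="1000"/>

  <!-- Assumption: invoked as the initial named template (with Saxon on the
       command line, -it:main and no -s: option), so the big file is not also
       loaded as a conventional in-memory source document. -->
  <xsl:template name="main">
    <xsl:stream href="../source/enwiki.xml">
      <!-- Hypothetical output location for the sample file. -->
      <xsl:result-document href="../out/enwiki-sample.xml">
        <mediawiki>
          <!-- xsl:iterate visits the pages one at a time in document order;
               xsl:break ends the iteration once enough have been copied. -->
          <xsl:iterate select="mediawiki/page">
            <xsl:choose>
              <xsl:when test="position() gt $sample-size">
                <xsl:break/>
              </xsl:when>
              <xsl:otherwise>
                <!-- Each page subtree is small enough to copy whole. -->
                <xsl:copy-of select="."/>
              </xsl:otherwise>
            </xsl:choose>
          </xsl:iterate>
        </mediawiki>
      </xsl:result-document>
    </xsl:stream>
  </xsl:template>
</xsl:stylesheet>
```

The choice of xsl:iterate with xsl:break over a positional predicate such as page[position() &lt; 10] is deliberate: the break gives the processor an opportunity to stop reading the input early, whereas the predicate form may still scan to the end of the file. Note that this sketch drops siteinfo and the attributes of the root element, so the result is a test fixture rather than a valid MediaWiki export.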
