Re: [xsl] How to stream-process non-XML text using unparsed-text-lines( ) ?

Subject: Re: [xsl] How to stream-process non-XML text using unparsed-text-lines( ) ?
From: "mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 24 Jul 2014 09:36:55 -0000

In Saxon, just use xsl:for-each with a select expression that calls
unparsed-text-lines().

IIRC the Saxon 9.5 implementation is streamed, but its definition of
"lines" is based on Java readLine(), which has some subtle differences
from the definition in the XPath spec as to exactly what counts as a
line.

xsl:stream is only for use with XML input.
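
A minimal sketch of the xsl:for-each approach, not a tested stylesheet. The
parameter name input-uri, the file name books.txt, the template name main,
and the output element names are all hypothetical, chosen only for
illustration. Whether the lines are actually streamed depends on the
processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: URI of the text file. A relative URI is
       resolved against the base URI of the stylesheet. -->
  <xsl:param name="input-uri" select="'books.txt'"/>

  <!-- Named entry point: no XML source document is involved,
       so xsl:stream is not used. -->
  <xsl:template name="main">
    <books>
      <!-- Each item in the selected sequence is one line, as a string. -->
      <xsl:for-each select="unparsed-text-lines($input-uri)">
        <!-- Split the slash-separated fields of the current line. -->
        <xsl:variable name="fields" select="tokenize(., '/')"/>
        <book>
          <title><xsl:value-of select="$fields[1]"/></title>
          <author><xsl:value-of select="$fields[2]"/></author>
        </book>
      </xsl:for-each>
    </books>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this could be run from the named template, for example with the
-it:main command-line option. To count the lines, as in the question, the
body of the template could instead evaluate
count(unparsed-text-lines($input-uri)); whether that expression is evaluated
in a streamed fashion is again up to the processor.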

> Hi Folks,

> In Abel Braaksma's XML London paper he writes:

> This paper discusses streaming of XML, but XPath 3.0
> introduces a new function, fn:unparsed-text-lines,
> which takes an external resource as input and parses it
> line by line. The original intent of that function was to
> allow unparsed data to be streamed; however, the
> Working Group at some point decided to not formalize
> this requirement. But, the specification leaves
> enough room for implementers to allow streamed
> processing of data read through this function. When
> your intention is to do streaming of unparsed input, you
> should check the capabilities of your processor to find
> out whether it can do streaming using this function.

> Neat!

> So, I have two questions:

> 1. Do any XSLT processors support streaming of non-XML text using
> unparsed-text-lines( )?

> 2. Suppose the XSLT processor is streaming the strings returned by
> calling unparsed-text-lines( ), how would I construct the XSLT?

> Example: Here is a non-XML text file:

> Six Great Ideas/Mortimer J. Adler/1981/0-02-072020-3/Macmillan Company
> Illusions/Richard Bach/1977/0-440-34319-4/Dell Publishing Co.
> The First and Last Freedom/J. Krishnamurti/1954/0-06-064831-7/Harper & Row

> I want to stream-process it. I want to count the number of lines. Below is
> my attempt at an XSLT program to implement this. I am not sure that the
> code even makes sense. Does it? What is the right way to do it?

> -----------------------------------------------------
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>                 version="3.0">
>   <xsl:output method="xml" />
>   <xsl:template match="/">
>     <xsl:stream href="...">
>       <count>
>         <xsl:value-of select="count(tokenize(., '\n'))" />
>       </count>
>     </xsl:stream>
>   </xsl:template>
> </xsl:stylesheet>
> -----------------------------------------------------
