
Re: [xsl] Preferred declarative approach for outputting tallies based on complex triggers


Subject: Re: [xsl] Preferred declarative approach for outputting tallies based on complex triggers
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 10 Apr 2014 14:09:51 +0100

On 10/04/2014 13:20, David Rudel wrote:
On Thu, Apr 10, 2014 at 2:00 PM, David Carlisle <davidc@xxxxxxxxx>
wrote:

Well, quite. I was going to ask what you mean by
"declarative/non-declarative" and "updating state variables" in an
XSLT system.

Didn't you hear? Michael Kay showed us how to update variables in XSLT :-) (reference to another thread...)


if you mention saxon:assign you will be banished to comp.text.fortran


Seriously, I was just referring to using a variable of the same name
in a new scope, as in:
<xsl:next-iteration>
  <xsl:with-param name="var1" select="$new.var1"/>
</xsl:next-iteration>

Yes, well (I know :-) but since XSLT has both "variables" and "parameters", and that's a parameter, it's confusing in an XSLT context to call them variables.



I'd do something like this (untested)

<xsl:variable name="sids" select="31,35"/>
<xsl:variable name="a" select="(item[1],item[@id=$sids][1])[last()]"/>
<xsl:variable name="b" select="(item[last()],item[@id=$sids][last()])[last()]"/>
<xsl:variable name="s" select="$a|item[$a&lt;&lt;.][.&lt;&lt;$b]|$b"/>

no of items <xsl:value-of select="count($s)"/>
no of specials <xsl:value-of select="count($s[@id=$sids])"/>
avg <xsl:value-of select="sum($s/@value) div count($s)"/>


Thanks, David. I was trying to avoid this approach for performance considerations. I wanted to do this type of analysis in a single pass.

Shrug. I just work on the assumption of absolute faith in Michael's
ability to optimise whatever code I use to do the right thing at run time.

(Also, I don't think this specific implementation addresses the
possibility that an item with the same @id might show up multiple
times, and only the first one is "special." So I was expecting this
approach to need something like "for $s in $sids return
index-of($seq/@id, $s)[1]" to retrieve the positions of the special
items... this seemed to me to be quite expensive if there were several
such items and the number of $items is large.)

I didn't follow all the details, but usually the use of keys (as in Muenchian grouping) and/or for-each-group can avoid the quadratic behaviour of a double loop (though it might still be O(n log n) rather than the O(n) you'd get from a single pass).
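
A minimal sketch of the grouping idea (untested, XSLT 2.0 or later). It
assumes the items are <item id="..." value="..."/> children of a
hypothetical <items> element; the variable name "first-specials" is made
up for illustration. Grouping by @id visits each candidate item once, and
inside each group "." is the first item with that id, so no index-of
lookup per special id is needed:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:variable name="sids" select="(31, 35)"/>

  <!-- Assumed input shape: <items><item id="31" value="4"/>...</items> -->
  <xsl:template match="items">
    <!-- One group per distinct special id; keep only the first member,
         which is the only "special" occurrence of that id. -->
    <xsl:variable name="first-specials" as="element(item)*">
      <xsl:for-each-group select="item[@id = $sids]" group-by="@id">
        <xsl:sequence select="."/>
      </xsl:for-each-group>
    </xsl:variable>

    <xsl:text>no of specials </xsl:text>
    <xsl:value-of select="count($first-specials)"/>
  </xsl:template>

</xsl:stylesheet>
```

Groups come out in order of first appearance, so $first-specials[1] and
$first-specials[last()] could stand in for $a and $b in the sketch above
when at least one special item exists.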


So I was planning on using <xsl:iterate> and keeping track of information such as: Have I seen a special item yet? (This is what I mean by a "trigger", as it signals when to start keeping track of data.)

What $sids have I not seen yet? (This lets me judge whether a given
item is "special" since it is only special the first time.)

Yes, as Michael and I both said in our first replies, xsl:iterate is the same as a recursive function except that it is set up for streaming use.

What is the sum of all values for @value among items I _know_ to be in the subsequence I care about?

What is the sum of all values for @value that I have seen since the
last time I saw a special item? (This sum will be added to the above
 sum next time I see a special id for the first time.)

Getting two values out of a sequence without iterating over the sequence twice is of course a perennial problem in functional programming generally, and the answers are the same whatever the language: either don't worry about it (as in my code sketch), or use a recursive function (or an application of fold) with a function that returns some kind of structure that holds both results.
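
As a rough illustration of the fold variant (untested, and assuming an
XSLT 3.0 processor with map and higher-order function support), the
accumulator can be a map that carries both running results. The map keys
'n' and 'sum', the <items> wrapper and the assumption that every item has
a numeric @value are mine, not part of the original problem:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed input shape: <items><item id="31" value="4"/>...</items> -->
  <xsl:template match="items">
    <!-- A single left fold over the items; each step returns a new map
         holding the updated count and the updated total. -->
    <xsl:variable name="acc" select="
      fold-left(item, map{'n': 0, 'sum': 0},
        function($acc, $i) {
          map{'n':   $acc('n') + 1,
              'sum': $acc('sum') + xs:decimal($i/@value)}
        })"/>

    <xsl:value-of select="'count=', $acc('n'), ' sum=', $acc('sum')"
                  separator=""/>
    <!-- Guard the average so an empty sequence does not divide by zero. -->
    <xsl:if test="$acc('n') gt 0">
      <xsl:value-of select="' avg=', $acc('sum') div $acc('n')"
                    separator=""/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The same shape extends to more than two results by adding entries to the
map.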

And similar information like the above that can then be brought together at the end to give the results I want in a single pass.

-David

If your sequences are long enough that you need to worry about any of
this, especially if they are long enough that you don't want to hold
them all in memory, then xsl:iterate is likely to be the only approach
that doesn't run out of memory so "preferred" may not be the right
description.
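
For what it's worth, a single-pass xsl:iterate version might look
something like the following (untested sketch, XSLT 3.0). It assumes the
subsequence of interest runs from the first special item to the last
special item inclusive, that an id is special only the first time it
appears, and that every item has numeric @id and @value attributes. The
<items> wrapper and the parameter names (unseen, n, sum, pend-n, pend-sum,
specials) are invented for illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <xsl:variable name="sids" select="(31, 35)"/>

  <!-- Assumed input shape: <items><item id="31" value="4"/>...</items> -->
  <xsl:template match="items">
    <xsl:iterate select="item">
      <xsl:param name="unseen" select="$sids"/>  <!-- special ids not met yet -->
      <xsl:param name="specials" select="0"/>    <!-- special items met so far -->
      <xsl:param name="n" select="0"/>           <!-- items known to be in range -->
      <xsl:param name="sum" select="0"/>         <!-- their @value total -->
      <xsl:param name="pend-n" select="0"/>      <!-- items since the last special -->
      <xsl:param name="pend-sum" select="0"/>    <!-- their @value total -->

      <!-- Pending items after the last special are never committed. -->
      <xsl:on-completion>
        <xsl:value-of separator=""
          select="'items=', $n, ' specials=', $specials, ' sum=', $sum"/>
      </xsl:on-completion>

      <xsl:variable name="id" select="xs:integer(@id)"/>
      <xsl:variable name="v" select="xs:decimal(@value)"/>

      <xsl:choose>
        <!-- First sighting of a special id: commit the pending items
             plus this one, and stop treating this id as special. -->
        <xsl:when test="$id = $unseen">
          <xsl:next-iteration>
            <xsl:with-param name="unseen" select="$unseen[. ne $id]"/>
            <xsl:with-param name="specials" select="$specials + 1"/>
            <xsl:with-param name="n" select="$n + $pend-n + 1"/>
            <xsl:with-param name="sum" select="$sum + $pend-sum + $v"/>
            <xsl:with-param name="pend-n" select="0"/>
            <xsl:with-param name="pend-sum" select="0"/>
          </xsl:next-iteration>
        </xsl:when>
        <!-- Ordinary item after the trigger: hold it as pending. -->
        <xsl:when test="$specials gt 0">
          <xsl:next-iteration>
            <xsl:with-param name="pend-n" select="$pend-n + 1"/>
            <xsl:with-param name="pend-sum" select="$pend-sum + $v"/>
          </xsl:next-iteration>
        </xsl:when>
        <!-- Before the first special item nothing changes; the
             iteration continues with the parameters as they are. -->
      </xsl:choose>
    </xsl:iterate>
  </xsl:template>

</xsl:stylesheet>
```

The average is then $sum div $n whenever $n is non-zero. Whether a given
processor can actually stream this depends on the processor and on how
the input is read, so treat that part as an open question.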

David

