
Re: [xsl] things about grouping

Subject: Re: [xsl] things about grouping
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxx>
Date: Sat, 24 Nov 2012 14:17:00 +0000

On Sat, Nov 24, 2012 at 9:31 AM, Geert Bormans
<geert@xxxxxxxxxxxxxxxxxxx> wrote:

Have congregated all I want to respond to in one reply.

The title of the thread is "things about grouping". There is a reason
for that. Really the thread is about what I would call programming
language ergonomics.

>> 2. If your group-by pattern evaluates to an empty element
>> current-grouping-key() will yield nothing. However one may (as I did)
>> want to get at the attributes of such an element but this is not
>> possible.
> current-grouping-key() will not yield "nothing"
> it will yield the empty string (if a sequence of empty strings, duplicates
> are removed)
> All the element that have a grouping key = empty string, will be in the
> group with current-grouping-key() = ''

The issue is that in a scenario where you are grouping by an empty
element  that can turn out to be specificationally correct (sic) but
semantically nonsensical.

Let me contrive an example. Suppose you have some XML for running a
political campaign. You've collected a mass of unordered data,
amongst which is an element called blob whose content model consists of

- no child elements or text
- a variety of optional attributes - because we cannot force the
public to give us all the information we ask for and they may not know
some of the answers or their answers may straddle more than one

here is an example fragment

<stuff .....>
<blob sex="M" age="30" demographicClass="ABC1" ....../>
<moreStuff ....
<blob sex="M" ....../>
<anotherStuff ..>

<blob sex="M" age="31" demographicClass="ABC1"  occupationType=""/>

So I come along and write (for argument's sake)

<xsl:for-each-group select="*" group-by="blob">

The specification says current-grouping-key() gives an atomic value
and the atomic value of any empty element is the empty string. I do
not want to group by that.

I asked to group by a node. If I had wanted to group by the node's
atomic value then I could have asked to group by something like
blob/text() but I didn't.
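
For anyone following along, here is a minimal sketch of that behaviour as I understand it. The campaign and person wrapper elements and the output element names are made up for illustration; only blob and its attributes come from the example above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input (wrapper element names are assumptions):
       <campaign>
         <person><blob sex="M" age="30"/></person>
         <person><blob sex="F" age="31"/></person>
       </campaign> -->
  <xsl:template match="campaign">
    <groups>
      <!-- group-by="blob" atomizes each empty blob to the empty string,
           so both person elements fall into a single group with key '' -->
      <xsl:for-each-group select="person" group-by="blob">
        <by-element key="[{current-grouping-key()}]"
                    size="{count(current-group())}"/>
      </xsl:for-each-group>

      <!-- grouping on an attribute of blob yields one group per distinct value -->
      <xsl:for-each-group select="person" group-by="blob/@sex">
        <by-attribute key="{current-grouping-key()}"
                      size="{count(current-group())}"/>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

With the input in the comment, the first loop should emit one by-element with key="[]" and size="2", and the second one by-attribute for M and one for F.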

Moving on.

<xsl:for-each-group select="*" group-starting-with="blob">

Once again I have asked to group based on a node - this time the
specification will give me the empty sequence regardless of the atomic
value of the node. But I'd like to access the age attribute of blob so I
say current-grouping-key()/@age.

Aha you say - that won't type check so you will swiftly be
disabused of that notion..... but....

<xsl:variable name="x" select="current-grouping-key()"/>
<xsl:apply-templates select="current-group()">
   <xsl:with-param name="x" select="$x"/>
</xsl:apply-templates>

and now when you say $x/@age in the applied template rule it type checks!?

Plus when one has been groomed to expect nodes from the key()
function, by reason of consistency and symmetry why should one not
expect to get nodes from current-grouping-key()?

Perhaps if it were called current-grouping-value() that expectation
might not be there. In fact there's a suggestion (complete with
wide-eyed naivete): have current-grouping-key() return a node
corresponding to the grouped value and a function called
current-grouping-value() that returns the atomic value (although of
course if you returned the node you wouldn't need the other function
as you could navigate to the value). Hmmm?

> and you will have access to their attributes without a problem.
> You will have to expand on this to explain what exactly your problem is, so
> we can help you get around this

Understand, it's not a case of not knowing how to get around these things.

>> 3. If you group-starting-with it's weird to then see things not
>> encompassed by the starting-with expression appear as a group because
>> they precede the first occurrence of the starting-with expression.
>> While this works as specified  because for-each-group is a total
>> function over the population it does result in a mismatch between the
>> grammatical semantic and the exhibited behaviour, or to put it another
>> way it does what it says in the spec but  doesn't do what it says on
>> the tin.
> Thank God a programming language does as per specification ;-)

Generically applied, that statement destroys the raison d'etre for
quality assurance.
Works as designed. What if the design is crap?

Disclaimer - The above is a reply to the general sentiment expressed.
It is not an expression of opinion on any of the XSLT (or any other
language) specification(s).

> If you don't want that preceding group simply get rid of it using some test
> For me the semantics is right...

It is beyond reproach only if one chooses to think like a computer.
If I say starts with X and the specification contrives to make it
start with Y the issue is with the semantics of the specification -
not whether you can work around it (I know I can).
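
For the record, the kind of test Geert mentions might look something like the sketch below. The input document and the output element names are assumptions for illustration. It relies on the context item inside for-each-group being the first item of the current group, which is also one way to reach the attributes of the starting blob.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input:
       <stuff>
         <intro/>
         <blob age="30"/><note/>
         <blob age="31"/><note/>
       </stuff> -->
  <xsl:template match="stuff">
    <groups>
      <xsl:for-each-group select="*" group-starting-with="blob">
        <!-- The leading group (here just intro) does not start with a blob;
             the test on the context item skips it. -->
        <xsl:if test="self::blob">
          <!-- @age is read from the context item, i.e. the starting blob -->
          <group age="{@age}" size="{count(current-group())}"/>
        </xsl:if>
      </xsl:for-each-group>
    </groups>
  </xsl:template>

</xsl:stylesheet>
```

With the input in the comment this should produce two group elements (age 30 and age 31, each of size 2) and nothing for the leading intro.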

On Sat, Nov 24, 2012 at 9:37 AM, Geert Bormans
<geert@xxxxxxxxxxxxxxxxxxx> wrote:
> At 10:14 24/11/2012, you wrote:
>> On 24 November 2012 05:54, Ihe Onwuka <ihe.onwuka@xxxxxxxxx> wrote:
>> > 1. <xsl:apply-templates select="current-group() except blah"/>
>> >
>> > still applies templates to blah
>> It needs to be " except self::blah" otherwise it will look on the child
>> axis.
> In my understanding the first item in a group is selected as the context of
> the for-each-group
> current-group() is the sequence of all nodes in the group
> so I am not sure this should work in all cases.
> Correct me if I am wrong please, Andrew
> edit: I did some tests using Saxon 9.4PE
> and it seems there we need indeed
> <xsl:apply-templates select="current-group() except  current-group()/self::blah"/>

except is one of those things I keep tripping up on and if you say
there isn't an ergonomic issue I'd ask why you had to look it up to be

In just about as many keystrokes I can (and did do)

<xsl:apply-templates select="current-group()"/>
<xsl:template match="blah"/>

and save my brain cells for remembering something useful.
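
For comparison, a sketch of one more way to spell the filter, using a predicate rather than except. The input and the item/kept element names and the k attribute are made up for illustration; blah is the element from the discussion above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input:
       <stuff><item k="a"/><blah k="a"/><item k="b"/></stuff> -->
  <xsl:template match="stuff">
    <out>
      <xsl:for-each-group select="*" group-by="@k">
        <!-- "except blah" means "except child::blah" of the context item
             (the first item of the group), so it removes nothing here.
             The predicate below tests self::blah against each group member. -->
        <xsl:apply-templates select="current-group()[not(self::blah)]"/>
      </xsl:for-each-group>
    </out>
  </xsl:template>

  <xsl:template match="item">
    <kept k="{@k}"/>
  </xsl:template>

</xsl:stylesheet>
```

Whether this is any more memorable than the empty template rule is a matter of taste; it does keep the filtering next to the apply-templates call.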

The following  taught me the folly of habitually relying on
abbreviations - sometimes they don't save you any keystrokes and cost
you time.


So the take-home heuristic is to write out the unabbreviated syntax first
when things are not as expected.
