
Re: [xsl] XSLT2, collection(), and xsl:key


Subject: Re: [xsl] XSLT2, collection(), and xsl:key
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Fri, 1 Feb 2008 17:38:07 +0000

Create a variable that contains the element counts for each document,
something like:

<xsl:variable name="foo">
  <xsl:for-each select="collection()">
    <doc name="{document-uri(.)}">
      <xsl:for-each-group select="//*" group-by="name()">
        <elem name="{current-grouping-key()}"
              count="{count(current-group())}"/>
      </xsl:for-each-group>
    </doc>
  </xsl:for-each>
</xsl:variable>

That will give you:

<doc name="doc1.xml">
  <elem name="foo" count="20"/>
  <elem name="bar" count="44"/>
</doc>
<doc name="doc2.xml">
  <elem name="baz" count="1"/>
   ...


Then just use grouping again to generate the report.
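
A minimal sketch of that second pass, assuming the $foo variable above is
in scope and reusing the table/row/cell markup from the original question.
The template name "report" and the variables $names, $doc and $n are
made up for illustration, and distinct-values() stands in here for a
second xsl:for-each-group:

```xslt
<xsl:template name="report">
  <!-- Column headings: every element name seen in any document, sorted.
       as="item()*" keeps this a sequence of values, not a temporary tree. -->
  <xsl:variable name="names" as="item()*">
    <xsl:perform-sort select="distinct-values($foo/doc/elem/@name)">
      <xsl:sort select="."/>
    </xsl:perform-sort>
  </xsl:variable>
  <table>
    <row rend="label">
      <cell>document</cell>
      <xsl:for-each select="$names">
        <cell><xsl:value-of select="."/></cell>
      </xsl:for-each>
    </row>
    <!-- One row per document, one cell per element name -->
    <xsl:for-each select="$foo/doc">
      <xsl:variable name="doc" select="."/>
      <row>
        <cell><xsl:value-of select="@name"/></cell>
        <xsl:for-each select="$names">
          <xsl:variable name="n" select="."/>
          <!-- Falls back to 0 when the element does not occur in this document -->
          <cell><xsl:value-of select="($doc/elem[@name = $n]/@count, 0)[1]"/></cell>
        </xsl:for-each>
      </row>
    </xsl:for-each>
  </table>
</xsl:template>
```

This reports counts per element name only. To get one table per element
with a column per attribute value, as in the original question, the first
pass would presumably also need to record the attribute value alongside
each element name.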

cheers
andrew


On 01/02/2008, James Cummings <cummings.james@xxxxxxxxx> wrote:
> Hiya,
>
> I'm using the collection() function and Saxon to produce some
> statistics about how many elements of each type occur in a
> particular set of documents.
>
> Let's say that document one has something like:
>
> <p xml:id="doc1" type="hypothetical">
> There is some text with <seg type="foo">some foo</seg> and
> occasionally <seg type="blort">blort</seg> and <other
> type="wibble">wibble</other></p>
>
>
> and document two (and up to some really large number) is like:
>
> <p xml:id="doc2">
> There is another doc with <seg type="foo">some foo</seg> and
> occasionally <seg type="notBlort">notBlort</seg> and <other
> type="fluffy">fluffy other</other> and <some
>   name="thing">someThing</some></p>
>
> What I want to produce are tables of counts of specific elements, by
> document and type. So something like the following (though using
> table/row/cell xml markup):
>
>
> table: other
> document | fluffy | wibble | stuff
> doc1 | 0 | 1 | 0
> doc2 | 1 | 0 | 0
> doc3 | 20 | 12 | 54
>
> table: seg
> document | blort | foo | notBlort
> doc1 | 1 | 1 | 0
> doc2 | 0 | 1 | 1
> doc3 | 23 | 44 | 58
>
> table: some
> document | thing | else | now
> doc1 | 0 | 0 | 0
> doc2 | 1 | 0 | 0
> doc3 | 12 | 5 | 24
>
> I can build this manually (and for one element I have done so) by doing:
>
> <xsl:variable name="docs" select="collection('../../working/xml/docs.xml')"/>
> <xsl:template name="main">
> <table><head>seg by type</head>
> <row rend="label">
> <cell>document</cell>
> <cell>blort</cell>
> <cell>foo</cell>
> <cell>notBlort</cell>
> </row>
> <xsl:for-each select="$docs//p"> <!-- let's pretend p is the root element -->
> <row>
> <xsl:variable name="doc" select="@xml:id"/>
> <cell><xsl:value-of select="$doc"/></cell>
> <cell><xsl:value-of select="count(.//seg[@type='blort'])"/></cell>
> <cell><xsl:value-of select="count(.//seg[@type='foo'])"/></cell>
> <cell><xsl:value-of select="count(.//seg[@type='notBlort'])"/></cell>
> </row>
> </xsl:for-each>
> </table>
> </xsl:template>
>
> But that isn't really the point now is it?  I tried to use <xsl:key>
> but I ran into the problem of it not liking the collection() function
> as part of the match.
>
> What I want to do is be able to say for-each doc, build me a table of
> all the (let's pretend unknown) values of this attribute on this
> element.  So something like:
>
> <xsl:for-each select="$docs//p">
> <xsl:value-of select="my:function(other/@type, seg/@type, some/@name,
> new/@type)"/>
> </xsl:for-each>
>
> and without knowing the values of @type in advance it makes a table
> like above of them (using distinct-values()?) and counting their
> occurrences.
>
> This is a case where I know it must be possible, and I could just go
> and do it manually, (in reality there are about 10 elements with a
> number of attributes, with around 20 values each), but it just seems
> *wrong* to do it that way. ;-)
>
> Suggestions?
>
> Thanks,
>
> -James
>
>


-- 
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/

