
[xsl] xpath key problems


Subject: [xsl] xpath key problems
From: "Birnbaum, David J" <djbpitt@xxxxxxxx>
Date: Sun, 21 Nov 2010 13:14:50 -0500

Dear XSLT List,

I'd be grateful for advice about a problem I'm having with the use of keys.
I've pasted a simplified xslt stylesheet below; in case it gets damaged in
email transit, it's available on-line at:

http://clover.slavic.pitt.edu/~djb/keys/folio-line-test.xsl

The xml document it is supposed to transform is too large to paste here (and
it would be difficult to simplify it enough to make it sufficiently brief for
pasting into a list posting), but it's at:

http://clover.slavic.pitt.edu/~djb/keys/edition.xml

It's in early (medieval) Cyrillic, so the text may look like gibberish, but
that incomprehensibility shouldn't interfere with troubleshooting the problem
described below.

The document is an interlinear collation of three manuscripts in parallel,
e.g.,

<block ref="blah">
<hm280>parallel text from ms hm280</hm280>
<hm281>parallel text from ms hm281</hm281>
<hm282>parallel text from ms hm282</hm282>
</block>
<!-- more blocks -->

The lines of text have empty milestone line break (lb) and page break (pb)
tags; the latter have @folio attributes identifying the page number. The task
is to retrieve the page identifier (@folio value) for the current line in the
interlinear edition, and, eventually, also to retrieve the line number. Where a
line in the interlinear edition spans two manuscript lines or pages, I just
need to retrieve the first of them. The specific problems are described in xml
comments within the xslt stylesheet below. I'd be grateful for any advice!

Sincerely,

David (djbpitt@xxxxxxxx)

--[xslt stylesheet is pasted below]--

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
    <xsl:output method="xml" indent="yes"/>
    <!-- Two problems, the first of which is fatal:
        1. Doesn't work; key returns nothing
            Change @use to "." and it returns the current element
            Change @use to any function (e.g., string-length(.) or name(.)) and it returns nothing (?)
            Same results with . or current() or self::node() or nothing; does the choice matter here?
        2. Very slow
            Is this inevitable with the long axes or can it be optimized (Saxon HE 9.2.1.2)?
            Likely to get *much* slower when we also count lines (see below)
    -->
    <!-- desired result: @folio value for most immediately preceding <pb> in same ms (specific hm28x)
        1. get all preceding <pb> elements
        2. restrict the set to those in the same ms
        3. take the last (immediately preceding) one of those
    -->
    <!-- interim simplifications
        Must handle first page of each ms specially, since there is no preceding <pb>
        Contents of each ms line require additional templates
        Eventually must also count <lb> elements between that most recent preceding <pb> and current location;
            full pointer will be something like 312r14, where 312r is the page and 14 is the line on the page
    -->
    <xsl:key name="msline-pb" match="hm281 | hm280 | hm282"
        use="preceding::pb[parent::*/name() eq name()][1]/@folio"/>
    <xsl:template match="/">
        <output>
            <xsl:apply-templates/>
        </output>
    </xsl:template>
    <xsl:template match="block">
        <block>
            <folio>
                <xsl:value-of select="@ref"/>
            </folio>
            <xsl:apply-templates/>
        </block>
    </xsl:template>
    <xsl:template match="hm281 | hm280 | hm282">
        <ms>
            <identifier>
                <xsl:value-of select="name()"/>
            </identifier>
            <text>
                <xsl:apply-templates/>
            </text>
            <pointer>
                <xsl:value-of select="key('msline-pb',.)"/>
            </pointer>
        </ms>
    </xsl:template>
</xsl:stylesheet>

--[end of xslt stylesheet]--
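
For comparison, the three steps listed under "desired result" in the comments
above can also be written as a plain path expression evaluated in the template,
with no key involved. What follows is only a minimal sketch, not a tested
solution: it assumes that every <pb> is a direct child of its hm28x element,
and the variable name "ms" is made up for the illustration. Since
key('name', value) returns the nodes whose @use value equals the value passed
in, the sketch sidesteps the key rather than repairing it, and it does nothing
about the speed of the long preceding:: axis.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
    <xsl:output method="xml" indent="yes"/>
    <!-- Visit only the manuscript lines, wrapped in a single root element -->
    <xsl:template match="/">
        <output>
            <xsl:apply-templates select="//hm280 | //hm281 | //hm282"/>
        </output>
    </xsl:template>
    <xsl:template match="hm280 | hm281 | hm282">
        <!-- Hypothetical variable: element name of the current ms line,
             captured here because the context changes inside the predicate -->
        <xsl:variable name="ms" select="name()"/>
        <pointer>
            <!-- Step 1: all preceding <pb> elements
                 Step 2: keep those whose parent has the same name as this line
                         (assumes <pb> is a direct child of hm28x)
                 Step 3: [1] on a reverse axis selects the nearest one
                 Yields an empty string when there is no preceding <pb> -->
            <xsl:value-of select="preceding::pb[name(..) eq $ms][1]/@folio"/>
        </pointer>
    </xsl:template>
</xsl:stylesheet>
```

Lines on the first page of each ms get an empty <pointer> here, since they
have no preceding <pb>; that case would still need the special handling
mentioned in the comments.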

