
Re: [xsl] grouping xhtml title with first sibling


Subject: Re: [xsl] grouping xhtml title with first sibling
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 18 Jan 2011 12:36:20 +0000

You don't need dynamic XPath evaluation for this.

The basic structure you want can be achieved with

<xsl:template name="group">
<!-- the sequence of sibling nodes to be grouped at this level -->
<xsl:param name="nodes" as="node()*"/>
<xsl:param name="level" as="xs:integer"/>
<!-- start a new group at each heading of the current level (h1, h2, ...) -->
<xsl:for-each-group select="$nodes" group-starting-with="*[local-name() = concat('h', $level)]">
<div>
<xsl:call-template name="group">
<xsl:with-param name="nodes" select="current-group()"/>
<xsl:with-param name="level" select="$level + 1"/>
</xsl:call-template>
</div>
</xsl:for-each-group>
</xsl:template>


and then start the recursion off with level="1". You'll have to modify this slightly to avoid getting a div element for the elements that precede the first hN element at each level.
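
As a rough sketch of how the pieces might fit together (not necessarily what the poster had in mind), the stylesheet below starts the recursion from body with level 1, stops once it is past h6, and emits a div only for groups that really begin with a heading of the current level. The identity template, the template matching body, the cut-off at level 6 and the xsl:choose around the div are all assumptions added for illustration. Note that it produces nested divs, one per heading level, which is not the flat "title plus first sibling" grouping shown in the desired output of the quoted question.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/1999/xhtml"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- Identity copy for everything that is not regrouped (assumption). -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Start the recursion at level 1 on the element children of body. -->
  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:call-template name="group">
        <xsl:with-param name="nodes" select="*"/>
        <xsl:with-param name="level" select="1"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:template>

  <xsl:template name="group">
    <xsl:param name="nodes" as="node()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:choose>
      <!-- Past h6 there is nothing left to split on: copy the nodes. -->
      <xsl:when test="$level gt 6">
        <xsl:apply-templates select="$nodes"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each-group select="$nodes"
            group-starting-with="*[local-name() = concat('h', $level)]">
          <xsl:choose>
            <!-- The context item is the first node of the current group. -->
            <xsl:when test="local-name() = concat('h', $level)">
              <div>
                <xsl:call-template name="group">
                  <xsl:with-param name="nodes" select="current-group()"/>
                  <xsl:with-param name="level" select="$level + 1"/>
                </xsl:call-template>
              </div>
            </xsl:when>
            <!-- Nodes preceding the first hN at this level: no wrapper. -->
            <xsl:otherwise>
              <xsl:call-template name="group">
                <xsl:with-param name="nodes" select="current-group()"/>
                <xsl:with-param name="level" select="$level + 1"/>
              </xsl:call-template>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The recursion ends either when a group is empty (xsl:for-each-group over an empty sequence does nothing) or when the level exceeds 6. If only h1 to h3 matter, the cut-off could be lowered accordingly.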

Michael Kay
Saxonica




On 18/01/2011 11:55, Matthieu Ricaud-Dussarget wrote:
Hi All,

This is a grouping problem. I'm not sure I'm going in the right direction, so any help would be welcome.

My input file is XHTML and contains title elements h1, h2, h3.
(Let's say they are all directly under <body>, or more generally that they are all siblings.)


I'd like to group each of them with its first following sibling element, with these restrictions:
- if the first sibling is itself a title element (h1, h2 or h3), then continue and group with the next sibling.
- if the next sibling is a title of a higher level (h1 > h2 > h3), then stop grouping and start a new group from this next sibling.
- if the next sibling has @class='foo', then don't perform the grouping (I would actually like to be able to perform any XPath test here).


This is my unit-test sample:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>unit-test</title>
</head>
<body>
<h1>title1</h1>
<p>para1</p>
<hr/>
<div><img src="img1.jpg" alt=""/></div>
<table><tr><td>table1</td></tr></table>
<h2>title2</h2>
<p>para2</p>
<h3>title3</h3>
<p>para3</p>
<p>para4</p>
<h2>title4</h2>
<h3>title5</h3>
<p>para5</p>
<table><tr><td>table2</td></tr></table>
<h2>title6</h2>
<h1>title7</h1>
<p>para6</p>
<p>para7</p>
<h2>title8</h2>
<p class="foo">para8</p>
<p>para9</p>
<h1>title9</h1>
<h2>title10</h2>
<h3>title11</h3>
<p>para10</p>
</body>
</html>

The desired output is:
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>unit-test</title>
</head>
<body>
<div class="group">
<h1>title1</h1>
<p>para1</p>
</div>
<hr/>
<div><img src="img1.jpg" alt=""/></div>
<table><tr><td>table1</td></tr></table>
<div class="group">
<h2>title2</h2>
<p>para2</p>
</div>
<div class="group">
<h3>title3</h3>
<p>para3</p>
</div>
<p>para4</p>
<div class="group">
<h2>title4</h2>
<h3>title5</h3>
<p>para5</p>
</div>
<table><tr><td>table2</td></tr></table>
<h2>title6</h2>
<div class="group">
<h1>title7</h1>
<p>para6</p>
</div>
<p>para7</p>
<h2>title8</h2>
<p class="foo">para8</p>
<p>para9</p>
<div class="group">
<h1>title9</h1>
<h2>title10</h2>
<h3>title11</h3>
<p>para10</p>
</div>
</body>
</html>

I would actually like this to work with any of h1, h2, ..., h6.
For this purpose I added a parameter to my XSLT:
<xsl:param name="elements" select="'h1,h2,h3'"/>

That means I then need an evaluate function, saxon:evaluate() (I'm using Saxon 9 for this).
If I remember correctly, I tried group-adjacent first but it didn't work, so I moved to another method.
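
A comma-separated parameter like this can usually be tested without dynamic evaluation: tokenize it once into a sequence of names and compare local-name() against that sequence. The sketch below is only an illustration. The function name f:is-title, its namespace URI and the variable name title-names are hypothetical, and the text report it prints is there only to make the three tests visible. They stand in for the three expressions that the stylesheet quoted below assembles as strings ($direct-concerned, $not-following-sibling and $uncopy).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <xsl:param name="elements" select="'h1,h2,h3'"/>

  <!-- The parameter split into a sequence of names: 'h1', 'h2', 'h3'. -->
  <xsl:variable name="title-names" as="xs:string*"
      select="tokenize($elements, ',')"/>

  <!-- True when $e exists and its local name is one of the title names.
       For an empty $e, local-name() returns '' and the test is false. -->
  <xsl:function name="f:is-title" as="xs:boolean">
    <xsl:param name="e" as="element()?"/>
    <xsl:sequence select="local-name($e) = $title-names"/>
  </xsl:function>

  <!-- For each child of body, print its name followed by:
       is it a title / is the next sibling not a title /
       is the previous sibling a title. -->
  <xsl:template match="/">
    <xsl:for-each select="html/body/*">
      <xsl:value-of separator=" "
          select="name(),
                  f:is-title(.),
                  not(f:is-title(following-sibling::*[1])),
                  f:is-title(preceding-sibling::*[1])"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the general comparison = is true when any item of the sequence matches, no "or" chain has to be built. That also sidesteps a precedence trap in a string-built condition of the form "a or b or c and not(...)", where "and" binds more tightly than "or" unless the first part is parenthesised.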


This is my XSLT:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:saxon="http://saxon.sf.net/"
xpath-default-namespace="http://www.w3.org/1999/xhtml" xmlns:h="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" indent="yes"/>


<xsl:param name="debug" select="'no'"/>
<xsl:param name="verbose" select="'yes'"/>
<xsl:param name="elements" select="'h1,h2,h3'"/>

<xsl:variable name="direct-concerned">
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>self::</xsl:text>
<xsl:value-of select="."/>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>

<xsl:variable name="not-following-sibling">
<xsl:text>not(</xsl:text>
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>name(following-sibling::*[1])='</xsl:text>
<xsl:value-of select="."/>
<xsl:text>'</xsl:text>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
<xsl:text>)</xsl:text>
</xsl:variable>

<xsl:variable name="concerned" select="concat( $direct-concerned, ' and ', $not-following-sibling )"/>

<xsl:variable name="uncopy">
<xsl:for-each select="tokenize($elements,',')">
<xsl:text>preceding-sibling::*[1][self::</xsl:text>
<xsl:value-of select="."/>
<xsl:text>]</xsl:text>
<xsl:if test="not(position()=last())">
<xsl:text> or </xsl:text>
</xsl:if>
</xsl:for-each>
</xsl:variable>

<xsl:template match="/">
<xsl:if test="$verbose='yes'">
<xsl:message>direct-concerned test=<xsl:value-of select="$direct-concerned"/></xsl:message>
<xsl:message>concerned test=<xsl:value-of select="$concerned"/></xsl:message>
<xsl:message>uncopy test=<xsl:value-of select="$uncopy"/></xsl:message>
</xsl:if>
<xsl:if test="$debug!='yes'">
<xsl:apply-templates/>
</xsl:if>
</xsl:template>


<xsl:template match="* | node() | @*" >
<xsl:param name="copy" select="false()"/>
<xsl:choose>
<xsl:when test="$copy">
<xsl:copy>
<xsl:apply-templates select="* | node() | @*" />
</xsl:copy>
</xsl:when>
<xsl:when test="saxon:evaluate($uncopy)"/>
<xsl:when test="saxon:evaluate($direct-concerned)">
<xsl:element name="div" namespace="http://www.w3.org/1999/xhtml">
<xsl:attribute name="class">group</xsl:attribute>
<xsl:apply-templates select="self::* | following-sibling::*[1]">
<xsl:with-param name="copy" select="true()"/>
</xsl:apply-templates>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="* | node() | @*" />
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>

I don't get a good result: some paragraphs disappear, and the grouping doesn't follow exactly the rules above.
Before I continue debugging this, I'd like to know whether there is another way of looking at the problem.


Thanks in advance for any light you can shed,

Matthieu.

PS: To tell you everything about this project: ideally I would like the list of title elements concerned to be configurable somewhere, for example in a node-set variable like this:
<xsl:variable>
<title match="h1[@class='bar']" level="1"/>
<title match="h1[span]" level="1"/>
<title match="h2" level="2"/>
<title match="h3[not(@class)]" level="3"/>
</xsl:variable>
But I know, that's a greedy request. Maybe one day.

