Re: [xsl] Transforming XML Blockquotes - Mixed Content - XSLT 2.0 Solution


Subject: Re: [xsl] Transforming XML Blockquotes - Mixed Content - XSLT 2.0 Solution
From: JBryant@xxxxxxxxx
Date: Thu, 14 Apr 2005 11:47:10 -0500

Here's an XSLT 2.0 solution to the problem. It uses two stylesheets 
(though they could be combined into one with a bit more effort). The 
algorithm goes like this:
First, chunk up all the bits, which turns this into a grouping problem.
Second, solve the grouping problem.

Given the following XML file:

<doc>
  <paragraph num="1">Yadda Yadda Yadda <italic>Italic Yadda</italic> Yadda: <blockquote>Blah Blah Blah Blah</blockquote> Yackity Yack Yack</paragraph>
</doc>

Use this XSL file:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <x>
      <xsl:apply-templates/>
    </x>
  </xsl:template>

  <xsl:template match="paragraph">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="paragraph/text()">
    <p num="{../@num}" group="{count(preceding-sibling::blockquote)}"><xsl:value-of select="."/></p>
  </xsl:template>

  <xsl:template match="blockquote">
    <blockquote><xsl:apply-templates/></blockquote>
  </xsl:template>

  <xsl:template match="italic">
    <p num="{../@num}" group="{count(preceding-sibling::blockquote)}"><span style="font-style:italic"><xsl:apply-templates/></span></p>
  </xsl:template>

</xsl:stylesheet>

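It creates the chunks shown below. Each p gets a group attribute equal to 
the number of blockquotes that precede it in its paragraph, so chunks on 
opposite sides of a blockquote end up with different group values: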

<?xml version="1.0" encoding="UTF-8"?>
<x>
  <p num="1" group="0">Yadda Yadda Yadda </p>
  <p num="1" group="0"><span style="font-style:italic">Italic Yadda</span></p>
  <p num="1" group="0"> Yadda: </p>
  <blockquote>Blah Blah Blah Blah</blockquote>
  <p num="1" group="1"> Yackity Yack Yack</p>
</x>

Now it's a grouping problem, which can be solved in XSLT 2.0 with this 
stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:key name="mixed" match="p" use="@group"/>

  <xsl:template match="x">
    <html>
      <head>
        <title>Paragraph Chunking Test</title>
      </head>
      <body>
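        <xsl:for-each-group select="p" group-by="@group">
          <p>
            <!-- current-group() holds every chunk that shares this group value -->
            <xsl:apply-templates select="current-group()/node()"/>
          </p>
          <!-- emit only the blockquote that directly follows this group;
               selecting every following blockquote would repeat them
               when the input has more than one -->
          <xsl:apply-templates select="current-group()[last()]/following-sibling::*[1][self::blockquote]"/>
        </xsl:for-each-group>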
      </body>
    </html>
  </xsl:template>

  <xsl:template match="p"/>

  <xsl:template match="blockquote">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="span">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>

which yields:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 
      <title>Paragraph Chunking Test</title>
   </head>
   <body>
      <p>Yadda Yadda Yadda <span style="font-style:italic">Italic 
Yadda</span> Yadda: 
      </p>
      <blockquote>Blah Blah Blah Blah</blockquote>
      <p> Yackity Yack Yack</p>
   </body>
</html>
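
As for combining the two stylesheets: one way that might work is to skip 
the intermediate chunk document and group the children of each paragraph 
directly with group-adjacent. What follows is an untested sketch, not the 
approach used above. It assumes the element names from the sample input 
(doc, paragraph, italic, blockquote) and HTML output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <xsl:template match="/doc">
    <html>
      <head>
        <title>Paragraph Chunking Test</title>
      </head>
      <body>
        <xsl:apply-templates select="paragraph"/>
      </body>
    </html>
  </xsl:template>

  <!-- Split the children of a paragraph into runs of adjacent nodes:
       the grouping key is true for blockquotes, false for everything else -->
  <xsl:template match="paragraph">
    <xsl:for-each-group select="node()"
        group-adjacent="boolean(self::blockquote)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <!-- a run of one or more blockquotes goes out as is -->
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- a run of text and inline elements becomes one p -->
          <p>
            <xsl:apply-templates select="current-group()"/>
          </p>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="blockquote">
    <blockquote><xsl:apply-templates/></blockquote>
  </xsl:template>

  <xsl:template match="italic">
    <span style="font-style:italic"><xsl:apply-templates/></span>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input this should give the same three blocks (p, 
blockquote, p). One caveat: whitespace-only text next to a blockquote 
(for example a line break between the closing blockquote tag and the 
closing paragraph tag) would come out as an empty p, so real input may 
need a filter for that.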

I have not yet gotten an XSLT 1.0 grouping solution for this problem (I 
don't have much time to spend on this issue). I am sending along the XSLT 
2.0 solution (which took me perhaps 10 minutes to do - I love 
xsl:for-each-group) just to show one workable (IMHO) way to solve the 
problem. James Fuller has shown us another, but I think there's value in 
multiple approaches.
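
For anyone limited to XSLT 1.0, the second pass can probably be done with 
Muenchian grouping. The following is an untested sketch that runs against 
the chunk document (root element x) shown earlier. The key name 
chunks-by-group is made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- index the chunks by their group value (hypothetical key name) -->
  <xsl:key name="chunks-by-group" match="p" use="@group"/>

  <xsl:template match="x">
    <html>
      <head>
        <title>Paragraph Chunking Test</title>
      </head>
      <body>
        <!-- visit only the first chunk of each group -->
        <xsl:for-each
            select="p[generate-id() = generate-id(key('chunks-by-group', @group)[1])]">
          <p>
            <!-- output the content of every chunk in this group -->
            <xsl:for-each select="key('chunks-by-group', @group)">
              <xsl:apply-templates/>
            </xsl:for-each>
          </p>
          <!-- then the blockquote, if any, that directly follows the group -->
          <xsl:apply-templates
              select="key('chunks-by-group', @group)[last()]/following-sibling::*[1][self::blockquote]"/>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="blockquote | span">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Because the key uses only @group, chunks from different paragraphs that 
share a group value would be merged. With more than one paragraph the key 
would likely need to combine @num and @group, for example 
use="concat(@num, '|', @group)". A blockquote that opens a paragraph, or 
two blockquotes in a row, would also need extra handling.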

I tested it all with Saxon 8.4, by the way.

Jay Bryant
Bryant Communication Services
(presently consulting at Synergistic Solution Technologies)

