
RE: [xsl] Re: Selecting consecutive elements using generate-id


Subject: RE: [xsl] Re: Selecting consecutive elements using generate-id
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 10 Jul 2008 09:32:46 +0100

Processing the XML that comes out of MS Word is not easy. One useful
technique is to start by filtering it down to something simpler and smaller
by cutting out all the bits you don't need - it just makes it so much easier
to see what's really going on.

It's certainly a lot easier to handle using XSLT 2.0, with the help of
xsl:for-each-group, especially the positional options (group-starting-with
and group-ending-with). However, a lot of problems are best tackled using
the sibling recursion technique - which essentially means that instead of
applying templates to all your children, you apply templates to your first
child and/or your immediately following sibling, passing parameters down the
line if needed.
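
As a rough illustration of sibling recursion in XSLT 1.0, the sketch below walks the children of a container one sibling at a time and wraps each run of consecutive items in a list. The element names body, para and item are hypothetical stand-ins for a simplified, pre-filtered input; they are not WordML names.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <body> holding a flat mix of <para> and <item>. -->

  <!-- Start the walk at the first child only, not at all children. -->
  <xsl:template match="body">
    <div>
      <xsl:apply-templates select="*[1]"/>
    </div>
  </xsl:template>

  <!-- Ordinary paragraph: output it, then step to the next sibling. -->
  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
    <xsl:apply-templates select="following-sibling::*[1]"/>
  </xsl:template>

  <!-- First item of a run: open the list, walk the run in "item" mode,
       then resume the main walk at the first sibling after the run. -->
  <xsl:template match="item">
    <ul>
      <xsl:apply-templates select="." mode="item"/>
    </ul>
    <xsl:apply-templates
        select="following-sibling::*[not(self::item)][1]"/>
  </xsl:template>

  <!-- Inside a run: output this item, then continue only if the
       immediately following sibling is also an item. -->
  <xsl:template match="item" mode="item">
    <li><xsl:value-of select="."/></li>
    <xsl:apply-templates
        select="following-sibling::*[1][self::item]" mode="item"/>
  </xsl:template>

</xsl:stylesheet>
```

The unnamed-mode item template is only ever reached for the first item of a run, because the main walk always jumps past the rest of the run. Nesting could be handled by passing a depth parameter down the "item" walk, which is where the "passing parameters down the line" part comes in.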

The generate-id technique in XSLT 1.0 is used to compare node identities; in
2.0 it can nearly always be replaced by the "is" operator.
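
A small sketch of the identity comparison, again assuming a hypothetical simplified input of para and item siblings under a body element. For each para it selects the following items whose nearest preceding para is that same node, which is the test generate-id is standing in for.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="body">
    <div><xsl:apply-templates select="para"/></div>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:value-of select="."/></p>
    <!-- Two node-sets hold the same single node exactly when their
         generated ids are equal. current() is the para being processed. -->
    <xsl:variable name="items" select="following-sibling::item
        [generate-id(preceding-sibling::para[1]) = generate-id(current())]"/>
    <!-- XSLT 2.0 equivalent of the predicate above:
         following-sibling::item[preceding-sibling::para[1] is current()] -->
    <xsl:if test="$items">
      <ul>
        <xsl:for-each select="$items">
          <li><xsl:value-of select="."/></li>
        </xsl:for-each>
      </ul>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Note that comparing the nodes with "=" instead would compare their string values, not their identities, so two different paragraphs with the same text would be treated as equal. Items that appear before the first para are not picked up by this sketch.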

I'm not entirely sure what you're trying to do in your example. In my
previous encounters with Word XML, I haven't attempted to use the wTabBefore
depths to identify structure, and it looks as if it could be tricky because
some of your list items seem to have wx:wTabBefore="280" while others have
wx:wTabBefore="285". I would have thought the style (ListBullet2, ListBullet3
etc) was a better guide.
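
One possible way to use the style as the guide in XSLT 2.0 is sketched below. It rests on several assumptions that are not confirmed by the thread: that the w prefix is bound to the Word 2003 WordprocessingML namespace, that every list paragraph has a style of the form ListBulletN with N giving the depth, and that nested lists should sit inside the li of the item they follow. The f:level function and the make-list template are hypothetical helpers introduced only for this example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:f="urn:example:local"
    exclude-result-prefixes="xs w f">

  <!-- Hypothetical helper: ListBullet2 gives 2, ListBullet3 gives 3.
       Fails for any style name that is not ListBullet plus digits. -->
  <xsl:function name="f:level" as="xs:integer">
    <xsl:param name="p" as="element(w:p)"/>
    <xsl:sequence select="xs:integer(
        substring-after($p/w:pPr/w:pStyle/@w:val, 'ListBullet'))"/>
  </xsl:function>

  <!-- Suppress stray text; paragraph text is output explicitly below. -->
  <xsl:template match="text()"/>

  <!-- Any container of paragraphs: split its children into runs of
       list paragraphs and runs of everything else. -->
  <xsl:template match="*[w:p]">
    <xsl:for-each-group select="*"
        group-adjacent="exists(w:pPr/w:listPr)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <xsl:call-template name="make-list">
            <xsl:with-param name="items" select="current-group()"/>
            <xsl:with-param name="level"
                select="min(current-group()/f:level(.))"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="w:p">
    <p><xsl:value-of select=".//w:t"/></p>
  </xsl:template>

  <!-- Each item at $level starts a group; anything deeper that follows
       it is handed to a recursive call and nested inside its li. -->
  <xsl:template name="make-list">
    <xsl:param name="items" as="element(w:p)*"/>
    <xsl:param name="level" as="xs:integer"/>
    <ul>
      <xsl:for-each-group select="$items"
          group-starting-with="w:p[f:level(.) = $level]">
        <li>
          <xsl:value-of select=".//w:t"/>
          <xsl:variable name="rest"
              select="current-group()[position() gt 1]"/>
          <xsl:if test="exists($rest)">
            <xsl:call-template name="make-list">
              <xsl:with-param name="items" select="$rest"/>
              <xsl:with-param name="level"
                  select="min($rest/f:level(.))"/>
            </xsl:call-template>
          </xsl:if>
        </li>
      </xsl:for-each-group>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

Because the depth comes from the style name, the 280/285 difference in wTabBefore does not matter here. On the quoted extract, the two ListBullet3 paragraphs should end up in a nested ul under the ListBullet2 item before them, with the following ListBullet2 paragraph returning to the outer list. A run that begins with a deeper item than the ones after it is not handled gracefully by this sketch.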

I think I would go for sibling recursion on this one. 

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Kelly Attrill [mailto:kellyattrill@xxxxxxxxx] 
> Sent: 10 July 2008 02:44
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Re: Selecting consecutive elements using generate-id
> 
> Hi, I am having some trouble selecting a consecutive set of 
> elements in order to build a hierarchical result.
> 
> I have looked at a number of examples and attempted the following:
> - recursively calling templates to add a list element or a 
> new list by just checking the first following sibling. When 
> the first sibling has a listPr element then I enter either an 
> add-element or begin-new-list template based on the tabBefore 
> numbers being higher/equal to the current node. This works 
> well, however I am unable to get the final list elements from 
> the first list, the resulting structure should look like a 
> html list e.g.
> <ul>
>      <li></li>
>      <li></li>
>      <ul>
>           <li></li>
>           <li></li>
>      </ul>
>      <li></li> <<< This is the element that I am not able to 
> get using the method described above.
> </ul>
> 
> I would like to switch to a for-each select statement that 
> selects all the list elements from the start of the list to 
> the end of the list, and then do this again however specify 
> the start and end as the nested list based on the tabBefore 
> depths. I am able to identify the start of the list as:
> <xsl:template match="w:p">
> <xsl:choose>
> <xsl:when test="(w:pPr/w:listPr) and
> (not(preceding-sibling::w:p[1]/w:pPr/w:listPr))">
> ...
> - at this point i call the recursive templates, but i would 
> like instead to do something like I have seen on this list using
> generate-id:
>     <xsl:for-each
> select="following-sibling::w:p[generate-id(following-sibling::
> w:p[preceding-sibling::w:p
> = $this]) and
>             ($p_listItem/../following-sibling::w:p[w:pPr/w:listPr])">
> 
> this doesn't work of course but is meant to illustrate that I 
> don't know how to select only the following siblings between 
> the w:p element with "The first element in the list" and the 
> w:p element containing "The third element in the first list" 
> - based on the example xml below.
> 
> I have been working on this for a number of days and I think 
> I am at a sticking point - hopefully somebody can help me get 
> back on track again.
> 
> Thanks!
> 
> My base XML is from WordML, below is an extract:
>   <w:p wsp:rsidR="007A561C" wsp:rsidRDefault="007A561C" 
> wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="NormalBold"/>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>Text for a paragraph</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet2"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="285"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>The first element in the list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet2"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="285"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>The second element in the list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet2"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="285"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>The second element in the list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet3"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="990"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>Nest List - The first 
> element in the list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet3"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="990"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>Nest list - the second 
> element in the list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet2"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="280"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>The third element in the 
> first list</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="007A561C" wsp:rsidP="003B62CD"/>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="007A561C" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="NormalBold"/>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>A new paragraph</w:t>
>                         </w:r>
>                     </w:p>
>                     <w:p wsp:rsidR="007A561C"
> wsp:rsidRDefault="003B62CD" wsp:rsidP="003B62CD">
>                         <w:pPr>
>                             <w:pStyle w:val="ListBullet2"/>
>                             <w:listPr>
>                                 <wx:t wx:val="." wx:wTabBefore="285"
> wx:wTabAfter="255"/>
>                                 <wx:font wx:val="Symbol"/>
>                             </w:listPr>
>                         </w:pPr>
>                         <w:r>
>                             <w:t>First item of a new list.</w:t>
>                         </w:r>
> </w:p>

