Find text, get XPath of containing element

Posted: Fri Jan 30, 2009 4:59 am
by kjforsyth
Greetings,

I need to obtain the XPath of each element that contains a particular search string.

Here's a sample XML file:

<table>
  <tablerow>
    <tablecell>
      <par>Name of employer and location:</par>
    </tablecell>
    <tablecell>
      <par>Jack Spratt Industries</par>
      <par>123 E. West Road</par>
    </tablecell>
  </tablerow>
  <tablerow>
    <tablecell>
      <par>Name of employer and location:</par>
    </tablecell>
    <tablecell>
      <par>Red Robin Amalgamated</par>
      <par>123 W. East Road</par>
    </tablecell>
  </tablerow>
  <tablerow>
    <tablecell>
      <par>Name of employer and location:</par>
    </tablecell>
    <tablecell>
      <par>Wicked Witch Ltd.</par>
      <par>123 Yellow Brick Road</par>
    </tablecell>
  </tablerow>
</table>

I want to be able to search for the string "employer and location" and get back the XPath of each of the three paragraphs that contain that string.

I would be most grateful for any help you can offer here.

Best regards,

Karl Forsyth

Re: Find text, get XPath of containing element

Posted: Fri Jan 30, 2009 1:41 pm
by george
An element can be matched by more than one XPath expression. The ISO Schematron implementation generates a couple of templates that compute an XPath location for a node. Below is an XSLT 2.0 stylesheet that reuses that idea; on your example it gives:

<?xml version="1.0" encoding="UTF-8"?>
<result>
  <hit>/table/tablerow[1]/tablecell[1]/par[1]</hit>
  <hit>/table/tablerow[2]/tablecell[1]/par[1]</hit>
  <hit>/table/tablerow[3]/tablecell[1]/par[1]</hit>
</result>

The stylesheet:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>

  <!-- Index every element by whether its own text contains the search string -->
  <xsl:key name="match" match="*" use="contains(string-join(text(), ''), 'employer and location')"/>

  <xsl:template match="/">
    <result>
      <!-- Emit one hit per element whose key value is true -->
      <xsl:for-each select="key('match', true())">
        <hit><xsl:apply-templates mode="getXPath" select="."/></hit>
      </xsl:for-each>
    </result>
  </xsl:template>

  <!-- Build the path one step at a time, from the root element down to the node -->
  <xsl:template match="node() | @*" mode="getXPath">
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:text>/</xsl:text>
      <xsl:value-of select="name(.)"/>
      <!-- The root element gets no index; every other step gets its position among same-named siblings -->
      <xsl:if test="parent::*">
        <xsl:text>[</xsl:text>
        <xsl:value-of select="count(preceding-sibling::*[name(.)=name(current())])+1"/>
        <xsl:text>]</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

The only XSLT 2.0 feature used is the string-join function in the key. If you find the elements in some other way, you can change the version to 1.0 and use the stylesheet with an XSLT 1.0 processor.
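
For illustration, here is a rough sketch of what such an XSLT 1.0 variant might look like. It drops the key and selects the elements directly with an XPath 1.0 predicate; the getXPath template is unchanged. Note that the selection expression is my own assumption, not part of the stylesheet above, and it is not strictly equivalent: it tests each child text node on its own instead of the joined text, so a search string split across several text nodes (for example by an inline child element) would not be found. On the sample file, where each par holds a single text node, it should select the same three paragraphs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes"/>

  <xsl:template match="/">
    <result>
      <!-- Assumption: an element counts as a hit when any one of its
           own child text nodes contains the search string -->
      <xsl:for-each select="//*[text()[contains(., 'employer and location')]]">
        <hit><xsl:apply-templates mode="getXPath" select="."/></hit>
      </xsl:for-each>
    </result>
  </xsl:template>

  <!-- Same path-building template as in the 2.0 stylesheet -->
  <xsl:template match="node() | @*" mode="getXPath">
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:text>/</xsl:text>
      <xsl:value-of select="name(.)"/>
      <xsl:if test="parent::*">
        <xsl:text>[</xsl:text>
        <xsl:value-of select="count(preceding-sibling::*[name(.)=name(current())])+1"/>
        <xsl:text>]</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

To search for a different string, change the literal in the contains() call; passing it in as a stylesheet parameter would be the obvious next step.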

Regards,
George

Re: Find text, get XPath of containing element

Posted: Sat Jan 31, 2009 2:32 am
by kjforsyth
That works perfectly. You da man! Thank you.