Parsing address data from PAR and BREAK

Posted: Tue Jan 27, 2009 12:33 am
by kjforsyth
Greetings,

I'm relatively new to XSLT. I need to extract legacy data from an XML representation of rich-text, and am having difficulty parsing around the <break> element. Specifically, I'm trying to reliably parse address information from this:

...
<tablecell borderwidth='0px'>
<par def='23'><run><font size='9pt' name='Arial'
truetype='false' familyid='10'/>
123 E. Main Street<break/>Anytown, ST 12355<break/>USA</run>
<run><font size='9pt' style='bold' name='Arial' truetype='false'
familyid='10' color='navy'/>
</run>
</par>
</tablecell>
...

...to this:

<address>123 E. Main Street</address>
<city>Anytown</city>
<state>ST</state>
<zipcode>12355</zipcode>
<country>USA</country>

I'm using an XSLT 2.0 engine. I've been poking around trying to find how this might be done, but am coming up short. Any suggestions will be much appreciated.

Thanks,

Karl Forsyth

Re: Parsing address data from PAR and BREAK

Posted: Tue Jan 27, 2009 12:52 am
by george
You can use something like the following:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="tablecell">
    <result>
      <!-- Text nodes under par, in document order: street, "City, ST ZIP", country -->
      <xsl:variable name="values" select="par//text()"/>
      <address><xsl:value-of select="normalize-space($values[1])"/></address>
      <!-- City is everything before the comma on the second line -->
      <city><xsl:value-of select="substring-before($values[2], ',')"/></city>
      <!-- State and zip are the second and third space-separated tokens -->
      <state><xsl:value-of select="substring-before(substring-after($values[2], ' '), ' ')"/></state>
      <zipcode><xsl:value-of select="substring-after(substring-after($values[2], ' '), ' ')"/></zipcode>
      <country><xsl:value-of select="$values[3]"/></country>
    </result>
  </xsl:template>
</xsl:stylesheet>
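Note, however, that this is very sensitive to the input format: it assumes the first three text nodes under par hold the street, the "City, ST ZIP" line, and the country, and a city name containing a space (for example "New York") would throw off the state and zip code. You should check whether the other tablecell elements follow the same rules.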
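
If the layout does vary a little, one possible way to loosen the dependence on spacing is to split the "City, ST ZIP" line with a regular expression via xsl:analyze-string, which is available in XSLT 2.0. The following is only a sketch under some assumptions: the second non-blank text node always has the form "City, ST ZIP", the state and zip code are each a single token with no spaces, and the variable name lines and the fallback element unparsed are made up here for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="tablecell">
    <result>
      <!-- Skip whitespace-only text nodes and normalize the rest
           (the variable name "lines" is an illustrative choice) -->
      <xsl:variable name="lines"
          select="for $t in par//text()[normalize-space()] return normalize-space($t)"/>
      <address><xsl:value-of select="$lines[1]"/></address>
      <!-- Assumed layout: "City, ST ZIP"; the city may contain spaces,
           the state and zip code may not -->
      <xsl:analyze-string select="string($lines[2])" regex="^(.+),\s*(\S+)\s+(\S+)$">
        <xsl:matching-substring>
          <city><xsl:value-of select="regex-group(1)"/></city>
          <state><xsl:value-of select="regex-group(2)"/></state>
          <zipcode><xsl:value-of select="regex-group(3)"/></zipcode>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- Hypothetical fallback so unexpected lines are not silently dropped -->
          <unparsed><xsl:value-of select="."/></unparsed>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
      <country><xsl:value-of select="$lines[3]"/></country>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

For the sample tablecell in the question this should give the same five elements as before, and a city such as "New York" should stay in one piece. It would still not cope with a postal code that contains a space, so it is worth checking the unparsed output against the real data.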

Regards,
George

Re: Parsing address data from PAR and BREAK

Posted: Tue Jan 27, 2009 10:09 pm
by kjforsyth
This is exactly what I needed. Thank you so much.

Karl Forsyth