Parsing address data from PAR and BREAK
Greetings,
I'm relatively new to XSLT. I need to extract legacy data from an XML representation of rich-text, and am having difficulty parsing around the <break> element. Specifically, I'm trying to reliably parse address information from this:
...
<tablecell borderwidth='0px'>
<par def='23'><run><font size='9pt' name='Arial'
truetype='false' familyid='10'/>
123 E. Main Street<break/>Anytown, ST 12355<break/>USA</run>
<run><font size='9pt' style='bold' name='Arial' truetype='false'
familyid='10' color='navy'/>
</run>
</par>
</tablecell>
...
...to this:
<address>123 E. Main Street</address>
<city>Anytown</city>
<state>ST</state>
<zipcode>12355</zipcode>
<country>USA</country>
I'm using an XSLT 2.0 engine. I've been poking around trying to find how this might be done, but am coming up short. Any suggestions will be much appreciated.
Thanks,
Karl Forsyth
Re: Parsing address data from PAR and BREAK
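You can use something like the stylesheet below.
Note, however, that it is very sensitive to the input format: it assumes that the first three pieces of text inside par are, in order, the street line, a "City, ST ZIP" line, and the country. You should check whether the other tablecell elements in your input follow the same layout.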
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="tablecell">
    <result>
      <!-- Non-blank text nodes in document order: street, "City, ST ZIP", country -->
      <xsl:variable name="values" select="par//text()[normalize-space()]"/>
      <!-- The part of the second line after the comma, e.g. "ST 12355";
           splitting on the comma keeps multi-word city names intact -->
      <xsl:variable name="stateZip" select="normalize-space(substring-after($values[2], ','))"/>
      <address><xsl:value-of select="normalize-space($values[1])"/></address>
      <city><xsl:value-of select="normalize-space(substring-before($values[2], ','))"/></city>
      <state><xsl:value-of select="substring-before($stateZip, ' ')"/></state>
      <zipcode><xsl:value-of select="substring-after($stateZip, ' ')"/></zipcode>
      <country><xsl:value-of select="normalize-space($values[3])"/></country>
    </result>
  </xsl:template>
</xsl:stylesheet>
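Since you are on XSLT 2.0, another option is to split the "City, ST ZIP" line with xsl:analyze-string, which makes the assumed format explicit and lets you catch lines that do not fit it. This is only a sketch under assumptions of mine: the pattern (two uppercase letters for the state, a 5-digit ZIP with an optional -4 suffix) and the unparsed fallback element are hypothetical and not taken from your data, so adjust them to what your documents actually contain.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>

  <xsl:template match="tablecell">
    <!-- Assumption: the non-blank text nodes are street, "City, ST ZIP", country, in that order -->
    <xsl:variable name="lines" select="par//text()[normalize-space()]"/>
    <result>
      <address><xsl:value-of select="normalize-space($lines[1])"/></address>
      <!-- The regex attribute is an attribute value template, so literal braces are doubled.
           Assumed pattern: city, comma, two-letter state, 5-digit ZIP with optional -4 suffix -->
      <xsl:analyze-string select="normalize-space($lines[2])"
                          regex="^(.+?),\s*([A-Z]{{2}})\s+(\d{{5}}(-\d{{4}})?)$">
        <xsl:matching-substring>
          <city><xsl:value-of select="regex-group(1)"/></city>
          <state><xsl:value-of select="regex-group(2)"/></state>
          <zipcode><xsl:value-of select="regex-group(3)"/></zipcode>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- Hypothetical fallback element: keeps lines that do not fit the pattern visible -->
          <unparsed><xsl:value-of select="."/></unparsed>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
      <country><xsl:value-of select="normalize-space($lines[3])"/></country>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

For the sample in your post this should give the same address, city, state, zipcode and country elements. Because the pattern is anchored at both ends, a second line in any other shape ends up whole in the unparsed element, which makes it easy to find the cells that need special handling.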
Regards,
George
George Cristian Bina