String parsing and matching

sandeepm
Posts: 10
Joined: Tue Feb 24, 2009 1:54 pm

Post by sandeepm »

Hello All,

I want to convert the following xml

Code: Select all


<root>
<name>test1,value1|test2,value2|test3,value3|test4,value4|</name>
</root>
to

Code: Select all


<root>
<test1>value1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>value2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>value3</test3>
</some-node>
<test4>value4</test4>
</root>

As you can see, the nodes test1, test2, etc. are already present in the output XML at arbitrary positions, so I first need to locate them in the output file and then replace their values with value1, value2, etc.

Could someone suggest how I should go about parsing this XML and producing the desired output with XSLT?

Many thanks in advance.
Sandeep Mestry
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: String parsing and matching

Post by george »

It is not very clear to me how you want to arrive at that output from the given source document. However, the example below may help you. Suppose you want as output

Code: Select all


<?xml version="1.0" encoding="UTF-8"?>
<result>
<test1>value1</test1>
<test2>value2</test2>
<test3>value3</test3>
<test4>value4</test4>
</result>
then the following XSLT 2.0 stylesheet will do it:

Code: Select all


<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:output indent="yes"/>
<xsl:template match="root">
<result>
<xsl:apply-templates/>
</result>
</xsl:template>
<xsl:template match="name">
<xsl:for-each select="tokenize(., '\|')">
<xsl:variable name="nameValue" select="tokenize(., ',')"/>
<xsl:if test="$nameValue[2]">
<xsl:element name="{$nameValue[1]}">
<xsl:value-of select="$nameValue[2]"/>
</xsl:element>
</xsl:if>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Regards,
George
George Cristian Bina
sandeepm
Posts: 10
Joined: Tue Feb 24, 2009 1:54 pm

Re: String parsing and matching

Post by sandeepm »

Hi George,

Thanks for your help.

The output document is not derived only from the given source document; it also contains some other predefined nodes.
The delimited string from the source document will be parsed to insert or update the values of these predefined nodes.

Cheers, Sandeep
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: String parsing and matching

Post by george »

Ok, suppose you have a document test.xml with the following content:

Code: Select all


<root>
<test1>value1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>value2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>value3</test3>
</some-node>
<test4>value4</test4>
</root>
and a document called updates.xml with the following content

Code: Select all


<root>
<name>test1,newValue1|test2,newValue2|test3,newValue3|test4,newValue4|</name>
</root>
then the stylesheet below will update the test1, test2, test3 and test4 values according to the values from the updates.xml file:

Code: Select all


<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:variable name="updates" as="element()">
<result>
<xsl:apply-templates select="doc('updates.xml')/root/name" mode="readUpdates"/>
</result>
</xsl:variable>
<xsl:template match="name" mode="readUpdates">
<xsl:for-each select="tokenize(., '\|')">
<xsl:variable name="nameValue" select="tokenize(., ',')"/>
<xsl:if test="$nameValue[2]">
<xsl:element name="{$nameValue[1]}">
<xsl:value-of select="$nameValue[2]"/>
</xsl:element>
</xsl:if>
</xsl:for-each>
</xsl:template>

<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>

<xsl:template match="*[$updates/*[name()=current()/name()]]">
<xsl:copy>
<xsl:value-of select="$updates/*[name()=current()/name()]"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
resulting in the following output:

Code: Select all


<?xml version="1.0" encoding="UTF-8"?><root>
<test1>newValue1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>newValue2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>newValue3</test3>
</some-node>
<test4>newValue4</test4>
</root>
Regards,
George
George Cristian Bina
sandeepm
Posts: 10
Joined: Tue Feb 24, 2009 1:54 pm

Re: String parsing and matching

Post by sandeepm »

Hey George,

Many Thanks for your help. :D

The solution you have provided is very close to what I wanted, however there is a slight change in the test.xml as below:

Code: Select all


<root>
<test1>$variable1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>$variable2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>$variable3</test3>
</some-node>
<test4>$variable4</test4>
<test1>SomeOtherValue</test1> <!-- Duplicate test1 node which does not need to be replaced -->
</root>
The $variable1, $variable2, etc. placeholders are inserted because test.xml may contain duplicate nodes, and not all of them need to be replaced with the new value.

Many Thanks once again for your time.

Cheers,
Sandeep
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: String parsing and matching

Post by george »

It gets more complicated with each iteration, huh? :)

The solution in this case is to use a meta stylesheet: based on updates.xml, it produces another stylesheet as output, and that generated stylesheet, when applied to test.xml, gives you the final output. A full working example follows.

Suppose we have updates.xml as below, where test3 is updated only if it is inside some-node:

updates.xml

Code: Select all


<root>
<name>test1,newValue1|test2,newValue2|some-node/test3,newValue3|test4,newValue4|</name>
</root>
Then we take a test.xml that has two occurrences of test3, one inside a some-node element and one inside a some-node2 element - that means we expect only the first occurrence to be updated.

test.xml

Code: Select all


<root>
<test1>value1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>value2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>value3</test3>
</some-node>
<some-node2>
<child-node>note</child-node>
... more child nodes
<test3>KEEP THIS VALUE</test3>
</some-node2>
<test4>value4</test4>
</root>
The updates.xsl below will generate a stylesheet.
updates.xsl

Code: Select all


<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"
xmlns:a="alias-for-xsl-namespace">
<xsl:namespace-alias stylesheet-prefix="a" result-prefix="xsl"/>

<xsl:output indent="yes"/>
<xsl:variable name="updates" as="element()">
<result>
<xsl:apply-templates select="doc('updates.xml')/root/name" mode="readUpdates"/>
</result>
</xsl:variable>
<xsl:template match="name" mode="readUpdates">
<xsl:for-each select="tokenize(., '\|')">
<xsl:variable name="nameValue" select="tokenize(., ',')"/>
<xsl:if test="$nameValue[2]">
<update match="{$nameValue[1]}">
<xsl:value-of select="$nameValue[2]"/>
</update>
</xsl:if>
</xsl:for-each>
</xsl:template>

<xsl:template match="/">
<a:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<a:template match="node() | @*">
<a:copy>
<a:apply-templates select="node() | @*"/>
</a:copy>
</a:template>
<xsl:apply-templates select="$updates/*"/>
</a:stylesheet>
</xsl:template>

<xsl:template match="update">
<a:template match="{@match}">
<a:copy>
<a:apply-templates select="@*"/>
<a:text><xsl:value-of select="."/></a:text>
</a:copy>
</a:template>
</xsl:template>
</xsl:stylesheet>
For the content of updates.xml from above the generated stylesheet (let's name it test.xsl) is

Code: Select all


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="test1">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:text>newValue1</xsl:text>
</xsl:copy>
</xsl:template>
<xsl:template match="test2">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:text>newValue2</xsl:text>
</xsl:copy>
</xsl:template>
<xsl:template match="some-node/test3">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:text>newValue3</xsl:text>
</xsl:copy>
</xsl:template>
<xsl:template match="test4">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:text>newValue4</xsl:text>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Now, when you apply this test.xsl on the test.xml file the result is

Code: Select all


<?xml version="1.0" encoding="UTF-8"?><root>
<test1>newValue1</test1>
<some-node>
<child-node>note</child-node>
</some-node>
<test2>newValue2</test2>
<some-node>
<child-node>note</child-node>
... more child nodes
<test3>newValue3</test3>
</some-node>
<some-node2>
<child-node>note</child-node>
... more child nodes
<test3>KEEP THIS VALUE</test3>
</some-node2>
<test4>newValue4</test4>
</root>
where you can see that only the test3 inside some-node was updated.

Regards,
George
George Cristian Bina
sandeepm
Posts: 10
Joined: Tue Feb 24, 2009 1:54 pm

Re: String parsing and matching

Post by sandeepm »

Hi George,

I really appreciate the time and effort you are putting into this. Nothing I can say will really do justice to your help.

However, the system I am working on has limitations: it can NOT run two XSL transformations, so the two-stylesheet solution does not fulfill the requirement.
Also, the test.xml can NOT include the XPath expression (some-node/test3) but only the node name (test3), as it's the output of some other system which later becomes the input for my system.

Thanks once again, sorry to bother you again.

Cheers,
Sandeep
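
The two constraints above (one transformation only, plain node names in the update string) can be combined with the $variable placeholders from the earlier test.xml. The following is only a sketch under stated assumptions, not a tested answer from this thread: it assumes updates.xml sits next to the stylesheet, that every node to be replaced has no child elements and text starting with '$', and it uses a hypothetical intermediate element named update.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Parse the delimited string once into <update name="...">value</update>
       elements ("update" is a made-up name used only inside this stylesheet). -->
  <xsl:variable name="updates" as="element(update)*">
    <xsl:for-each select="tokenize(doc('updates.xml')/root/name, '\|')">
      <xsl:variable name="nameValue" select="tokenize(., ',')"/>
      <!-- Skip the empty token produced by the trailing pipe -->
      <xsl:if test="$nameValue[2]">
        <update name="{$nameValue[1]}">
          <xsl:value-of select="$nameValue[2]"/>
        </update>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Leaf elements whose text starts with '$' (assumed placeholder marker)
       and whose name appears in the update list get the new value. -->
  <xsl:template match="*[not(*)][starts-with(normalize-space(.), '$')][name() = $updates/@name]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:value-of select="$updates[@name = name(current())][1]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With this pattern, a duplicate such as <test1>SomeOtherValue</test1> falls through to the identity template because its text does not start with '$', so it is copied unchanged. If the placeholder marker can also occur in ordinary values, the predicate would need to be stricter.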
phoebe
Posts: 1
Joined: Wed Feb 25, 2009 6:19 pm

Re: String parsing and matching

Post by phoebe »

Hi All,

I have a similar requirement.

Input.xml
<sender_details>
<sender_company>sender company name</sender_company>
<sender_name>Name</sender_name>
<sender_email>test@test.com</sender_email>
<sender_phone>123456789</sender_phone>
<sender_mobile>8888888</sender_mobile>
<remarks>sender_address_street,streetname|sender_address_county,countyname</remarks>
</sender_details>
and my output should look like below:
<information>
<sender_address_street>streetname</sender_address_street>
<sender_address_county>countyname</sender_address_county> <!-- The XSL should replace this value -->
<someothernode>
<sender_address_county>KEEP THIS VALUE</sender_address_county> <!-- This value should NOT be replaced from the input -->
</someothernode>
</information>
Could you suggest how to do this, please?

thanks a million, P
sandeepm
Posts: 10
Joined: Tue Feb 24, 2009 1:54 pm

Re: String parsing and matching

Post by sandeepm »

Dear All,

I finally found a solution to my problem. It's not foolproof yet, but it gives us a good approach.

I have created a function using <xsl:function> which takes two parameters: the input string and the key whose value needs to be extracted.

So, suppose we have the string
key1,value1|key2,value2|key3,value3 and we want to get the value for key1.
Then a call to 'string-parse-match' with this string and 'key1' will return 'value1'. Similarly, if key2 is passed, value2 is returned, and if key3 is passed, value3 is returned.
The function code is shown below. It needs some optimization, but as I said, it gives us a good approach.
Also, please note that this function was tested with XSLT 2.0 and Saxon-B 9.1.0.3.

Code: Select all


<xsl:function name="default:string-parse-match" as="xs:string">
  <!-- inputStr: delimited string such as key1,value1|key2,value2 -->
  <xsl:param name="inputStr" as="xs:string"/>
  <!-- inputVar: key whose value should be returned -->
  <xsl:param name="inputVar" as="xs:string"/>

  <xsl:variable name="returnValue">
    <!-- Note: contains() matches substrings, so key1 is also found inside key10 -->
    <xsl:if test="contains($inputStr, $inputVar)">
      <!-- 1-based position where the key starts -->
      <xsl:variable name="tempIndex" select="string-length(substring-before($inputStr, $inputVar))+1" as="xs:integer"/>
      <!-- Everything after the key, i.e. ",value|..." -->
      <xsl:variable name="tempString" select="substring($inputStr, $tempIndex + string-length($inputVar), string-length($inputStr))" as="xs:string"/>
      <!-- Text up to the next pipe; empty when this is the last pair -->
      <xsl:variable name="test" select="substring-before($tempString, '|')"/>
      <!-- 0 when no pipe follows, otherwise the position of the pipe in tempString -->
      <xsl:variable name="tempIndex2">
        <xsl:choose>
          <xsl:when test="normalize-space($test) = ''">
            <xsl:value-of select="number(0)"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="string-length(substring-before($tempString, '|'))+1"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>
      <!-- Skip the leading comma and return the value -->
      <xsl:choose>
        <xsl:when test="$tempIndex2 > 0">
          <xsl:value-of select="substring($tempString, 2, $tempIndex2 - 2)"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="substring($tempString, 2, string-length($tempString))"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:if>
  </xsl:variable>

  <xsl:sequence select="$returnValue"/>
</xsl:function>
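
Because the function above locates the key with contains(), a key such as key1 would also be found inside key10 or inside a value. One way this might be tightened is to split the string with tokenize() and compare whole keys. The sketch below is an illustration only: the my prefix, its namespace URI and the function name my:lookup-value are hypothetical, and it assumes keys and values contain neither ',' nor '|'.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:my="urn:example:string-utils"
                version="2.0" exclude-result-prefixes="xs my">

  <!-- Returns the value paired with $key, or '' when the key is absent.
       (my:lookup-value is an illustrative name, not part of any library.) -->
  <xsl:function name="my:lookup-value" as="xs:string">
    <xsl:param name="inputStr" as="xs:string"/>
    <xsl:param name="key" as="xs:string"/>
    <!-- Keep only the pairs whose text before the first comma equals the key exactly -->
    <xsl:variable name="matches" as="xs:string*"
                  select="for $pair in tokenize($inputStr, '\|')
                          return if (substring-before($pair, ',') = $key)
                                 then substring-after($pair, ',')
                                 else ()"/>
    <!-- First match wins; fall back to the empty string -->
    <xsl:sequence select="if (exists($matches)) then $matches[1] else ''"/>
  </xsl:function>

  <!-- Usage example: looks up key2 in the sample string from the post above -->
  <xsl:template match="/">
    <result>
      <xsl:value-of select="my:lookup-value('key1,value1|key2,value2|key3,value3', 'key2')"/>
    </result>
  </xsl:template>
</xsl:stylesheet>
```

For the sample string, the call with 'key2' yields value2, and a key that is not present yields an empty string rather than a partial match.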
Cheers,
Sandeep

:D