
delete duplicate elements

Posted: Tue Sep 11, 2018 3:08 pm
by bee123
Hello,

I have a large file and have found that some elements appear in it twice. I would like to delete the duplicates. Any ideas on how I could do this? I would appreciate any help!

The xml looks like this:

Code:

<Toptag>
    <text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 123" name="xyz" >
        <div>
            This is text
        </div>
    </text>
    <text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 124" name="xyz" >
        <div>
            This is text
        </div>
    </text>
    <text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 123" name="xyz" >
        <div>
            This is text
        </div>
    </text>
    ....
</Toptag>
In the duplicates, everything from the opening <text ...> tag, through the <div> ... </div>, to the closing </text> tag is exactly the same!

Thank you!!!!!!

Re: delete duplicate elements

Posted: Wed Sep 12, 2018 9:14 am
by Radu
Hi,

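There is an XPath 2.0 function called "deep-equal" (so you need an XSLT 2.0 processor) which might help.
Something like this, which leaves out any element that has an identical preceding sibling: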

Code:

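<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- Identity template: copies attributes, text, comments and so on unchanged. -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Copy an element only if no preceding sibling is deep-equal to it.
         The explicit priority avoids a conflict with the identity template,
         because "*" and "node()" have the same default priority. -->
    <xsl:template match="*" priority="1">
        <xsl:variable name="current" select="."/>
        <xsl:if test="not(preceding-sibling::*[deep-equal(., $current)])">
            <xsl:copy>
                <xsl:apply-templates select="node() | @*"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>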
Regards,
Radu
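
A note on the approach above for large files: it compares each element with all of its preceding siblings, so it can become slow as the number of siblings grows. If the duplicates can be recognized by their id attribute alone (an assumption based on the sample, where both copies have id=" 123"; it is not confirmed in the thread), a grouping-based stylesheet is one possible alternative. The following is only a sketch under that assumption. It also assumes the root element is named Toptag and that only its text children need to be kept.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

    <!-- Identity template: copies everything not handled below. -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: the root element is named Toptag, as in the sample. -->
    <xsl:template match="Toptag">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <!-- Assumption: two text elements with the same id are duplicates.
                 Only the first element of each id group is copied. -->
            <xsl:for-each-group select="text" group-by="@id">
                <xsl:apply-templates select="current-group()[1]"/>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Unlike the deep-equal version, this does not check that the content of the two elements is really identical. It also leaves out text elements that have no id attribute, as well as any children of Toptag that are not text elements, so it only fits if those limitations are acceptable for the file in question.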