delete duplicate elements

bee123
Posts: 6
Joined: Thu Dec 22, 2016 2:22 pm


Post by bee123 » Tue Sep 11, 2018 3:08 pm

Hello,

I have a large file and have found that some elements appear in it twice. I would like to delete the duplicates. Any ideas how I could do this? I would appreciate any help!

The XML looks like this:

Code:

<Toptag>
<text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 123" name="xyz" >
<div>
This is text
</div>
</text>
<text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 124" name="xyz" >
<div>
This is text
</div>
</text>
<text coordinates="" country="" date="yyyy-mm-dd" lang="" place="xyc" time="" id=" 123" name="xyz" >
<div>
This is text
</div>
</text>
....
</Toptag>

In each duplicate, everything from the opening <text ...> tag to the closing </text> tag, including the <div> content, is exactly the same!

Thank you!

Radu
Posts: 6211
Joined: Fri Jul 09, 2004 5:18 pm

Re: delete duplicate elements

Post by Radu » Wed Sep 12, 2018 9:14 am

Hi,

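There is an XPath 2.0 function called "deep-equal" (usable from XSLT 2.0 onwards) which might help. It compares two nodes together with their attributes and all of their content.
Something like this: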

Code:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- Identity template: copy every node and attribute unchanged. -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Copy an element only if no preceding sibling is deep-equal to it.
         The explicit priority avoids an ambiguous match with the identity template. -->
    <xsl:template match="*" priority="1">
        <xsl:variable name="current" select="."/>
        <xsl:if test="not(preceding-sibling::*[deep-equal(., $current)])">
            <xsl:copy>
                <xsl:apply-templates select="node() | @*"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>
</xsl:stylesheet>
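
If only the duplicated <text> elements should be dropped, the same idea can be scoped to them. The following is a minimal sketch, not something tested against the real file. It assumes the duplicates are sibling <text> elements, as in the sample above, and that they match exactly, including the whitespace inside <div>. It needs an XSLT 2.0 processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

    <!-- Identity template: copy everything that is not matched below. -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Empty template: drop a <text> element when an earlier sibling <text>
         is deep-equal to it. The first occurrence has no such sibling,
         so it falls through to the identity template and is kept. -->
    <xsl:template match="text[some $p in preceding-sibling::text
                              satisfies deep-equal($p, .)]"/>
</xsl:stylesheet>
```

Because the second match pattern has a predicate, its default priority (0.5) is already higher than that of the identity template, so no explicit priority attribute is needed. Each <text> element is compared with all of its preceding siblings, so on a very large file this may be slow.
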
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
