[xsl] How to do this tricky elimination on XML using XSLT ?


Subject: [xsl] How to do this tricky elimination on XML using XSLT ?
From: Jo Na <jkoe888@xxxxxxxxx>
Date: Mon, 18 Jun 2012 11:54:17 +0700

Hi,
I have this input xml:
    <map>
        <region>
            <gridA id="1">
                <blockA id="01" method="build">
                    <building1 id="x" method="build">
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be removed -->
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="01">
                    <building1 id="x" method="modify">
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be kept (the previous node has the same id but a different method, so it is not considered successive) -->
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="02">
                    <building3 id="y" method="modify">
                        <otherchild>b</otherchild>
                    </building3>
                    <building2 id="x" method="demolish"/>
                </blockA>

                <blockA id="01">
                    <building1 id="y" method="build"> <!-- this one will be kept (diff id) -->
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be removed -->
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="02">
                    <building3 id="y" method="modify"> <!-- this one will be removed -->
                        <otherchild>b</otherchild>
                    </building3>
                    <building2 id="x" method="demolish"/> <!-- this one will be removed -->
                </blockA>
            </gridA>

            <gridA id="2">
                <blockA id="01" method="build">
                    <building1 id="x" method="build">
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be removed -->
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be kept (diff children) -->
                        <otherchild>b</otherchild>
                    </building1>
                </blockA>
                <blockA id="01">
                    <building1 id="x" method="build"> <!-- this one will be removed -->
                        <otherchild>b</otherchild>
                    </building1>
                </blockA>
            </gridA>
            <gridB id="1">
                ...and so on..
            </gridB>
        </region>
    </map>

Expected Output:

    <map>
        <region>
            <gridA id="1">
                <blockA id="01" method="build">
                    <building1 id="x" method="build">
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="01">
                    <building1 id="x" method="modify">
                        <otherchild>a</otherchild>
                    </building1>
                    <building1 id="x" method="build"> <!-- this one will be kept (the previous node has the same id but a different method, so it is not considered successive) -->
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="02">
                    <building3 id="y" method="modify">
                        <otherchild>b</otherchild>
                    </building3>
                    <building2 id="x" method="demolish"/>
                </blockA>

                <blockA id="01">
                    <building1 id="y" method="build"> <!-- this one will be kept (diff id) -->
                        <otherchild>a</otherchild>
                    </building1>
                </blockA>

                <blockA id="02"/>
            </gridA>

            <gridA id="2">
                <blockA id="01" method="build">
                    <building1 id="x" method="build">
                        <otherchild>a</otherchild>
                    </building1>

                    <building1 id="x" method="build"> <!-- this one will be kept (diff children) -->
                        <otherchild>b</otherchild>
                    </building1>
                </blockA>
                <blockA id="01"/>
            </gridA>
            <gridB id="1">
                ...and so on..
            </gridB>
        </region>
    </map>

The XSLT so far:

    <xsl:stylesheet version="2.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:output indent="yes"/>
        <xsl:strip-space elements="*"/>

        <xsl:template match="node()|@*">
            <xsl:copy>
                <xsl:apply-templates select="node()|@*"/>
            </xsl:copy>
        </xsl:template>

        <xsl:template match="region/*/*/*
             [deep-equal(.,preceding::*[name()=current()/name()]
                           [@id = current()/@id]
                           [../../@id = current()/../../@id][1])]" />
    </xsl:stylesheet>

The problem with the XSLT right now is that it cannot correctly identify
duplicates that occur across siblings (i.e. `blockA` elements with the same id).

I need to remove nodes that are considered *repetitive*.

**Two nodes that have the same `name` and `id` are considered
*repetitive* if one appears directly after the other (among the nodes with
that `name` and `id`) and they have the same `method` and `children`.**

**For example:**

    <elem id="1" method="a" />
    <elem id="1" method="a" /> <!-- this is repetitive for id=1-->
    <elem id="1" method="b" />
    <elem id="1" method="a" /> <!-- this is the new boundary for removal for id=1 -->
    <elem id="2" method="a" />
    <elem id="1" method="a" /> <!-- this is repetitive for id=1 -->
    <elem id="2" method="a" /> <!-- this is repetitive for id=2 -->

**will be simplified into:**

    <elem id="1" method="a" />
    <elem id="1" method="b" />
    <elem id="1" method="a" /> <!-- this is the new boundary for removal for id=1 -->
    <elem id="2" method="a" />
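
To make the rule concrete, here is a minimal sketch in Python (not XSLT) that applies it to the flat `elem` list above. It is only an illustration of the intended behaviour: the function name `drop_repeats` and the tuple representation `(name, id, method)` are assumptions made for this sketch, and children are left out because every `elem` here is empty.

```python
def drop_repeats(elems):
    """Drop elements that repeat the most recent element with the same (name, id).

    elems: list of (name, id, method) tuples in document order.
    """
    last_seen = {}  # (name, id) -> method of the most recent element with that key
    kept = []
    for name, ident, method in elems:
        key = (name, ident)
        if key in last_seen and last_seen[key] == method:
            continue  # same name, id and method as the previous one: repetitive
        last_seen[key] = method  # a different method resets the boundary for this id
        kept.append((name, ident, method))
    return kept


elems = [
    ("elem", "1", "a"),
    ("elem", "1", "a"),
    ("elem", "1", "b"),
    ("elem", "1", "a"),
    ("elem", "2", "a"),
    ("elem", "1", "a"),
    ("elem", "2", "a"),
]
result = drop_repeats(elems)
for item in result:
    print(item)
```

The key point is that the comparison is made per `(name, id)` pair against the most recent element with that pair, not against the immediately preceding sibling, which is why the second-to-last `elem` with `id="1"` is dropped even though an `id="2"` element sits in between.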

 - **Every time a successive node with the `same id` has a `different method`,
   the `boundary` for the next removal for that `id` is reset.**

 - We need to take into account duplicates that are under one parent
or under sibling parents (two or more parent nodes that have the same element
name and id), i.e. `blockX` in the example.
 - If the two nodes being compared do not share the same `gridX`
level, then they should not be considered duplicates to be removed.
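
The rules above, applied to the nested structure, can be sketched in Python with the standard `xml.etree.ElementTree` module. This is a reference for the expected behaviour rather than an XSLT solution. The helper names `signature` and `remove_repeats` are hypothetical, the sample is a shortened variant of the input with a second grid added, and the sketch assumes that "same `method` and `children`" can be checked by comparing all attributes, text and child elements.

```python
import xml.etree.ElementTree as ET


def signature(elem):
    """Hashable summary of an element: tag, attributes, text and children."""
    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        (elem.text or "").strip(),
        tuple(signature(child) for child in elem),
    )


def remove_repeats(root):
    """Remove repetitive building elements in place, one grid at a time."""
    for grid in root.findall("./region/*"):
        # A fresh dictionary per grid: nodes in different grids are never compared.
        last_seen = {}  # (name, id) -> signature of the most recent such element
        for block in list(grid):
            # The dictionary is shared by all blocks of the grid, so a repeat
            # in a sibling block is detected as well.
            for building in list(block):
                key = (building.tag, building.get("id"))
                sig = signature(building)
                if last_seen.get(key) == sig:
                    block.remove(building)
                else:
                    last_seen[key] = sig  # new boundary for this (name, id)
    return root


SAMPLE = """
<map><region>
  <gridA id="1">
    <blockA id="01">
      <building1 id="x" method="build"><otherchild>a</otherchild></building1>
      <building1 id="x" method="build"><otherchild>a</otherchild></building1>
      <building1 id="x" method="build"><otherchild>b</otherchild></building1>
    </blockA>
    <blockA id="01">
      <building1 id="x" method="build"><otherchild>b</otherchild></building1>
    </blockA>
  </gridA>
  <gridA id="2">
    <blockA id="01">
      <building1 id="x" method="build"><otherchild>b</otherchild></building1>
    </blockA>
  </gridA>
</region></map>
"""

root = remove_repeats(ET.fromstring(SAMPLE))
# Number of buildings left in each block, grouped by grid.
counts = [[len(block) for block in grid] for grid in root.find("region")]
print(counts)  # [[2, 0], [1]]
```

In the first grid the second building is dropped, the third is kept because its child differs, and the one in the sibling block is dropped, leaving an empty `blockA`. The building in the second grid is kept because it is in a different grid.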

Please let me know how to achieve such transformation using XSLT.
Thanks very much for the help.

