Running identity transform creates unnecessary classes


Post by mdslup » Tue Nov 24, 2020 10:32 pm

Running the XSLT identity transform on a DITA document adds a bunch of classes to all the elements.

For example, this file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="MyTable">
    <title>Table Test</title>
    <body>
        <section>
            <title>Section Title</title>
            <p><b id="1010title">Section Header</b><table frame="all" rowsep="1" colsep="1"
                    rowheader="headers" id="1010table">
                    <tgroup cols="4" align="left">
                        <colspec colname="c1" colnum="1" colwidth="1*"/>
                        <colspec colname="c2" colnum="2" colwidth="1*"/>
                        <colspec colname="c4" colnum="3" colwidth="1*"/>
                        <colspec colname="c5" colnum="4" colwidth="1*"/>
                        <tbody>
                            <row>
                                <entry><b>ID</b></entry>
                                <entry><b>Name</b></entry>
                                <entry><b>Part</b></entry>
                                <entry><b>Has procedure?</b></entry>
                            </row>
                            <row>
                                <entry>101010010</entry>
                                <entry>Component1</entry>
                                <entry>1234</entry>
                                <entry>yes</entry>
                            </row>
                            <row>
                                <entry>101010110</entry>
                                <entry>Component2</entry>
                                <entry>4567</entry>
                                <entry>no</entry>
                            </row>
                        </tbody>
                    </tgroup>
                </table></p>
        </section>
    </body>
</topic>

with this transform (the identity transform):

<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

produces this output:

<?xml version="1.0" encoding="UTF-8"?><topic xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/" id="MyTable" ditaarch:DITAArchVersion="1.3" domains="(topic abbrev-d)                            a(props deliveryTarget)                            (topic equation-d)                            (topic hazard-d)                            (topic hi-d)                            (topic indexing-d)                            (topic markup-d)                            (topic mathml-d)                            (topic pr-d)                            (topic relmgmt-d)                            (topic sw-d)                            (topic svg-d)                            (topic ui-d)                            (topic ut-d)                            (topic markup-d xml-d)   " class="- topic/topic ">
    <title class="- topic/title ">Table Test</title>
    <body class="- topic/body ">
        <section class="- topic/section ">
            <title class="- topic/title ">Section Title</title>
            <p class="- topic/p "><b id="1010title" class="+ topic/ph hi-d/b ">Section Header</b><table frame="all" rowsep="1" colsep="1" rowheader="headers" id="1010table" class="- topic/table ">
                    <tgroup cols="4" align="left" class="- topic/tgroup ">
                        <colspec colname="c1" colnum="1" colwidth="1*" class="- topic/colspec "/>
                        <colspec colname="c2" colnum="2" colwidth="1*" class="- topic/colspec "/>
                        <colspec colname="c4" colnum="3" colwidth="1*" class="- topic/colspec "/>
                        <colspec colname="c5" colnum="4" colwidth="1*" class="- topic/colspec "/>
                        <tbody class="- topic/tbody ">
                            <row class="- topic/row ">
                                <entry class="- topic/entry "><b class="+ topic/ph hi-d/b ">ID</b></entry>
                                <entry class="- topic/entry "><b class="+ topic/ph hi-d/b ">Name</b></entry>
                                <entry class="- topic/entry "><b class="+ topic/ph hi-d/b ">Part</b></entry>
                                <entry class="- topic/entry "><b class="+ topic/ph hi-d/b ">Has procedure?</b></entry>
                            </row>
                            <row class="- topic/row ">
                                <entry class="- topic/entry ">101010010</entry>
                                <entry class="- topic/entry ">Component1</entry>
                                <entry class="- topic/entry ">1234</entry>
                                <entry class="- topic/entry ">yes</entry>
                            </row>
                            <row class="- topic/row ">
                                <entry class="- topic/entry ">101010110</entry>
                                <entry class="- topic/entry ">Component2</entry>
                                <entry class="- topic/entry ">4567</entry>
                                <entry class="- topic/entry ">no</entry>
                            </row>
                        </tbody>
                    </tgroup>
                </table></p>
        </section>
    </body>
</topic>

Notice, for example, that each entry element now has a class attribute ("- topic/entry ") that was not in the original. What's going on?


Post by Radu » Wed Nov 25, 2020 9:32 am

Hi,

This is how XSLT processors are expected to work. If the XML has an associated DOCTYPE, the parser feeding the XSLT processor also reads the DTD referenced by it, and if the DTD declares default values for attributes, those values are added to every element that does not set them explicitly. The stylesheet sees them as ordinary attributes, so the identity transform copies them to the output.
If you do not want those default attributes, you probably need to remove the DOCTYPE declaration from the XML, apply the XSLT, and then add the DOCTYPE declaration back.
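
To see the mechanism in isolation, here is a minimal sketch with a made-up document type. The note element and its class default are hypothetical and only imitate what the DITA DTDs do; the DTD is placed in the internal subset so the file is self-contained.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ELEMENT note (#PCDATA)>
  <!-- Default value: supplied by the parser whenever the attribute
       is absent from the element in the source document -->
  <!ATTLIST note class CDATA "- topic/note ">
]>
<note>Hello</note>
```

The source never sets class, yet the parser reports it on the note element, so an identity transform over this file should write class="- topic/note " on the element. Delete the DOCTYPE declaration and the attribute is no longer added.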
Oxygen has custom XML refactoring operations based on XSLT:
https://www.oxygenxml.com/doc/versions/ ... tools.html

These operations do something similar: they remove the DOCTYPE, apply the XSLT, and add the DOCTYPE back to the result.
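
If stripping the DOCTYPE before the transform is not convenient, another possible approach is to let the parser add the defaults and have the stylesheet drop them again. This is only a sketch under several assumptions: it removes just the three defaulted attributes visible in the output above (class, domains and ditaarch:DITAArchVersion), it assumes none of them is ever set by hand in the source, and the xsl:output values are copied from the DOCTYPE of the sample topic, so they would be wrong for maps or other topic types.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/">

    <!-- Write the DOCTYPE back; values assumed from the sample topic -->
    <xsl:output method="xml"
        doctype-public="-//OASIS//DTD DITA Topic//EN"
        doctype-system="topic.dtd"/>

    <!-- Identity template; copy-namespaces="no" keeps the
         xmlns:ditaarch declaration out of the copied elements -->
    <xsl:template match="@*|node()">
        <xsl:copy copy-namespaces="no">
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Drop the attributes that the DTD supplies as defaults -->
    <xsl:template match="@class | @domains | @ditaarch:DITAArchVersion"/>
</xsl:stylesheet>
```

The empty template wins over the identity template because a named attribute pattern has a higher default priority than @*. Any other attribute that the DTD defaults (and any namespace declaration you actually need in the result) would have to be handled separately.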

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com


Post by mdslup » Thu Nov 26, 2020 12:43 am

Thanks very much.
