xslt in notepad++ vs XML author: +/- output

Post by deepbluesea70 » Tue Aug 27, 2019 4:16 pm

Hi there,

I'm working on an XSLT stylesheet to convert a DITA topic (the result of oXygen's markdown2dita conversion) to a concept topic (a more specific DITA topic). I started out in Notepad++, but since I got tired of the lack of tool support for finding bugs, I have now switched to XML Author.

Using Saxon-PE 9.8.0.12 as the transformer, I had to adjust a few things that the XML plugin in Notepad++ apparently ignored, but I got my stylesheet to work in both XML Author and Notepad++.

However, the transformation result looks very different in XML author:
  • Throughout the document, I match elements like *topic* and *body* and replace them with *section* and *subsection*, respectively. That works in principle. However, although I left the source elements *topic*, *body* and the like out of the output, text derived from them is added anyway, for example:

    Code: Select all

    <li>- topic/li The user can either delete the user account resources manually or run
                                <codeph>+ topic/ph pr-d/codeph terraform destroy</codeph>:
    -> where is the "+ topic/ph pr-d" coming from?
It's as if XML Author wanted to show me which nodes of the source DITA document are parsed (the "+" and "-" suggest that), but how do I get rid of this text? Or is it output coming from Saxon?

Any idea how I could solve this?

Thank you,
Jens

Post by Radu » Wed Aug 28, 2019 9:02 am

Hi Jens,

The XML document has a DOCTYPE declaration. When an XSLT stylesheet is applied, the XSLT processor (Saxon 9 in this case), being compliant with the XML standard, needs to expand the DOCTYPE declaration and supply all default attribute values (the DITA @class attribute values, for example) to the processed XML document.
For example if I open in Oxygen this very small DITA topic:

Code: Select all

<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="id">
    <title>T</title>
    <shortdesc>S</shortdesc>
    <body>
        <p>P</p>
    </body> 
</topic>
and I apply over it this XSLT stylesheet which contains a copy template (copies all the input to the output):

Code: Select all

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <!-- Copy all input nodes to the output -->
    <xsl:template match="node() | @*">
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
then I obtain this result:

Code: Select all

<topic xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/" id="id" ditaarch:DITAArchVersion="1.3" domains="(topic abbrev-d)                            a(props deliveryTarget)                            (topic equation-d)                            (topic hazard-d)                            (topic hi-d)                            (topic indexing-d)                            (topic markup-d)                            (topic mathml-d)                            (topic pr-d)                            (topic relmgmt-d)                            (topic sw-d)                            (topic svg-d)                            (topic ui-d)                            (topic ut-d)                            (topic markup-d xml-d)   " class="- topic/topic ">
    <title class="- topic/title ">T</title>
    <shortdesc class="- topic/shortdesc ">S</shortdesc>
    <body class="- topic/body ">
        <p class="- topic/p ">P</p>
    </body> 
</topic>
Those extra attributes which appeared all come as default attribute values from the DTD (from the DOCTYPE declaration referenced in the original topic).

But you can write your XSLT stylesheet to remove those unwanted attributes and namespace declarations, something like this:

Code: Select all

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs ditaarch"
    xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/"
    version="2.0">
    
    <!-- Omit attributes coming from the DTDs -->
    <xsl:template match="@class | @domains | @ditaarch:DITAArchVersion"/>
    
    <xsl:template match="node() | @*">
        <xsl:copy copy-namespaces="no">
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>
    
</xsl:stylesheet>
For example that extra copy-namespaces="no" attribute on the xsl:copy will avoid copying the namespace declarations to the output file.
About the output you obtain:

Code: Select all

<li>- topic/li .....
That particular "- topic/li" is the original @class attribute set on the <li>. It probably appears because of a mistake somewhere in the XSLT: the @class attribute gets matched and is output as a text node. But without having your entire XSLT stylesheet it's hard to know.
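
As a guess at the cause (the actual stylesheet isn't shown, so this is only a sketch): a template creates a new element and then applies templates to `@* | node()`, while no template matches attributes. XSLT's built-in rule for attribute nodes outputs the attribute's string value, so the defaulted @class value ends up as text. The `li` template below is hypothetical; only the element name comes from the snippet above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

    <!-- Symptom (hypothetical): a template like
             <xsl:template match="li">
                 <li><xsl:apply-templates select="@* | node()"/></li>
             </xsl:template>
         with no template matching attributes. The built-in rule for
         attribute nodes then writes the value of @class ("- topic/li ")
         into the output as text. -->

    <!-- Fix 1: drop the attributes defaulted from the DTD. -->
    <xsl:template match="@class | @domains"/>

    <!-- Fix 2: copy any remaining attribute as an attribute. -->
    <xsl:template match="@*">
        <xsl:copy/>
    </xsl:template>

    <!-- Attributes are processed first, then child nodes. -->
    <xsl:template match="li">
        <li>
            <xsl:apply-templates select="@*"/>
            <xsl:apply-templates select="node()"/>
        </li>
    </xsl:template>

</xsl:stylesheet>
```

The same built-in rule would explain the "+ topic/ph pr-d/codeph" text inside <codeph> if that template is written the same way. In this sketch, elements without their own template fall back to the built-in element rule, which processes only child nodes and never attributes.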

About your original task of converting a DITA topic to a concept, Oxygen already has an XML refactoring action included for this. You can just right-click inside the opened DITA topic and choose "Refactoring=>Convert to concept". We also do this with XSLT; the top-level stylesheet which converts any DITA topic type to concept is this one:

OXYGEN_INSTALL_DIR\frameworks\dita\refactoring\dita-files-conversion-stylesheets\convert-resource-to-concept-entrypoint.xsl

Our XSLT stylesheet has some extensions which can only be used when it's bundled with an XML refactoring operation:

https://www.oxygenxml.com/doc/versions/ ... tools.html

Also Oxygen's XML refactoring operations avoid expanding the DOCTYPE declaration and thus adding default attributes when the XML document is processed via XSLT.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

Post by deepbluesea70 » Wed Aug 28, 2019 10:16 am

[Screenshot: XML-Refactoring-Ixiasoft-DITA-CMS.png]
Hi Radu,

thank you very much for your quick and helpful answer. You made my day!

I wasn't aware that there is such a refactoring action included for what I am trying to do. Note that our version of the oXygen editor (XML Author 20.1, build: 2018161313) is included in Ixiasoft's DITA CMS so things might look a little bit different in this context.

What I currently do in Ixiasoft is:
1. Use File -> Open File... to open a markdown file.
2. To convert this to DITA (topic class), I choose Export as DITA Topic in the DITA preview tab.
3. I use XML -> Apply Transformation Scenario(s) to apply my own stylesheet on the DITA topic.

So I could try to replace step 3 with what you suggest. I looked for the refactoring stylesheet you referred to by using the menu XML -> XML Refactoring... . I chose xr_dita_convert_resource_to_concept as the refactoring operation, because that seemed closest to what I want to achieve, but I got lots of errors when I tried to execute it. Maybe this option is not available with the version of oXygen that comes with Ixiasoft? See my screenshot for a complete list of the refactoring operations that are offered to me.

Best regards,
Jens

Post by Radu » Thu Aug 29, 2019 9:24 am

Hi Jens,

Could you also paste (or give screenshots with) some of the errors you get when trying to use the refactoring action?

When our Oxygen Eclipse plugin is used with the Eclipse + Ixiasoft workbench, it's customized in various ways by Ixiasoft so this might explain why the conversion does not work for you. How about if you install a separate Oxygen (either as a standalone executable or within a new Eclipse installation as an Eclipse plugin)? Your license key should work also with the new installation and you could use that installation for conversions once you have exported the topics out of Ixiasoft.

Regards,
Radu

Post by deepbluesea70 » Thu Aug 29, 2019 10:37 am

Hi Radu,

I am looking for a process that many authors could follow without the need to install additional tools. My first idea was that they at least would have to install an XSLT engine locally, but now I am glad I found the already existing option to trigger an XSLT transformation right in the Ixiasoft system - that simplifies things a lot.

Regarding the errors: I tried it again today and was at least able to see a preview of the conversion from topic to concept. I'm not sure why (maybe because my MD file was simpler). The whole issue seems to be file-system related. I get errors when I save my DITA file (after the MD to DITA conversion):
[Screenshot: 02-Errors-after-saving-dita-file.png]
And after clicking on "Finish" I get more errors:
[Screenshot: 05-Errors-after-clicking-on-Finish.png]
I could not see the result anywhere, not in a new window, and there was no message where the result was saved (if it was saved). That's maybe a configuration issue in Ixiasoft and I'd have to talk to our administrators here to see if we can do something about it.

I will continue to work on my stylesheet for now to see if I can make that run properly in oXygen/Ixiasoft without the superfluous outputs of the DTD namespaces (thanks again for this hint!). Btw, I can see from the preview that all "topic" elements are converted to "concept" elements, even for those on a lower level. Not sure if this will work for us, the lower level elements should be "sections" or "subsections", right?
[Screenshot: 04-Transformation-Preview.png]
Anyway, using a standard stylesheet of oXygen with own enhancements looks like an interesting option to me that I will surely check on again in more detail later.

Thanks for your good support! :)
- Jens

PS: The MD file I used:

Code: Select all

# Prerequisites

Before you start with the show...

## Sesame Street

You have the keys for the building.

As director, you have activated the following equipment:

-   *Bert's headphones* \(BH\).
-   *Ernie's drum kit* \(EDK\).
-   *Oscar's bin* \(OB\).

    1.  Go to the Sesame Street console.
    2.  Create an instance of Bert's headphones.

Be sure you have done the following:

1.  Put the following items on the stage:

    -   `headphones`
    -   `bananas`
    -   `drumKit`

1.  Invite `actor1`, `actor2`, and `actor3` to the stage.

1.  Make sure:
    -   You have fun
    -   The audience has fun

## Sesame%section_vgp_kvg_l3b

-   Do this
-   And don't forget that

## Perform

No go and show it to an audience.

The result I got from the MD to topic conversion:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="prerequisites">
    <title>Prerequisites</title>
    <body>
        <p>Before you start with the show...</p>
    </body>
    <topic id="sesame_street">
        <title>Sesame Street</title>
        <body>
            <p>You have the keys for the building.</p>
            <p>As director, you have activated the following equipment:</p>
            <ul>
                <li><i>Bert's headphones</i> (BH).</li>
                <li><i>Ernie's drum kit</i> (EDK).</li>
                <li><i>Oscar's bin</i> (OB).<ol>
                        <li>Go to the Sesame Street console.</li>
                        <li>Create an instance of Bert's headphones.</li>
                    </ol></li>
            </ul>
            <p>Be sure you have done the following:</p>
            <ol>
                <li><p>Put the following items on the stage:</p><ul>
                        <li><codeph>headphones</codeph></li>
                        <li><codeph>bananas</codeph></li>
                        <li><codeph>drumKit</codeph></li>
                    </ul></li>
                <li><p>Invite <codeph>actor1</codeph>, <codeph>actor2</codeph>, and
                            <codeph>actor3</codeph> to the stage.</p></li>
                <li><p>Make sure:</p><ul>
                        <li>You have fun</li>
                        <li>The audience has fun</li>
                    </ul></li>
            </ol>
        </body>
    </topic>
    <topic id="sesame_section_vgp_kvg_l3b">
        <title>Sesame%section_vgp_kvg_l3b</title>
        <body>
            <ul>
                <li><p>Do this</p></li>
                <li><p>And don't forget that</p></li>
            </ul>
        </body>
    </topic>
    <topic id="perform">
        <title>Perform</title>
        <body>
            <p>No go and show it to an audience.</p>
        </body>
    </topic>
</topic>


Post by Radu » Thu Aug 29, 2019 2:58 pm

Hi Jens,

About the screenshots you posted: if you go to the "Error log", select one of those "Unhandled event loop" exceptions and right-click it, there should be some kind of "Copy details" action. Can you paste what it copies in a reply? I could also try to talk directly with Ixiasoft about this, to see if we can make our refactoring actions work with content from their CMS.

About this remark:
"Btw, I can see from the preview that all "topic" elements are converted to "concept" elements, even for those on a lower level. Not sure if this will work for us, the lower level elements should be "sections" or "subsections", right?"
It's perfectly legal according to the DITA standard to have nested topics (at any depth). The same goes for nested concepts. In my opinion it's more correct to convert the nested topics to nested concepts. The problem with converting nested topics to DITA <section>s is that sections cannot be nested one inside the other. So if you originally have a topic inside a topic inside a topic (three levels of nested topics), you cannot convert that to a DITA concept with sections: inside the section you will not be able to add another one, and you will be forced to add a section after the section (as a sibling and not as a child).
Also, by converting <topic> to <section> you may lose semantics. In addition, topics can have <prolog> elements, but you cannot add such an element to a DITA <section>.
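
A minimal sketch of that mapping, keeping the nesting intact: every topic at any depth becomes a concept and every body becomes a conbody. This is not the stylesheet shipped with Oxygen, only an illustration under the assumptions that the input is a plain DITA topic like the one posted above and that concept.dtd resolves through the DITA catalogs.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/"
    exclude-result-prefixes="ditaarch"
    version="2.0">

    <!-- Write a concept DOCTYPE so the result can be validated as a concept. -->
    <xsl:output method="xml"
        doctype-public="-//OASIS//DTD DITA Concept//EN"
        doctype-system="concept.dtd"/>

    <!-- Drop attributes defaulted from the topic DTD. -->
    <xsl:template match="@class | @domains | @ditaarch:DITAArchVersion"/>

    <!-- Every topic, at any nesting depth, becomes a concept. -->
    <xsl:template match="topic">
        <concept>
            <xsl:apply-templates select="@* | node()"/>
        </concept>
    </xsl:template>

    <!-- The body of a concept is conbody. -->
    <xsl:template match="body">
        <conbody>
            <xsl:apply-templates select="@* | node()"/>
        </conbody>
    </xsl:template>

    <!-- Everything else is copied unchanged. -->
    <xsl:template match="node() | @*">
        <xsl:copy copy-namespaces="no">
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Because nothing is flattened, the depth of the original topics carries over one to one. A company-specific specialization that only allows a section and a subsection would need its own templates in place of the two renaming ones.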

Regards,
Radu

Post by deepbluesea70 » Thu Aug 29, 2019 5:16 pm

Hi Radu,

I assume I hadn't seen nested concepts before because my company uses a specialization of DITA for the concept topic. They decided that too much nesting confuses readers, so they limited it to one section and one subsection element. So I think there is no right or wrong here; it depends on how companies decide to use DITA.

Good news: Your advice about the DTD elements and namespaces helped. I now get a transformation result that I can use to import markdown to dita by first using the oXygen conversion to convert it to dita class topic and then using my transformation to convert it to concept (our specialization ;-) ).

My sanity check is always to switch from text mode to author mode, and sometimes I can't do this (maybe it's the file length, I need to investigate further). With a few other files this afternoon, all worked fine, I just got the following errors:
[Screenshot: xslt-03-errors.png]
My feeling is that whenever I create a topic using Ixiasoft, the file that is created 'lives in another world' compared to files that I open using Open -> File or files that are generated using the Export as DITA Topic option after the markdown conversion. Maybe the topics need to be saved at a special location?
[Screenshot: xslt-01-conflicting-handlers.png]
[Screenshot: xslt-02-loop-exception.png]
Best regards,
Jens

Post by Radu » Fri Aug 30, 2019 7:38 am

Hi Jens,

I'm glad things are moving along.
Looking at your second screenshot containing an error's details, there is an "Exception Stack Trace" text area containing something like "SWTException: Widget is Disposed". I would be interested in having the entire contents of that text area copied and then pasted on this forum thread. Also I would need to know the precise version of Oxygen plugin you are using (main menu Window->Preferences->"Oxygen XML Editor/Author" page, there should be a version number and a build id there).

Regards,
Radu

Post by deepbluesea70 » Mon Sep 30, 2019 3:34 pm

Hi Radu,

just wanted to give you an update here: I could not reproduce these errors for a while now.

I suspected that they occur whenever I open my Ixiasoft system with the integrated version of the oXygen editor and XML Author, but as I said, I could not reproduce these errors.

If I come across them again, I'll post them here.

Regards,
Jens

Post by Radu » Tue Oct 01, 2019 7:23 am

Hi Jens,

Thanks for the help.

Regards,
Radu