DITA-OT HTML5 processing in PDF Chemistry

Posted: Sat Aug 01, 2020 5:45 pm
by chrispitude
PDF Chemistry uses the HTML5 output from the DITA-OT as a starting point for its own publishing process. And I really like that when I create my own HTML5 plugins to customize labels and structures, PDF Chemistry picks up the changes!

But the standard DITA-OT HTML5 transformation creates an entire directory structure, whereas PDF Chemistry generates a single merged file. So this makes me wonder:
  • How does PDF Chemistry create a merged file instead of a directory structure?
  • Are there preprocess/HTML5 XSLT templates I shouldn't touch, because they could break something PDF Chemistry does differently in its own HTML5 creation?
  • Are there any steps or templates that PDF Chemistry omits?
For example, I see that PDF Chemistry has DITA-specialized attributes in its merged HTML5 file to enable CSS styling, whereas the standard HTML5 output has these removed.

Thanks!

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Thu Aug 06, 2020 5:51 pm
by julien_lacour
Hello Chris,

In fact the PDF Chemistry transformation is based on the PDF XSL-FO transformation but uses stylesheets from the HTML5 DITA-OT plugin during the pre-process. You can see this workflow inside the DITA-OT-DIR/plugins/com.oxygenxml.pdf.css/build.xml file.
  • The directory structure is specific to our transformation and is inspired by the pdf2 plugin (as I explained above).
  • Most of the HTML5 XSLT templates are used by the pdf-css-html5 plugin (the one using Chemistry), so potentially each modification in these stylesheets will impact the PDF output (and possibly also the WebHelp Responsive output).
  • Some of the templates are modified from the HTML5 base; the modified versions are inside DITA-OT-DIR/plugins/com.oxygenxml.pdf.css/xsl
You will also see the html5 stylesheet import inside DITA-OT-DIR/plugins/com.oxygenxml.pdf.css/xsl/merged2html5/html5.xsl (it is the parent stylesheet called to create the .merged.html file)

Regards,
Julien

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Sat Aug 22, 2020 4:02 pm
by chrispitude
Julien, this is exactly the information I needed - thank you!

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Wed Sep 23, 2020 10:15 pm
by chrispitude
For certain cross-references, we have a special title-only link text format specified with @outputclass:

Code:

<xref href="#tableid" outputclass="title"/>
We then have code applied by the dita.xsl.topicpull extension point to implement it:

Code:

  <!-- a cross-reference with outputclass=='title' always uses the target title text -->
  <xsl:template match="*[contains(@class, ' topic/table ') or contains(@class, ' topic/fig ')]"
                mode="topicpull:resolvelinktext" priority="20">
    <xsl:param name="linkElement" as="element()" tunnel="yes"/>
    <xsl:choose>

      <!-- business as usual if @outputclass != 'title' -->
      <xsl:when test="not($linkElement[contains(@outputclass, 'title')])">
        <xsl:next-match/>
      </xsl:when>

      <!-- force the title as the target text -->
      <xsl:otherwise>
        <xsl:variable name="target-text" as="xs:string*">
          <xsl:apply-templates
            select="*[contains(@class, ' topic/title ')]" mode="text-only"/>
        </xsl:variable>
        <xsl:value-of select="normalize-space(string-join($target-text, ''))"/>
      </xsl:otherwise>

    </xsl:choose>
  </xsl:template>
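For context, a stylesheet like the one above is hooked into preprocessing through the plugin descriptor. The following plugin.xml is only a minimal sketch: the plugin id and the stylesheet file name are made up for illustration, and only the dita.xsl.topicpull extension point name comes from this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical plugin descriptor; the id and file path are placeholders. -->
<plugin id="com.example.xref.title">
  <!-- Contribute the stylesheet holding the template above to the topicpull step. -->
  <feature extension="dita.xsl.topicpull" file="xsl/topicpull-title.xsl"/>
</plugin>
```

The contributed stylesheet has to declare the topicpull and xs namespace prefixes itself, since it is imported as a standalone file.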
PDF Chemistry ignores my mode="topicpull:resolvelinktext" template; I do not see its result in the merged XML produced by the merged2merged stage, and I see that PDF Chemistry has its own mode="insertReferenceTitle" target text machinery instead at

<oxygen>/frameworks/dita/DITA-OT3.x/plugins/com.oxygenxml.pdf.css/xsl/merged2merged/merged-links.xsl

Now here's where I am confused. In my plugin, if I extend com.oxygenxml.pdf.css.xsl.merged2merged and add the following simple template, then *my* mode="topicpull:resolvelinktext" code is used!

Code:

<xsl:template match="*" mode="insertReferenceTitle" priority="100">
  <xsl:next-match/>
</xsl:template>
But if I comment that template out, then PDF Chemistry's mode="insertReferenceTitle" code is used instead. The template above shouldn't change anything; it should simply fall through to the next template. But its presence or absence makes my template get applied or not applied, respectively.

What would cause this?

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Thu Sep 24, 2020 12:53 pm
by Dan
I am not sure why this happens.

You can try debugging the XSLT transformation with Oxygen. It is a bit complicated to set up, but here is how it can be done.
The file you need to debug is:

DITA-OT/plugins/com.oxygenxml.pdf.css/xsl/merged2merged/merged.xsl

First, make a DITA transformation with the parameter "clean.temp" set to "no".
Then, in Oxygen XML Editor:

1. Open the stage1.xml file from the temporary directory.
2. Open the merged.xsl file and comment out the line that imports the template extension point: <xsl:import href="template:xsl/com.oxygenxml.pdf.css.xsl.merged2merged"/>
3. In the merged-all.xsl file from the same directory, comment out: <xsl:import href="../review/review-pis-to-elements.xsl"/>
4. Validate the merged.xsl file and fix the errors by removing the calls to missing templates, such as <xsl:call-template name="add-review-pis-for-root"/>, and by replacing calls to the function getImage(size) with zero.
5. In merged-all.xsl, comment out the include of merged-tables.xsl.
6. Run a transformation of stage1.xml with merged.xsl; it should complete without errors.
7. Now run the same transformation in the XSLT debugger.
7. Now make the transformation in the XSLT debugger.

Place breakpoints in your template and watch what the call stack looks like.

Many regards,
Dan

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Mon Sep 28, 2020 5:34 pm
by chrispitude
Hi Dan,

Many thanks for your instructions! I will likely need them at some point to figure something out.

But fortunately, I found my issue through code inspection. I realized that preprocess runs first, and merged2merged runs later and may replace the link text that preprocess generated. My issue was that my <xsl:next-match/> override of the mode="insertReferenceTitle" template did not pass the parameters down to the next-matching template, so that template fell through and left the previous content from the mode="topicpull:resolvelinktext" template in place.
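To illustrate the parameter issue in isolation: in XSLT 2.0, <xsl:next-match/> forwards tunnel parameters automatically, but ordinary parameters reach the next template only if they are re-passed with <xsl:with-param>. The stylesheet below is a standalone sketch, not PDF Chemistry code; the "demo" mode and the "label" parameter are hypothetical names chosen for the example.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Stands in for the lower-priority template that needs the parameter. -->
  <xsl:template match="*" mode="demo" priority="1">
    <xsl:param name="label" as="xs:string" select="'(no label passed)'"/>
    <out><xsl:value-of select="$label"/></out>
  </xsl:template>

  <!-- Pass-through override: declares the non-tunnel parameter and forwards it. -->
  <xsl:template match="*" mode="demo" priority="100">
    <xsl:param name="label" as="xs:string" select="'(no label passed)'"/>
    <xsl:next-match>
      <!-- Without this xsl:with-param, the template above falls back to its default. -->
      <xsl:with-param name="label" select="$label"/>
    </xsl:next-match>
  </xsl:template>

  <!-- Entry point: the caller supplies the parameter once. -->
  <xsl:template match="/">
    <xsl:apply-templates select="*" mode="demo">
      <xsl:with-param name="label" select="'from caller'"/>
    </xsl:apply-templates>
  </xsl:template>

</xsl:stylesheet>
```

Removing the inner <xsl:with-param> reproduces the kind of behavior I saw: the lower-priority template still runs, but without the value the caller supplied.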

And also during code inspection, I noticed that the mode="insertReferenceTitle" template leaves the existing content unmodified if there is a DITA-OT "usertext" PI in place. So, I updated my mode="topicpull:resolvelinktext" template to set this:

Code:

  <!-- add usertext PI for xrefs with outputclass='title' -->
  <xsl:template match="*[contains(@outputclass, 'title')]" mode="topicpull:getlinktext">
    <xsl:param name="targetElement" as="element()"/>
    <xsl:apply-templates select="." mode="topicpull:add-usertext-PI"/>  <!-- I had to add this part -->
    <xsl:next-match>
      <xsl:with-param name="targetElement" as="element()" select="$targetElement"/>
    </xsl:next-match>
  </xsl:template>
and now PDF Chemistry leaves the preprocess-derived text in place! And I did not have to use any PDF Chemistry extension points after all.
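As a rough illustration of the mechanism, a later stage can check for that PI before regenerating link text. The stylesheet below is my own simplified sketch, not the actual PDF Chemistry template: it assumes the PI is written as <?ditaot usertext?> inside the xref, matches unqualified xref elements instead of DITA class attributes, and uses a placeholder string where the real code would build the link text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy everything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An xref marked with the usertext PI keeps its existing content untouched. -->
  <xsl:template match="xref[processing-instruction('ditaot') = 'usertext']" priority="10">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Any other xref gets regenerated link text (placeholder for illustration). -->
  <xsl:template match="xref">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:text>[generated link text]</xsl:text>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```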

Re: DITA-OT HTML5 processing in PDF Chemistry

Posted: Tue Sep 29, 2020 9:17 am
by Dan
Such a complex situation... Glad you worked it out!