Elements are only converted in FO/PDF output, not HTML

Posted: Fri Oct 05, 2018 3:47 pm
by Marvin
We have DITA files which are converted to PDF, with FO as an intermediate format, using DITA OT (v. 1.8, which hasn't been updated in a while).
As a new feature, we also want to create HTML output, for which we decided to use the latest version of DITA OT (v. 3.1) and the HTML5 plugin.
Unfortunately, some elements are completely missing from the output, for example <definitionlist> elements, even though they are converted just fine for the FO/PDF output.

This is what it may look like in the source file:

Code: Select all

<definitionlist id="definitionlist_A4F5FF1D27AF4A3B8E14311EF5F9DD6D" indent="itemspace1">
  <listentry id="listentry_3189354FC62A4D31AFEB885DD57756FE">
    <dterm id="dterm_F33742B299E942A5BE0F2322077EDC70">f<sub id="sub_5400D2E1701F4B7FA54D077A7D6BEC76">B</sub>
    </dterm>
    <description id="description_1C33DC2F56D24F0395786E91B493EE37">
      <p id="p_6CA593A719F742759E3AFEE0E80C2110">bending flexibility</p>
    </description>
  </listentry>
  <listentry id="listentry_2A0FD0CC910E4E85A0CEBBEB66142CA6">
    <dterm id="dterm_56DD0B393AAA416A8846DE45E4EDE6EC">f<sub id="sub_754F91D3BDF840A4874DC069E9C0BE71">Q</sub>
    </dterm>
    <description id="description_B38471282D874F8D99EAE90575D485B6">
      <p id="p_788F290E99634B138DD3FB2051500447">shear flexibility</p>
    </description>
  </listentry>
</definitionlist>
Absolutely nothing is generated for this in the HTML output and there is no error message or warning either.
However, when I add a "class" attribute to the <definitionlist> element itself, at least a yellow error message is created in the output.
I am quite new to DITA OT and find the documentation to be very sparse, so if somebody could point me in the right direction, that would be highly appreciated.

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Mon Oct 08, 2018 11:10 am
by Radu
Hi,

Elements like "<definitionlist>" are not part of the DITA specification, so the PDF output probably works fine because you created some kind of PDF customization plugin for it.
So you will probably need to create a DITA OT plugin for the HTML5 output that adds a custom XSLT stylesheet to handle these new elements and produce the equivalent HTML content.
An example of a DITA OT plugin which customizes the HTML5 and XHTML outputs:

https://github.com/oxygenxml/dita-image-float

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Mon Oct 08, 2018 11:35 am
by Marvin
Thank you so much for the swift reply.
That's what I already suspected. But how can we catch things like this? There is absolutely nothing in the log file, even with -d and -v specified.
With other elements/attributes, there is at least a warning in the log, like this:

Code: Select all

 [gen-list] [DOTJ030I][INFO] No 'class' attribute for was found for element '<math>'. The element will be processed as an unknown or non-DITA element.
[gen-list] [DOTJ031I][INFO] No specified rule for 'parent=book' was found in the ditaval file. This value will use the default action, or a parent prop action if specified. To remove this message, you can specify a rule for 'parent=book' in the ditaval file.
How would I go about searching for the customization in the PDF generator? I have searched the whole directory tree for "definitionlist", but didn't find anything.

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Mon Oct 08, 2018 4:16 pm
by Radu
Hi Marvin,

I'm sorry but I do not know much about your current setup. Maybe you should discuss this with the people who originally built your DTD specialization.
If you place the cursor inside the "definitionlist" element, Oxygen's "Attributes" view should show you the value of its "class" attribute. That value should show which DITA element it was originally derived from. Usually, publishing without extra customization should treat the element as if it were its base element.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 12:15 am
by Marvin
Hi Radu,

The problem is the person who originally wrote most of this specialization does not work for the organisation anymore, and the other one who has worked a lot on this is currently on sick leave.
But I have been able to find some things related to <definitionlist> elements in the PDF generator source code:

Code: Select all

<!ATTLIST definitionlist			  %global-atts;  class CDATA "- topic/dl     	    ourcompany-section/definitionlist ">
<!ATTLIST listitem %global-atts; class CDATA "- topic/li ourcompany-section/listitem ">
<!ATTLIST listentry %global-atts; class CDATA "- topic/dl ourcompany-section/listentry ">
<!ATTLIST dterm %global-atts; class CDATA "- topic/dt ourcompany-section/dterm ">
<!ATTLIST description %global-atts; class CDATA "- topic/dd ourcompany-section/description ">
So apparently this just maps them to standard "topic/dl" elements...? How does it do that?

I also found some .xsl files related to definition lists, but to me it seems like they only set some formatting-related things.

Where can I find more information about DITA customization like this? What would I have to do to replicate this for the HTML output?
Any keywords I can search for will help, as I am currently quite lost.

I have already tried to replace <definitionlist> with <dl> etc., and I have also tried to simply specify the class, imitating what I have seen other places in the document (<definitionlist class="- topic/dl">), but neither had any effect on the output. It's still completely missing...

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 12:41 am
by Marvin
Ok, somehow it did work now by renaming the elements in the source document:

Code: Select all

<dl id="definitionlist_A4F5FF1D27AF4A3B8E14311EF5F9DD6D">
  <dlentry id="listentry_3189354FC62A4D31AFEB885DD57756FE">
    <dt id="dterm_F33742B299E942A5BE0F2322077EDC70">f<sub id="sub_5400D2E1701F4B7FA54D077A7D6BEC76">B</sub></dt>
    <dd id="description_1C33DC2F56D24F0395786E91B493EE37">
      <p id="p_6CA593A719F742759E3AFEE0E80C2110">bending flexibility</p>
    </dd>
  </dlentry>
  <dlentry id="listentry_2A0FD0CC910E4E85A0CEBBEB66142CA6">
    <dt id="dterm_56DD0B393AAA416A8846DE45E4EDE6EC">f<sub id="sub_754F91D3BDF840A4874DC069E9C0BE71">Q</sub></dt>
    <dd id="description_B38471282D874F8D99EAE90575D485B6">
      <p id="p_788F290E99634B138DD3FB2051500447">shear flexibility</p>
    </dd>
  </dlentry>
</dl>
I have no idea why it didn't work five minutes ago, but now I am a big step closer to getting this right.
But I would still love to understand what is really going on in the PDF generation.

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 8:39 am
by Radu
Hi,

Usually the publishing stylesheets match DITA elements based on their @class attribute value.
As an example, the XSLT stylesheet DITA-OT/plugins/org.dita.html5/xsl/topic.xsl contains the XSLT template which matches the DITA <dl> element:

Code: Select all

  <xsl:template match="*[contains(@class, ' topic/dl ')]" name="topic.dl">
Because your "definitionlist" element has the class="- topic/dl ourcompany-section/definitionlist ", the XSLT template above should match "definitionlist" as well. You can add xsl:messages inside it to see if this is the case.
The same goes for elements which are usually inside a <dl> element: you can find the XSLT templates matching them in the same place, add xsl:messages there, and see whether they are matched or not.
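As a rough sketch of what such a debugging template could look like without editing topic.xsl itself: the stylesheet below logs every element whose @class contains the "topic/dl" token and then passes control to the template that would normally handle it. The priority value and the message text are assumptions chosen for illustration, the stylesheet still has to be contributed through a plugin (for example via the "dita.xsl.html5" extension point), and xsl:next-match requires an XSLT 2.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical debug template: the priority of 100 is an arbitrary
       value picked so that this template is tried first. -->
  <xsl:template match="*[contains(@class, ' topic/dl ')]" priority="100">
    <!-- Report the element name and the class value seen by the processor. -->
    <xsl:message>DEBUG dl-like element: <xsl:value-of select="name()"/> class="<xsl:value-of select="@class"/>"</xsl:message>
    <!-- Continue with the next best matching template, so the regular
         output for the element is still produced. -->
    <xsl:next-match/>
  </xsl:template>

</xsl:stylesheet>
```

If no DEBUG line shows up for a given element, its @class value most likely does not contain the expected token at the time the HTML5 stylesheets run.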

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 11:04 am
by Marvin
Thank you. So alternatively, I could just add that class to the <definitionlist> element?
I think I tried that and it didn't work, but maybe I didn't have enough whitespace around the class.
Why are all classes specified with so much whitespace around (" topic/dl ") and what does the leading "-" mean? Why is it specified with a dash in the DTD, but without one in the XSL file?

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 11:23 am
by Radu
Hi Marvin,

So:
Marvin wrote: So alternatively, I could just add that class to the <definitionlist> element?
This is not how things usually work. The @class attribute usually has a default value coming from the DTD associated with the XML document. So your DITA DTD specialization, which added new elements like <definitionlist>, also specifies default @class values for them in the DTD. For example, if you open a topic containing <definitionlist> and switch to the Text editing mode, you can right-click the XML element's tag and choose "Go to Definition" to see where this XML element was defined in the DTD.
Marvin wrote: I think I tried that and it didn't work, but maybe I didn't have enough whitespace around the class.
It's hard to give advice without being able to reproduce the situation on my side. Try adding some xsl:messages to the XSLT template which matches "topic/dl" to see if it gets called. If it does, add xsl:messages to the templates which match "topic/dt" and "topic/dd" to see if those are called as well.
Marvin wrote: Why are all classes specified with so much whitespace around (" topic/dl ") and what does the leading "-" mean?
That whitespace is just a side effect of how the value was specified in the DTDs, but you do need at least one space between the tokens. So the minimal @class attribute value would be something like "- topic/dl ourcompany-section/definitionlist ".
That leading "-" is a convention:

https://www.oxygenxml.com/dita/1.3/spec ... ibute.html

but it does not matter much.
Marvin wrote: Why is it specified with a dash in the DTD, but without one in the XSL file?
As your element has class="- topic/dl ourcompany-section/definitionlist " and the XSLT template matches one of the tokens in the @class attribute value using the contains() function:

Code: Select all

<xsl:template match="*[contains(@class, ' topic/dl ')]" name="topic.dl">
then the xsl:template should also match your <definitionlist> specialization element and treat it as if it were a <dl>.
In a way, that's the essence of the DITA vocabulary: you add various new elements but specify in their @class attribute value that they extend some base elements. The XSLT stylesheets should then match these new elements without any changes and treat them as if they were the base elements. Or you can add new XSLT customizations to handle these new elements in a more specific way.
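To make the token matching more concrete, here is a small standalone stylesheet, meant only as an illustration, that evaluates the same contains() test against three literal class values. The first value is the DTD default quoted in this thread. The second is a hand-written class="- topic/dl" without a trailing space, which might explain why adding the attribute manually had no effect. The third shows why the surrounding spaces matter. It can be run against any XML file, since it ignores the input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- DTD default value: the ' topic/dl ' token is present. Prints "true". -->
    <xsl:value-of select="contains('- topic/dl ourcompany-section/definitionlist ', ' topic/dl ')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Hand-written value with no trailing space: no match. Prints "false". -->
    <xsl:value-of select="contains('- topic/dl', ' topic/dl ')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The spaces keep ' topic/dl ' from matching inside 'topic/dlentry'. Prints "false". -->
    <xsl:value-of select="contains('- topic/dlentry ', ' topic/dl ')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The leading "-" never takes part in the comparison, which is why the templates can omit it.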

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 12:54 pm
by Marvin
Thank you, that makes a little more sense to me now.

What would be the best way to verify that the DTD is in fact loaded correctly, with all its class mappings?
The class mappings are actually defined in a separate .mod file which is loaded like this:

Code: Select all

<!ENTITY % ourcompany-section-typemod
PUBLIC "-//OURCOMPANY//ELEMENTS DITA Section//EN"
"ourcompany.mod">
%ourcompany-section-typemod;
Somehow I feel that the PDF generator successfully loads this (it's a massive project with several custom plugins etc.), while my small HTML generator doesn't. At least that would explain why the <definitionlist> elements weren't rendered.
But I do reference the same DTD in my plugin catalog...?

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 3:55 pm
by Radu
Hi Marvin,

If the DITA OT publishing process had problems finding a DTD module while parsing the XML content, it would report errors saying that it failed to parse the topics, and the transformation would probably fail as well.
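A complementary check, sketched here under assumptions: the small stylesheet below prints each element name together with the @class value the parser supplied. The default values only appear if the XML parser actually reads the DTD and its modules, so how the stylesheet is run (for example as a transformation scenario on a single topic, with the same catalog in place) is left open and would need to match your setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Print one line per element: its name and its @class value, if any. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:value-of select="name()"/>
      <xsl:text>: </xsl:text>
      <xsl:choose>
        <xsl:when test="@class">
          <xsl:value-of select="@class"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- No default was applied, so the DTD module was probably not read. -->
          <xsl:text>(no class attribute)</xsl:text>
        </xsl:otherwise>
      </xsl:choose>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If <definitionlist> and its children are listed with the class values from the .mod file, the module was loaded and the defaults were applied.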
Let's look again at the differences between a DITA <dl> and your custom <definitionlist>:

Code: Select all

<dl>
  <dlentry>
    <dt></dt>
    <dd></dd>
  </dlentry>
</dl>
versus:

Code: Select all

<definitionlist>
  <listentry>
    <dterm/>
    <description/>
  </listentry>
</definitionlist>
and the relevant class attributes:

Code: Select all

<!ATTLIST definitionlist           %global-atts;  class CDATA "- topic/dl            ourcompany-section/definitionlist ">
<!ATTLIST listitem %global-atts; class CDATA "- topic/li ourcompany-section/listitem ">
<!ATTLIST listentry %global-atts; class CDATA "- topic/dl ourcompany-section/listentry ">
<!ATTLIST dterm %global-atts; class CDATA "- topic/dt ourcompany-section/dterm ">
<!ATTLIST description %global-atts; class CDATA "- topic/dd ourcompany-section/description ">
So the <definitionlist> has class="- topic/dl ourcompany-section/definitionlist " which means that it should be processed as a <dl>.
Now the <listentry> has class="- topic/li ourcompany-section/listitem ", but <dlentry> has the class topic/dlentry, meaning that <listentry> is not a "kind of" <dlentry>. So let's go back to the XSLT stylesheet DITA-OT\plugins\org.dita.html5\xsl\topic.xsl, which has a template like this:

Code: Select all

<xsl:template match="*[contains(@class, ' topic/dlentry ')]" name="topic.dlentry">
  <xsl:apply-templates/>
</xsl:template>
this template will not match the <listentry> because it's not an extension of <dlentry>.
So maybe you need to add an extra xsl:template that matches your listentry and does the same thing, maybe something like:

Code: Select all

<xsl:template match="*[contains(@class, ' ourcompany-section/listentry ')]" name="topic.dlentry" priority="10">
  <xsl:apply-templates/>
</xsl:template>
But this is hard to figure out as I do not have something to test on my side.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 09, 2018 5:38 pm
by Marvin
Thank you, that helped. You meant topic/dl, not topic/dlentry, right?
I think I now found the relevant code in the PDF generator:

Code: Select all

<!--Definition list-->
<xsl:template match="*[contains(@class, ' topic/dl ')]">
  <fo:table xsl:use-attribute-sets="dl">
    <xsl:call-template name="commonattributes"/>
    <xsl:apply-templates select="*[contains(@class, ' topic/dlhead ')]"/>
    <fo:table-body xsl:use-attribute-sets="dl__body">
      <xsl:choose>
        <xsl:when test="contains(@otherprops,'sortable')">
          <xsl:apply-templates select="*[contains(@class, ' topic/dlentry ')]">
            <xsl:sort select="opentopic-func:getSortString(normalize-space( opentopic-func:fetchValueableText(*[contains(@class, ' topic/dt ')]) ))" lang="{$locale}"/>
          </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
          <xsl:apply-templates select="*[contains(@class, ' topic/dlentry ')]"/>
        </xsl:otherwise>
      </xsl:choose>
    </fo:table-body>
  </fo:table>
</xsl:template>
As we don't have something like that in the HTML generator (yet), it makes sense that it wasn't rendered.
I think I might stick with manipulating the source directly in this case, I'm not a big fan of XSL, which I find difficult to read.
Thank you again!

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Wed Oct 10, 2018 10:24 am
by Radu
Hi,

It seems I was not looking at the proper class value for <listentry>. Its class value is:

Code: Select all

<!ATTLIST listentry                    %global-atts;  class CDATA "- topic/dl          ourcompany-section/listentry ">
but in my opinion it should be:

Code: Select all

<!ATTLIST listentry                    %global-atts;  class CDATA "- topic/dlentry          ourcompany-section/listentry ">
in order to derive from <dlentry>.
If you make this DTD change, there should no longer be any publishing problems, because the custom elements in the <definitionlist> would then properly extend the base elements used in the <dl>.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Wed Oct 10, 2018 8:00 pm
by Marvin
Yes, I noticed that, too. But I suspect it might have been done like this on purpose. :?
After all, it's working well in the PDF generator, so I think there must have been a reason for doing it like this.

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Mon Oct 22, 2018 2:20 pm
by Marvin
I am having the same issue with some more elements:

Code: Select all


<!ATTLIST emailaddress %global-atts; class CDATA "+ topic/data xnal-d/emailaddress ">
<!ATTLIST url %global-atts; class CDATA "+ topic/data xnal-d/url ">

<!ATTLIST section %global-atts; class CDATA "- topic/section " >
<!ATTLIST classnotation %global-atts; class CDATA "+ topic/data ourorganization-section/classnotation " >
(Copied and pasted together from several files)

From what I understand, they are all derived from topic/data, but the HTML5 plugin does not define a template for that.
How would I define a template for that and configure it in my custom plugin (derived from the HTML5 plugin)?

In our PDF generator, I have found this:

Code: Select all


    <xsl:template match="*[contains(@class, ' topic/data ')]"/>
<xsl:template match="*[contains(@class, ' topic/data ')]" mode="insert-text"/>
<xsl:template match="*[contains(@class, ' topic/data-about ')]"/>
And also this among the FO-specific files:

Code: Select all


    <!-- data-about -->
<xsl:template match="*[contains(@class,' topic/data-about ')]" mode="TEXT_ONLY">
</xsl:template>

<!-- data -->
<xsl:template match="*[contains(@class,' topic/data ')]" mode="TEXT_ONLY">
</xsl:template>
I have tried adding a "commons.xsl" to "plugins\com.ourorganisation.html5\xsl":

Code: Select all


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="*[contains(@class, ' topic/data ')]" mode="insert-text"/>

</xsl:stylesheet>
But that doesn't seem to help. I'm guessing that I have to reference that XSL file somewhere, but where?
The DITA OT documentation is pretty much nonexistent... :?

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Tue Oct 23, 2018 8:49 am
by Radu
Hi,

I do not know anything about what your "com.ourorganisation.html5" does.
If I were to customize the HTML 5 transformation, I would have a plugin similar to this:

OXYGEN_INSTALL_DIR\frameworks\dita\DITA-OT2.x\plugins\com.oxygenxml.html.embed

In the example above the "plugin.xml" uses the HTML5 extension point to contribute an extra XSLT stylesheet:

Code: Select all

<feature extension="dita.xsl.html5" value="xhtmlEmbed.xsl" type="file"/>
Once you run the DITA OT Integrator, "xhtmlEmbed.xsl" will be used with high priority, and you will be able to use it to override XSLT templates defined in the base XSLT stylesheets.
So if in your case you add an extra "commons.xsl", it can be used in two ways:

1) You add in the "plugin.xml" a new feature extension for it:

Code: Select all

<feature extension="dita.xsl.html5" value="commons.xsl" type="file"/>
2) If in the plugin.xml you already have a reference to some "custom.xsl", in that "custom.xsl" you can add an xsl:import to your "commons.xsl".
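
For option 1, a complete "plugin.xml" might look roughly like the sketch below. The plugin id is simply taken from the folder name mentioned in the question, and the "xsl/commons.xsl" path is an assumption based on the file having been placed in the plugin's "xsl" folder, so both would need to be adjusted to the real layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed plugin id, taken from the folder name plugins\com.ourorganisation.html5 -->
<plugin id="com.ourorganisation.html5">
  <!-- This plugin builds on the HTML5 transformation shipped with DITA OT. -->
  <require plugin="org.dita.html5"/>
  <!-- Contribute the custom stylesheet to the HTML5 XSLT extension point.
       The value is resolved relative to the plugin folder (assumed location). -->
  <feature extension="dita.xsl.html5" value="xsl/commons.xsl" type="file"/>
</plugin>
```

The change only takes effect after the integrator has been run again, as described above.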

So what next? In my opinion in the "commons.xsl" you could have an XSLT template looking like this:

Code: Select all

<xsl:template match="*[contains(@class, ' xnal-d/emailaddress ')]">
  <xsl:message>Matched element <xsl:copy-of select="."/></xsl:message>
  <span><xsl:apply-templates/></span>
</xsl:template>
That xsl:message will be shown in the DITA OT console view at the end of the transformation, so it will help you debug and see precisely whether your XSLT template matches anything.
I do not think you should add mode="insert-text" to the xsl:template.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Oct 25, 2018 2:49 pm
by Marvin
Thank you very much! That is exactly what we needed.
It would be so great if more documentation on stuff like this was available somewhere...

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Oct 25, 2018 3:04 pm
by Radu
Hi Marvin,

The DITA Open Toolkit documentation contains an example for adding an XSLT extension point for the XHTML output:

http://www.dita-ot.org/dev/topics/plugi ... aid-title1

We'll host the DITA Open Toolkit Day in about a week in Rotterdam:

https://www.oxygenxml.com/events/2018/dita-ot_day.html

and I'll give a presentation about customizing the DITA Open Toolkit there. After that, I will try to file some documentation issues on the DITA OT project and maybe improve its existing documentation.
We also plan to record all presentations and make them available on the Oxygenxml YouTube channel.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Oct 25, 2018 5:27 pm
by Marvin
Hi Radu,

Yes, I did find that article, but it doesn't mention that you can use this to add element templates.
It also doesn't mention where "gen-user-header" comes from (I assume that is a pre-defined extension point, but I didn't find it in any of those lists).
In fact, I didn't find a simple example where it was shown how custom element templates can be defined for HTML, I only found samples for adding custom headers and footers etc.
While the HTML plugin documentation probably is sufficient when you have done DITA-OT development for years, it's quite difficult for someone who is new to this. I am slowly beginning to understand how this stuff works, but it's really quite frustrating.

I have been thinking about going to the DITA OT Day and also the DITA Europe Conference in Rotterdam, but I wasn't sure if it would be a good fit.
It seems like most talks are either for people who have never heard of DITA before, or for very experienced users.
We already use a heavily customized DITA OT to create PDF documents from DITA XML, but we have some new team members who need to get up to speed. Like I said, I feel that we are slowly getting there, it is just a bit frustrating because I feel there isn't a lot of documentation and hands-on tutorials available (again, it's either very basic or very advanced stuff, and it's difficult to put things together for a newbie).
Going to the conference would mean I wouldn't be able to work instead, so I need to have good reasons for my superiors to allow me to go.
Your talk does sound very interesting, but it sounds like a lot to cover in just 30 minutes?
The biggest advantage I see with going is being able to connect with other DITA users.
Do you think it would make sense to go for somebody like me, possibly only to the DITA OT Day and not the DITA Europe Conference?
If so, I could talk to my manager again.
Or I could just watch the presentations on YouTube, but I usually prefer meeting in person...

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Oct 25, 2018 6:02 pm
by Radu
Hi Marvin,

I think you would really benefit from attending the DITA OT Day conference. Also it's free, you get free breakfast and lunch and you get to talk to the main DITA Open Toolkit developers and to DITA-OT consultants or to people using the DITA-OT in their company.
If you manage to obtain permission to come, you can register here:

https://www.oxygenxml.com/events/2018/dita-ot_day.html

but even if we close the registration you can still come because we always plan for a couple of extra people showing up.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Wed Jan 16, 2019 4:35 pm
by Marvin
Unfortunately I didn't make it to the conference, but now the issue from the beginning is back:

In a preprocessing step, all our <definitionlist> elements are renamed to <dl>, including the child elements (dlentry/dt/dd).
This works fine, the HTML5 plugin usually renders some output.

However, some of these <dl> elements are completely missing from the HTML output.
Nothing is rendered for them at all. Only some elements are affected, and I haven't been able to figure out the pattern.
There is nothing about this in the DITA log either, even though verbose debugging is turned on.

Any idea how I could find the cause for this?

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Jan 17, 2019 3:28 pm
by Radu
Hi Marvin,

If I were you, I would identify such a case and the topic which contains the missing dl, then create a very small DITA project with just that topic and no outside links, and try to publish it. If the problem persists, try adding some xsl:messages to the XSLT template which matches the <definitionlist> elements and see how the processing goes.
If at some point you come up with a sample DITA project to test with, you can send it to support@oxygenxml.com along with some details and I could try to find some time to look into this.

Regards,
Radu

Re: Elements are only converted in FO/PDF output, not HTML

Posted: Thu Jan 17, 2019 11:44 pm
by Marvin
Thank you, Radu.
Like I said, I went without the template at first and simply renamed our <definitionlist> elements to match the DITA standard - but then some elements simply would not be rendered.

I have now created a custom template and it looks like all elements are rendered.
Not quite the way we want them yet, but at least the mystical error has disappeared...