CDATA with extended characters

rdelong
Posts: 72
Joined: Tue Oct 21, 2014 10:01 pm

Post by rdelong »

I have a document as follows that I'm converting to HTML:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<reference id="reference_zmz_3md_dbb">
<title>Documentation Library</title>
<shortdesc/>
<refbody>
<section id="section_yts_4md_dbb">
<p><![CDATA[<?php include_once '/includes/document-list.php'; ?>]]></p>
</section>
</refbody>
</reference>
However, the <?php code ends up like this:

Code: Select all

        ...
<div class="section">
<p class="p">&lt;?php include_once '/includes/document-list.php'; ?&gt;</p>
</div>
...
How can I get &lt; and &gt; to render as < and >, respectively?
Radu
Posts: 9057
Joined: Fri Jul 09, 2004 5:18 pm

Re: CDATA with extended characters

Post by Radu »

Hi,

If you want to embed a piece of HTML in a DITA topic and have it passed through unchanged to the HTML output, a plugin for this is already included in the DITA Open Toolkit publishing engines bundled with Oxygen. Its README shows how to encode your DITA content so that the plugin is activated:

https://github.com/oxygenxml/dita-embed-html

It relies on a specific @outputclass value set on a <foreign> element that contains a CDATA section with the HTML content inside.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
rdelong

Re: CDATA with extended characters

Post by rdelong »

Thanks for pointing out the html-embed plugin. I tried it, and it failed because the code that I'm inserting in the CDATA is not well-formed.

Here's the error that I get:

Code: Select all


dita.inner.topics.xhtml:
[xslt] Transforming into C:\_working\build\DITA-OT-1.8.5\out\en\xhtml\set_up_asr_map
[xslt] Processing C:\_working\build\DITA-OT-1.8.5\temp\temp20170907110422282\fscli_reference\DocumentationLibrary.dita to C:\_working\build\DITA-OT-1.8.5\out\en\xhtml\set_up_asr_map\DocumentationLibrary.html
[xslt] Loading stylesheet C:\_working\build\DITA-OT-1.8.5\plugins\org.dita.xhtml\xsl\dita2xhtml.xsl
[xslt] Unknown file: Fatal Error! Error reported by XML parser Cause: org.xml.sax.SAXParseException; Premature end of file.
[xslt] C:\_working\build\DITA-OT-1.8.5\plugins\org.dita.xhtml\xsl\xslhtml\dita2htmlImpl.xsl:522: Fatal Error! net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; Premature end of file. Cause: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; Premature end of file.
[xslt] Failed to process null
When I use the text as suggested in the plugin README, the output is fine, so I know that the plugin is working.

Code: Select all

        <section id="section_yts_4md_dbb">
<p><foreign outputclass="html-embed"><![CDATA[<div><b>bold</b> and <i>italic</i></div>]]></foreign></p>
</section>
I'm using DITA-OT 1.8.5. Any ideas for a resolution?
Radu

Re: CDATA with extended characters

Post by Radu »

Hi,

The code that you are inserting in the CDATA needs to be well-formed XML.
If you have a problem making it well-formed, maybe you can post some sample code.

Regards,
Radu
rdelong

Re: CDATA with extended characters

Post by rdelong »

Thanks.
The original post contains the code that I want to use:

Code: Select all

<![CDATA[<?php include_once '/includes/document-list.php'; ?>]]>
Radu

Re: CDATA with extended characters

Post by Radu »

Hi,

You can either wrap the processing instruction in a <span> inside the CDATA to make it well-formed XML, or you can change the XSLT stylesheet:

OXYGEN_INSTALL_DIR\frameworks\dita\DITA ... lEmbed.xsl

and replace in it this line:

Code: Select all

<xsl:copy-of select="saxon:parse(text())"/>
with this one:

Code: Select all

<xsl:copy-of select="saxon:parse(concat('&lt;root&gt;', text(), '&lt;/root&gt;'))/*/node()"/>
I will make this change in the XSLTs distributed by default with Oxygen, as it is more robust and would also allow you to embed multiple sibling HTML elements in the CDATA.

Regards,
Radu
rdelong

Re: CDATA with extended characters

Post by rdelong »

This is awesome and works great! Thanks! This is going to be helpful for our documentation workflow.

I'm curious how the <root> wrapper works, since it doesn't show up in the output.
Radu

Re: CDATA with extended characters

Post by Radu »

Hi,

Wrapping everything in the <root> element makes the entire XML string well-formed, because a well-formed XML document needs exactly one root element. After saxon:parse is applied to it, you get a document node, and the /*/node() XPath expression appended to it selects only the contents of the <root> element, not the element itself. That is why <root> never appears in the output.
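
The same idea can be tried outside XSLT. What follows is only a rough Python sketch, not the plugin's code: xml.dom.minidom stands in for saxon:parse, and parse_fragment is a hypothetical helper name chosen for illustration. It shows that the bare processing instruction is rejected by an XML parser, while the wrapped string parses and the wrapper is dropped when only its children are returned.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parse_fragment(text):
    """Parse an XML fragment by wrapping it in a throwaway <root> element.

    Only the children of <root> are returned, which mirrors what the
    /*/node() step does in the XSLT: the wrapper never reaches the output.
    (Hypothetical helper, not part of the dita-embed-html plugin.)
    """
    doc = minidom.parseString("<root>" + text + "</root>")
    return list(doc.documentElement.childNodes)


php = "<?php include_once '/includes/document-list.php'; ?>"

# A lone processing instruction is not a well-formed document (no root element),
# so parsing it directly fails.
try:
    minidom.parseString(php)
    bare_parse_ok = True
except ExpatError:
    bare_parse_ok = False

# Wrapped in <root>, the same text parses and yields the processing instruction.
nodes = parse_fragment(php)

# The wrapper also allows several sibling nodes in one fragment.
siblings = parse_fragment("<b>bold</b> and <i>italic</i>")

print(bare_parse_ok, [n.nodeName for n in nodes], [n.nodeName for n in siblings])
```

The wrapper element name is arbitrary; any name works as long as the children are what gets copied to the output.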

Regards,
Radu