CDATA with extended characters

rdelong
Posts: 61

Wed Sep 06, 2017 9:15 pm

I have a document as follows that I'm converting to HTML:

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd">
<reference id="reference_zmz_3md_dbb">
    <title>Documentation Library</title>
    <shortdesc/>
    <refbody>
        <section id="section_yts_4md_dbb">
            <p><![CDATA[<?php include_once '/includes/document-list.php'; ?>]]></p>
        </section>
    </refbody>
</reference>


However, the <?php code ends up like this in the output:

Code: Select all

        ...
        <div class="section">
            <p class="p">&lt;?php include_once '/includes/document-list.php'; ?&gt;</p>
        </div>
        ...


How can I get &lt; and &gt; to render as < and >, respectively?
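
For context, the escaping shown above is expected XML behavior: a CDATA section is only another way of writing character data, so the stylesheet receives a plain text node and the serializer escapes its markup characters on output. A minimal sketch in Python, using the standard library's xml.etree.ElementTree as a stand-in for the DITA-OT XSLT pipeline (the stand-in is an assumption for illustration; DITA-OT itself runs Saxon):

```python
import xml.etree.ElementTree as ET

# The same paragraph as in the topic above, reduced to a single element.
SOURCE = "<p><![CDATA[<?php include_once '/includes/document-list.php'; ?>]]></p>"

p = ET.fromstring(SOURCE)

# After parsing, the CDATA markers are gone and only plain text remains.
text = p.text

# Serializing that text node escapes the markup characters again.
serialized = ET.tostring(p, encoding="unicode")

print(text)
print(serialized)
```

The parsed text is the literal PHP string, while the serialized form carries the entity references, which matches what the HTML output shows.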
Radu
Posts: 5578

Re: CDATA with extended characters

Thu Sep 07, 2017 11:46 am

Hi,

If you want to embed a piece of HTML in a DITA topic and have it pass through to the output HTML unchanged, there is a plugin already included in the DITA Open Toolkit publishing engines bundled with Oxygen. You can see how to encode your DITA content so that the plugin is activated here:

https://github.com/oxygenxml/dita-embed-html

It relies on a specific @outputclass value set on a <foreign> element that contains a CDATA section with the HTML content inside.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
rdelong
Posts: 61

Re: CDATA with extended characters

Thu Sep 07, 2017 9:25 pm

Thanks for pointing out the html-embed plugin. I tried it and it failed because the code that I'm inserting in the CDATA is not well formed.

Here's the error that I get:

Code: Select all

dita.inner.topics.xhtml:
     [xslt] Transforming into C:\_working\build\DITA-OT-1.8.5\out\en\xhtml\set_up_asr_map
     [xslt] Processing C:\_working\build\DITA-OT-1.8.5\temp\temp20170907110422282\fscli_reference\DocumentationLibrary.dita to C:\_working\build\DITA-OT-1.8.5\out\en\xhtml\set_up_asr_map\DocumentationLibrary.html
     [xslt] Loading stylesheet C:\_working\build\DITA-OT-1.8.5\plugins\org.dita.xhtml\xsl\dita2xhtml.xsl
     [xslt] Unknown file: Fatal Error! Error reported by XML parser Cause: org.xml.sax.SAXParseException; Premature end of file.
     [xslt] C:\_working\build\DITA-OT-1.8.5\plugins\org.dita.xhtml\xsl\xslhtml\dita2htmlImpl.xsl:522: Fatal Error! net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; Premature end of file. Cause: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; Premature end of file.
     [xslt] Failed to process null


When I use the text as suggested in the plugin README, the output is fine, so I know that the plugin is working.

Code: Select all

        <section id="section_yts_4md_dbb">
            <p><foreign outputclass="html-embed"><![CDATA[<div><b>bold</b> and <i>italic</i></div>]]></foreign></p>
        </section>


I'm using DITA-OT 1.8.5. Any ideas for a resolution?
Radu
Posts: 5578

Re: CDATA with extended characters

Fri Sep 08, 2017 8:48 am

Hi,

The code that you are inserting in the CDATA needs to be well-formed.
If you have a problem making it well-formed, maybe you can post some sample code.

Regards,
Radu
rdelong
Posts: 61

Re: CDATA with extended characters

Fri Sep 08, 2017 10:26 am

Thanks
The OP contains the code that I want to use.

Code: Select all

<![CDATA[<?php include_once '/includes/document-list.php'; ?>]]>
Radu
Posts: 5578

Re: CDATA with extended characters

Fri Sep 08, 2017 10:49 am

Hi,

You can either wrap the processing instruction in a <span> inside the CDATA to make it well-formed XML, or you can change the XSLT stylesheet:

OXYGEN_INSTALL_DIR\frameworks\dita\DITA-OT2.x\plugins\com.oxygenxml.html.embed\xhtmlEmbed.xsl

and replace this line in it:

Code: Select all

<xsl:copy-of select="saxon:parse(text())"/>


with this one:

Code: Select all

<xsl:copy-of select="saxon:parse(concat('&lt;root>', text(), '&lt;/root>'))/*/node()"/>


I will make this change in the XSLTs distributed by default with Oxygen, because it is more robust and also allows you to embed multiple sibling HTML elements in the CDATA.

Regards,
Radu
rdelong
Posts: 61

Re: CDATA with extended characters

Fri Sep 08, 2017 7:06 pm

This is awesome and works great! Thanks! This is going to be helpful for our documentation workflow.

I'm curious how the <root> wrapper works, since it doesn't show up in the output (just an observation).
Radu
Posts: 5578

Re: CDATA with extended characters

Mon Sep 11, 2017 7:47 am

Hi,

Wrapping everything in the <root> element makes the entire XML string well-formed. Applying saxon:parse to it returns a document node, and the /*/node() XPath expression appended to the call then selects only the child nodes of <root>, so the wrapper element itself is never copied to the output.
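
The effect of the wrapper can be reproduced outside XSLT. What follows is a minimal sketch in Python, using the standard library's xml.dom.minidom in place of saxon:parse (an assumption for illustration only; the helper names is_well_formed and parse_fragment are hypothetical and not part of the plugin). It suggests why the bare fragment is rejected by the parser, while the wrapped one parses and yields only the children of <root>:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text):
    """Return True if text parses as a complete XML document on its own."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


def parse_fragment(fragment):
    """Wrap fragment in a throwaway <root>, parse it, return the children.

    Mirrors the idea of saxon:parse(concat('<root>', text(), '</root>'))/*/node():
    the wrapper makes the string parseable, and selecting the child nodes
    drops the wrapper again.
    """
    doc = minidom.parseString("<root>" + fragment + "</root>")
    return [node.toxml() for node in doc.documentElement.childNodes]


php = "<?php include_once '/includes/document-list.php'; ?>"
siblings = "<b>bold</b> and <i>italic</i>"

print(is_well_formed(php))       # a lone processing instruction has no root element
print(is_well_formed(siblings))  # more than one top-level node
print(parse_fragment(php))
print(parse_fragment(siblings))
```

Neither fragment is a document by itself, but both parse once wrapped, and the returned list never contains the <root> element.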

Regards,
Radu
