XML (DocBook 5) -> HTML (w/ CSS) -> PDF

Post by sematula »

I have a content library stored in DocBook 5 XML, and I'm using Oxygen XML Editor to apply transformation scenarios. I have a long history with HTML and CSS, and my custom XML->HTML transformation scenario is working well (i.e., the resulting HTML document looks the way I want and has the content I want).

However, when I edit the transformation scenario and enable "Perform FO Processing" on the "FO Processor" tab (Input: XSLT result as input; Method: pdf; Processor: Apache FOP), I get the following error:

DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.

The resulting HTML file has "<!DOCTYPE HTML>" at the beginning of the document (before the opening <html> tag). I've tried preventing this by adding the following to my XSL file:

<xsl:stylesheet>
	<xsl:output method='html' omit-xml-declaration='yes' indent='yes' doctype-public='' doctype-system=''/>
</xsl:stylesheet>

But the DOCTYPE declaration persists. I've also edited my fop.xconf to include:

  <useragent>

      <!-- Set features for the XML parser -->
      <parser>
          <features>
              <!-- Allow DOCTYPE declarations -->
              <feature name="http://apache.org/xml/features/disallow-doctype-decl" value="false"/>
          </features>
      </parser>

  </useragent>
  
But that also has no effect on the outcome (i.e., I still get the same error). Thoughts and suggestions are appreciated.

Re: XML (DocBook 5) -> HTML (w/ CSS) -> PDF

Post by Radu »

Hi,
I'm sorry we missed your initial post. We'll look into whether the problem stems from a setting we apply to the XML parser. In the meantime, you could try switching the output method in the XSLT stylesheet to xhtml, so that the DOCTYPE declaration is not serialized:

<xsl:output method='xhtml' omit-xml-declaration='yes' indent='yes' doctype-public='' doctype-system=''/>

As for what you are trying to achieve: the "Perform FO Processing" checkbox applies FO processing only to an XSL-FO document, so it cannot be used on an HTML document. To obtain PDF output directly from XML or from HTML, you would need to use our Oxygen Chemistry processor instead: https://www.oxygenxml.com/doc/versions/ ... ation.html
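
To illustrate what "an XSL-FO document" means here, below is a minimal sketch of a stylesheet that turns DocBook 5 into XSL-FO rather than HTML. It is only an assumption-laden example, not the stylesheet Oxygen or the DocBook XSL distribution uses: the page master name `A4`, the page dimensions, the font settings, and the choice to handle only `title` and `para` are all made up for illustration. The point is that the XSLT result must be XML with `fo:root` as its root element for FOP to accept it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: DocBook 5 -> XSL-FO, handling just title and para. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    xmlns:db="http://docbook.org/ns/docbook"
    exclude-result-prefixes="db">

  <!-- Serialize as XML; an FO processor cannot consume HTML output -->
  <xsl:output method="xml" encoding="UTF-8"/>

  <!-- Document root: one page master (hypothetical name "A4") and one page sequence -->
  <xsl:template match="/">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="A4"
            page-width="210mm" page-height="297mm" margin="20mm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="A4">
        <fo:flow flow-name="xsl-region-body">
          <!-- Wrapper block so that any unmatched text still ends up inside a block -->
          <fo:block font-family="serif" font-size="11pt">
            <xsl:apply-templates/>
          </fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- DocBook titles become bold blocks -->
  <xsl:template match="db:title">
    <fo:block font-size="16pt" font-weight="bold" space-after="8pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- DocBook paragraphs become plain blocks -->
  <xsl:template match="db:para">
    <fo:block space-after="6pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>
</xsl:stylesheet>
```

With a stylesheet along these lines as the XSLT in the scenario, "XSLT result as input" would hand FOP an XSL-FO document instead of HTML. Everything not matched above falls through to the built-in templates and is emitted as plain text, so a real DocBook-to-PDF setup would more likely start from the FO stylesheets in the DocBook XSL distribution than from a hand-written sketch like this one.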
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com