Apache FOP/Saxon: Content is not allowed in prolog.

Post by amix »

Hi,
this is my first XSL-FO try and this is what I get:

Code:

[Apache FOP] The process 'Apache FOP' ended with code: 1. The error was:
Transformer is net.sf.saxon.IdentityTransformer@1786286
Error on line 5 column 9 of sample.html_xslt:
  SXXP0003: Error reported by XML parser: Content is not allowed in prolog.
ERROR - Exception
net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.fop.cli.InputHandlerFOP.transformTo(Unknown Source)
    at org.apache.fop.cli.InputHandlerFOP.renderTo(Unknown Source)
    at org.apache.fop.cli.Main.startFOP(Main.java:166)
    at org.apache.fop.cli.Main.main(Main.java:197)
Caused by: net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:417)
    at net.sf.saxon.event.Sender.send(Sender.java:156)
    at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:32)
    ... 4 more
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScannerXerces.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:397)
    ... 6 more

---------

net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:417)
    at net.sf.saxon.event.Sender.send(Sender.java:156)
    at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:32)
    at org.apache.fop.cli.InputHandlerFOP.transformTo(Unknown Source)
    at org.apache.fop.cli.InputHandlerFOP.renderTo(Unknown Source)
    at org.apache.fop.cli.Main.startFOP(Main.java:166)
    at org.apache.fop.cli.Main.main(Main.java:197)
Caused by: org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScannerXerces.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:397)
    ... 6 more

---------

org.xml.sax.SAXParseException: Content is not allowed in prolog.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScannerXerces.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:397)
    at net.sf.saxon.event.Sender.send(Sender.java:156)
    at net.sf.saxon.IdentityTransformer.transform(IdentityTransformer.java:32)
    at org.apache.fop.cli.InputHandlerFOP.transformTo(Unknown Source)
    at org.apache.fop.cli.InputHandlerFOP.renderTo(Unknown Source)
    at org.apache.fop.cli.Main.startFOP(Main.java:166)
    at org.apache.fop.cli.Main.main(Main.java:197)

The location is reported as "line -1", so I assume the error occurs before the first character of my file. I checked for illegal characters with

Code:

od -cx <filename>
and this is the output for the first line:

Code:

0000000    <   ?   x   m   l       v   e   r   s   i   o   n   =   "   1
3f3c 6d78 206c 6576 7372 6f69 3d6e 3122
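
The same check can also be scripted. Below is a minimal sketch in Python of what the od inspection looks for: a byte-order mark, whitespace, or other content in front of the first '<'. The function name inspect_start and its return strings are made up for illustration. A byte-order mark is legal in XML, but it is a common culprit for this error when a tool mishandles it.

```python
import codecs

# Byte-order marks that may precede an XML declaration.
BOMS = {
    codecs.BOM_UTF8: "UTF-8",
    codecs.BOM_UTF16_LE: "UTF-16 LE",
    codecs.BOM_UTF16_BE: "UTF-16 BE",
}


def inspect_start(data: bytes) -> str:
    """Describe what sits in front of the first '<' of an XML byte string.

    Hypothetical helper for illustration; not part of FOP, Saxon, or Oxygen.
    """
    for bom, name in BOMS.items():
        if data.startswith(bom):
            return f"{name} byte-order mark"
    stripped = data.lstrip(b" \t\r\n")
    if not stripped:
        return "empty"
    if stripped.startswith(b"<"):
        if len(stripped) == len(data):
            return "starts with '<'"
        return "whitespace before '<'"
    return "stray content before '<'"
```

To run it against a file, read the first few bytes in binary mode, for example inspect_start(open("sample.html", "rb").read(64)), where the file name is only a placeholder. Note that a clean result for the input file says nothing about the intermediate FO output that FOP actually parses.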
I am using the Eclipse plugin in Eclipse Helios (Build id: 20100618-0524) on OS X 10.6.4, with Saxon 6.5.5 and no special options for the transform.

The input file is quirks-mode HTML. It has several non-standard attributes on some elements, so I created my own XSD via "Learn Document Structure" -> "Save DTD" -> "Convert DTD to XSD", and the file validates against it without errors.

Am I doing something wrong?
Andreas

Re: Apache FOP/Saxon: Content is not allowed in prolog.

Post by Radu »

Hi Andreas,

There are two steps in getting a PDF output from an XML file:
1) The original XML (your HTML file) must be transformed with an XSL stylesheet to FO, which is an XML vocabulary understood by the FO processor (Apache FOP in this case).
2) The FO Processor interprets the FO file and builds the PDF.

Usually an Oxygen transformation scenario combines these steps by allowing you to perform FO processing on the temporary output generated by applying the XSLT stylesheet.
In your case there seems to be a problem in the generated FO file which prevents Apache FOP from building the PDF.

So you have to split the transformation process into two stages to see the problem more clearly:
1) Create a transformation scenario which applies the stylesheet to the XML and saves the result as a file called result.fo
2) Open result.fo in Oxygen and try to validate it. Its XML structure should look something like this:

Code:


<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:xep="http://www.renderx.com/XEP/xep" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<fo:layout-master-set>
..........................
Probably in your case the FO file was not correctly generated by the XSL stylesheet.
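
If you want to check the intermediate file outside the editor, here is a minimal sketch in Python using only the standard library. It tests two things: that the text is well-formed XML, and that the root element is fo:root in the XSL-FO namespace shown above. The function name check_fo and its return strings are assumptions for illustration, and this is not a substitute for validating the FO in Oxygen.

```python
import xml.etree.ElementTree as ET

# XSL-FO namespace, as used on the fo:root element.
FO_NS = "http://www.w3.org/1999/XSL/Format"


def check_fo(text: str) -> str:
    """Report whether `text` is well-formed XML with an fo:root root element.

    Hypothetical helper for illustration only.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple reported by the parser.
        line, column = err.position
        return f"not well-formed (line {line}, column {column})"
    if root.tag != f"{{{FO_NS}}}root":
        return f"unexpected root element: {root.tag}"
    return "ok"
```

For example, check_fo(open("result.fo", encoding="utf-8").read()) would be one way to run it on the saved output, assuming the file is UTF-8 encoded.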

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

Re: Apache FOP/Saxon: Content is not allowed in prolog.

Post by amix »

Thanks! Found it :-) I could have known it myself (but didn't). Since my XSL was not finished yet (it was a first test run), some text() nodes of the input file ended up in the output as plain text, outside of any XML elements. Duh! :oops:
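
For anyone hitting the same thing: one plausible mechanism, assuming the leak came from XSLT's built-in template rules, is that elements without a matching template are simply walked through and their text nodes are copied to the output. The following is a toy model of that behaviour in Python, not a real XSLT processor; the transform function, the templates dictionary, and the sample document are all made up for illustration.

```python
import xml.etree.ElementTree as ET

FO_ROOT = '<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"/>'


def transform(node, templates):
    """Toy model of XSLT's built-in rules (illustration only)."""
    if node.tag in templates:
        # A template matches this element: use its output.
        return templates[node.tag](node)
    # No template matches: copy text through and recurse into the children.
    out = node.text or ""
    for child in node:
        out += transform(child, templates) + (child.tail or "")
    return out


doc = ET.fromstring(
    "<html><head><title>Sample</title></head><body><p>Hello</p></body></html>"
)

# Unfinished "stylesheet": only <body> is handled, so the title text leaks out.
templates = {"body": lambda node: FO_ROOT}
result = transform(doc, templates)

# Handling <head> explicitly (here: emitting nothing) removes the stray text.
templates["head"] = lambda node: ""
fixed = transform(doc, templates)
```

In this sketch, result begins with the stray text "Sample" in front of the fo:root element, which is exactly the kind of output an XML parser rejects, while fixed contains only the fo:root element.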
Andreas