PDF/A with FOP

Marc R.
Posts: 7
Joined: Fri Aug 31, 2007 4:15 pm
Location: France

Post by Marc R. »

Hello,

I'm looking into creating PDF/A-1b files from FOP via oXygen.

So far, it looks like I have succeeded in creating the font metrics files for the base 14 fonts to embed, and have properly set up the user FOP config file.

I have two questions:
1- How do I pass the argument to FOP via oXygen so that it produces PDF/A-1b files? Apparently it is feasible in Java (see: http://xmlgraphics.apache.org/fop/0.93/pdfa.html ), but I couldn't find it in oXygen (I looked in Options>...>FO processor and also in the "Configure Transformation Scenario" tool, under the FO PDF section).
2- How do I add the needed metadata to the generated files? I could only add a few, like title or creator, using definitions in the XSL-FO file (Dublin Core and XMP properties).

Any help would be appreciated!
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

Hello,

In the current version it is not possible to generate PDF/A-1b files with FOP in oXygen because the parameter "-pdfprofile PDF/A-1b" cannot yet be set for FOP in oXygen. We will allow setting this parameter for FOP in a future version of oXygen.
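
On the second question, Apache FOP reads XMP metadata placed in an fo:declarations element of the XSL-FO file. The fragment below is only a minimal sketch of that pattern: all property values are placeholders, and which properties a given FOP version actually carries over into the PDF should be checked against the FOP metadata documentation for that version.

```xml
<!-- Child of fo:root, placed after fo:layout-master-set and before the first fo:page-sequence. -->
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <x:xmpmeta xmlns:x="adobe:ns:meta/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <!-- Dublin Core properties (placeholder values) -->
      <rdf:Description rdf:about=""
          xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title>Placeholder title</dc:title>
        <dc:creator>Placeholder author</dc:creator>
        <dc:description>Placeholder subject</dc:description>
      </rdf:Description>
      <!-- XMP basic properties (placeholder values) -->
      <rdf:Description rdf:about=""
          xmlns:xmp="http://ns.adobe.com/xap/1.0/">
        <xmp:CreatorTool>Placeholder tool name</xmp:CreatorTool>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>
```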


Regards,
Sorin

Post by Marc R. »

ok, thanks!

Post by sorin_ristache »

Until we add the -pdfprofile parameter to the built-in FOP processor, you can define an external FO processor based on the same jar files as the built-in one and add the parameter to the command line of the external processor. For example:

Code:

"C:\Program Files\Java\jre1.6.0_01\bin\java" -Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl -Djavax.xml.parsers.SAXParserFactory=org.apache.xerces.jaxp.SAXParserFactoryImpl -Dorg.xml.sax.driver=org.apache.xerces.parsers.SAXParser -cp "C:\Program Files\Oxygen XML Editor 8.2\lib\xml-apis.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\xercesImpl.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\fop.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\avalon-framework-4.2.0.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\batik-all-1.6.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\commons-io-1.3.1.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\xmlgraphics-commons-1.2.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\commons-logging-1.0.4.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\saxon8.jar;C:\Program Files\Oxygen XML Editor 8.2\lib\saxon8-dom.jar" org.apache.fop.cli.Main -fo ${fo} -${method} ${out} -pdfprofile PDF/A-1b
Edit the transformation scenario that must generate PDF/A-1b documents and select this external processor in the Processor combo box on the FO Processor tab of the transformation scenario edit dialog.

Please note that you must embed in the PDF document all the fonts needed to display it. This applies to custom fonts and also to the default FOP fonts (Helvetica, Times Roman). You must do this in a FOP configuration file set in Options -> Preferences -> XML -> XSLT/FO/XQuery -> FO Processors -- Enable the output of the builtin FOP. If you do not embed a font needed in the PDF document, Apache FOP issues the error:
For PDF/A-1b, all fonts, even the base 14 fonts, have to be embedded
This is not the only restriction for PDF/A-1b documents. With a correct configuration file FOP is able to generate PDF/A-1b documents.
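
To illustrate the font embedding described above, here is a minimal sketch of a FOP user configuration file, assuming the configuration format of the FOP 0.9x series. The paths and file names are hypothetical placeholders, and registering an embeddable TrueType font under the name "Helvetica" is only one possible way to satisfy the requirement, not the configuration used in this thread.

```xml
<?xml version="1.0"?>
<!-- Sketch of a FOP user configuration file; all paths below are hypothetical. -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- metrics-url points to the XML metrics file generated from the font.
             embed-url points to the font file itself; FOP embeds the font when it is given. -->
        <font metrics-url="file:///C:/fop-fonts/myfont.xml"
              kerning="yes"
              embed-url="file:///C:/fop-fonts/myfont.ttf">
          <!-- Register the font under the family name that the XSL-FO file requests. -->
          <font-triplet name="Helvetica" style="normal" weight="normal"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

Each style and weight combination used in the document (bold, italic and so on) would need its own font entry with a matching font-triplet, otherwise FOP may fall back to a font that is not embedded.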


Regards,
Sorin

Post by Marc R. »

Hello again,

I had just come to the same idea, and as I had trouble with the syntax, I actually wrote a batch file (and even set it to point to FOP 0.94 instead of the bundled 0.93).
But thanks a lot for the tip, it will be useful!

I knew about the restrictions on PDF/A-1b, and I did succeed in embedding the fonts (I generated the font metrics using free fonts from OpenOffice...). However, I am now stumbling on a blocking exception. (There are also non-blocking warnings about "Line 1 of a paragraph overflows the available area.", but that is off topic.) The exception seems to be caused by the XSL-FO, as the starting point is Xalan:

Code:

javax.xml.transform.TransformerException: java.lang.IllegalArgumentException: The number of this PDFNumber must not be empty
at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2416)
at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2281)
at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1367)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:709)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:165)
at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:115)
at org.apache.fop.cli.Main.startFOP(Main.java:166)
at org.apache.fop.cli.Main.main(Main.java:197)
Caused by: java.lang.IllegalArgumentException: The number of this PDFNumber must not be empty
at org.apache.fop.pdf.PDFNumber.toPDFString(PDFNumber.java:110)
at org.apache.fop.pdf.PDFObject.toPDF(PDFObject.java:176)
at org.apache.fop.pdf.PDFObject.output(PDFObject.java:165)
at org.apache.fop.pdf.PDFDocument.output(PDFDocument.java:899)
at org.apache.fop.render.pdf.PDFRenderer.renderPage(PDFRenderer.java:766)
at org.apache.fop.area.RenderPagesModel.addPage(RenderPagesModel.java:120)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.finishPage(PageSequenceLayoutManager.java:424)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.makeNewPage(PageSequenceLayoutManager.java:377)
at org.apache.fop.layoutmgr.PageBreaker.handleBreakTrait(PageBreaker.java:502)
at org.apache.fop.layoutmgr.PageBreaker.getNextBlockList(PageBreaker.java:131)
at org.apache.fop.layoutmgr.AbstractBreaker.doLayout(AbstractBreaker.java:301)
at org.apache.fop.layoutmgr.AbstractBreaker.doLayout(AbstractBreaker.java:263)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.activateLayout(PageSequenceLayoutManager.java:144)
at org.apache.fop.area.AreaTreeHandler.endPageSequence(AreaTreeHandler.java:233)
at org.apache.fop.fo.pagination.PageSequence.endOfNode(PageSequence.java:145)
at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.endElement(FOTreeBuilder.java:378)
at org.apache.fop.fo.FOTreeBuilder.endElement(FOTreeBuilder.java:194)
at org.apache.xml.serializer.ToXMLSAXHandler.endElement(ToXMLSAXHandler.java:261)
at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1399)
at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
... 9 more

java.lang.IllegalArgumentException: The number of this PDFNumber must not be empty
at org.apache.fop.pdf.PDFNumber.toPDFString(PDFNumber.java:110)
at org.apache.fop.pdf.PDFObject.toPDF(PDFObject.java:176)
at org.apache.fop.pdf.PDFObject.output(PDFObject.java:165)
at org.apache.fop.pdf.PDFDocument.output(PDFDocument.java:899)
at org.apache.fop.render.pdf.PDFRenderer.renderPage(PDFRenderer.java:766)
at org.apache.fop.area.RenderPagesModel.addPage(RenderPagesModel.java:120)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.finishPage(PageSequenceLayoutManager.java:424)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.makeNewPage(PageSequenceLayoutManager.java:377)
at org.apache.fop.layoutmgr.PageBreaker.handleBreakTrait(PageBreaker.java:502)
at org.apache.fop.layoutmgr.PageBreaker.getNextBlockList(PageBreaker.java:131)
at org.apache.fop.layoutmgr.AbstractBreaker.doLayout(AbstractBreaker.java:301)
at org.apache.fop.layoutmgr.AbstractBreaker.doLayout(AbstractBreaker.java:263)
at org.apache.fop.layoutmgr.PageSequenceLayoutManager.activateLayout(PageSequenceLayoutManager.java:144)
at org.apache.fop.area.AreaTreeHandler.endPageSequence(AreaTreeHandler.java:233)
at org.apache.fop.fo.pagination.PageSequence.endOfNode(PageSequence.java:145)
at org.apache.fop.fo.FOTreeBuilder$MainFOHandler.endElement(FOTreeBuilder.java:378)
at org.apache.fop.fo.FOTreeBuilder.endElement(FOTreeBuilder.java:194)
at org.apache.xml.serializer.ToXMLSAXHandler.endElement(ToXMLSAXHandler.java:261)
at org.apache.xalan.templates.ElemLiteralResult.execute(ElemLiteralResult.java:1399)
at org.apache.xalan.transformer.TransformerImpl.executeChildTemplates(TransformerImpl.java:2411)
at org.apache.xalan.transformer.TransformerImpl.applyTemplateToNode(TransformerImpl.java:2281)
at org.apache.xalan.transformer.TransformerImpl.transformNode(TransformerImpl.java:1367)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:709)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1284)
at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1262)
at org.apache.fop.cli.InputHandler.transformTo(InputHandler.java:165)
at org.apache.fop.cli.InputHandler.renderTo(InputHandler.java:115)
at org.apache.fop.cli.Main.startFOP(Main.java:166)
at org.apache.fop.cli.Main.main(Main.java:197)
Maybe it's hard to interpret this without the XML and XSL-FO files? Could you please explain how to send them?
By the way, I did some googling for
The number of this PDFNumber must not be empty
but I wasn't really successful...

Post by sorin_ristache »

Please send them through the Technical support page.


Regards,
Sorin

Post by sorin_ristache »

Thank you for the files. I get a different error from Apache FOP:

Code:

org.apache.fop.pdf.PDFConformanceException: For PDF/A-1b, all fonts, even the base 14 fonts, have to be embedded! Offending font: Helvetica
at org.apache.fop.pdf.PDFFont.validate(PDFFont.java:199)
at org.apache.fop.pdf.PDFFont.toPDFString(PDFFont.java:210)
at org.apache.fop.pdf.PDFObject.toPDF(PDFObject.java:176)
at org.apache.fop.pdf.PDFObject.output(PDFObject.java:165)
at org.apache.fop.pdf.PDFDocument.output(PDFDocument.java:899)
at org.apache.fop.pdf.PDFDocument.outputTrailer(PDFDocument.java:972)
at org.apache.fop.render.pdf.PDFRenderer.stopRenderer(PDFRenderer.java:506)
at org.apache.fop.area.RenderPagesModel.endDocument(RenderPagesModel.java:245)
at org.apache.fop.area.AreaTreeHandler.endDocument(AreaTreeHandler.java:283)
...
I think the Helvetica font is defined correctly in the userconfig.xml file that you set as the FOP configuration file. I tested it with another XML document in which the Helvetica font was used both for titles and for the body content, and the PDF/A-1b result was generated correctly. It seems to be an Apache FOP problem. The FOP mailing list may be more helpful.


I hope this helps,
Sorin
Dynapen
Posts: 1
Joined: Wed Oct 03, 2007 5:57 pm
Location: Kansas City, MO

Did you find a solution?

Post by Dynapen »

I am getting a very similar error from my FOP conversion process. I have not done any font-specific work, and the same XSL-FO document has worked in the past and currently works on a different server version (JBoss 4.2.1 vs. JBoss 4.0.5).

I assume that it all boils down to a dependency being of a different version, but have been unable to track anything down.

What did you find actually caused this error?

Post by Marc R. »

Sorry for not getting back to this topic earlier.
I've not resolved this issue so far and unfortunately I cannot currently work on this project.

I'll probably get back to it in a few weeks.

Please do not hesitate to share your findings in the meantime.

Good luck.

Re: PDF/A with FOP

Post by sorin_ristache »

Hello,

I just wanted to let you know that the built-in FO processor can now be configured to generate PDF/A-1b output by setting an option in Preferences / XML / XSLT-FO-XQuery / FO Processors.


Regards,
Sorin