DocBook XML to OOXML

xsaero00
Posts: 58
Joined: Sat Aug 01, 2009 12:57 am


Post by xsaero00 »

Our users are asking for a simple way to convert DocBook XML to OOXML (DocBook to Word DOCX) and back. In-house editing is done in Oxygen, but sometimes they have to send files to other people for editing, and Word is still the de facto standard.

I imagine this will be a major undertaking so any help is appreciated.
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Re: DocBook XML to OOXML

Post by sorin_ristache »

Hello,

The DocBook XSL package that ships with Oxygen includes the [Oxygen-folder]/frameworks/docbook/xsl/roundtrip/dbk2wordml.xsl stylesheet for this conversion, but it creates only the document.xml file of the DOCX document. A DOCX document is in fact a Zip archive that contains a document.xml file plus some styling files, which you can see when you open any DOCX and browse it as a Zip archive. You have to create the Zip archive yourself because DocBook XSL does not include this feature.

You can follow these steps for creating a DOCX starting from a Docbook XML document:
  • Create the document.xml file for the target DOCX using the [Oxygen-folder]/frameworks/docbook/xsl/roundtrip/dbk2wordml.xsl stylesheet. You have to set a Word XML template file as the value of the wordml.template parameter. An example is the file [Oxygen-folder]/frameworks/docbook/xsl/roundtrip/template.xml.
  • Create the folder structure of the target DOCX by unzipping an existing DOCX document whose styling you want to reuse. Open it in MS Word first so that you know how it looks.
  • Replace the word/document.xml file in this folder structure with the document.xml file that you created in the first step.
  • Zip the folder structure to create your target DOCX document.
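
The last three steps can also be scripted instead of done by hand. Below is a minimal Python sketch using only the standard library. The function name replace_document_xml and the file names in the usage note are hypothetical, and it assumes the document.xml from the first step is already valid for the styling DOCX you are reusing.

```python
import zipfile
from pathlib import Path


def replace_document_xml(template_docx, new_document_xml, output_docx):
    """Copy every part of template_docx into output_docx,
    swapping in new_document_xml as word/document.xml.

    All names here are illustrative; nothing in DocBook XSL or
    Oxygen provides this function.
    """
    new_body = Path(new_document_xml).read_bytes()
    with zipfile.ZipFile(template_docx) as src:
        with zipfile.ZipFile(output_docx, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename == "word/document.xml":
                    # Use the file produced by the stylesheet instead
                    # of the body of the styling document.
                    dst.writestr(item.filename, new_body)
                else:
                    # Styles, relationships, content types, etc. are
                    # copied unchanged.
                    dst.writestr(item, src.read(item.filename))
```

For example, replace_document_xml("styled.docx", "document.xml", "target.docx") would write target.docx next to the two input files. Copying the parts directly from one archive to the other avoids unzipping to a temporary folder, but the result is the same as the manual steps.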

Regards,
Sorin
xsaero00
Posts: 58
Joined: Sat Aug 01, 2009 12:57 am

Re: DocBook XML to OOXML

Post by xsaero00 »

I ended up using the roundtrip XSL. I had to heavily modify it and fix a number of bugs, but now it works. I can convert DocBook to Word. Thanks.

I dropped the idea of importing Word to XML for now. I think that is just too much hassle.
levent
Posts: 2
Joined: Wed Jun 13, 2012 3:22 pm

Re: DocBook XML to OOXML

Post by levent »

Hi,

Using the same method, would it be possible to convert the MS PowerPoint files (OOXML) to Docbook XML and vice versa?

Any help would be greatly appreciated.
Levent
xsaero00
Posts: 58
Joined: Sat Aug 01, 2009 12:57 am

Re: DocBook XML to OOXML

Post by xsaero00 »

levent wrote: Using the same method, would it be possible to convert the MS PowerPoint files (OOXML) to Docbook XML and vice versa?
Possible? Yes. Easy or practical? Not in my opinion.

I am not aware of any XSL stylesheets out there that already work with OOXML, so you are going to have to write them from scratch. OOXML is a package format, but luckily Oxygen can open it. So you could open the OOXML file, find the right file within the package, and apply your transformation to it. If you need more than one source file for the transformation, you can pull the others in using the XSLT document() function.
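
To get a feel for which file inside the package is "the right file", it can help to list the parts first. Here is a small Python sketch using only the standard library. The function names are made up for illustration, and it assumes the usual layout where a PPTX keeps its slides under ppt/slides/.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_xml_parts(package_path):
    """Return the sorted names of the XML parts in an OOXML package."""
    with zipfile.ZipFile(package_path) as pkg:
        return sorted(n for n in pkg.namelist() if n.endswith(".xml"))


def read_part(package_path, part_name):
    """Parse one part of the package and return its root element."""
    with zipfile.ZipFile(package_path) as pkg:
        return ET.fromstring(pkg.read(part_name))
```

Calling list_xml_parts on a presentation should show entries such as ppt/slides/slide1.xml, and read_part gives you the parsed XML of a single slide to inspect before you start writing a stylesheet for it.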
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Re: DocBook XML to OOXML

Post by sorin_ristache »

Hello,
levent wrote: Using the same method, would it be possible to convert the MS PowerPoint files (OOXML) to Docbook XML and vice versa?
The DocBook XSL distribution includes some XSLT stylesheets for the Word (OOXML) -> DocBook conversion but not for the PowerPoint -> DocBook one. Sorry, I think nobody created such stylesheets (and published them) so far.


Regards,
Sorin
levent
Posts: 2
Joined: Wed Jun 13, 2012 3:22 pm

Re: DocBook XML to OOXML

Post by levent »

Hi,

Thank you for your replies.

I've used the DOCX > TEI P5 transformation scenario to convert the MS Word documents instead of converting them to DocBook format, as this method is already included in the standard oXygen package. This works fine and allows me to edit the TEI P5 XML document and also re-generate the original MS Word document by applying the TEI P5 > DOCX transformation scenario.

Now my question is whether I could create new transformation scenarios, such as PPTX > TEI P5 and TEI P5 > PPTX, to be able to work with MS PowerPoint files. Could I start from the C:\Program Files\Oxygen XML Editor 13\frameworks\tei\xml\tei\stylesheet\docx files, create a pptx folder by copying their content, and then edit the files to match the MS PowerPoint format?

Regards,
Levent
Costin
Posts: 828
Joined: Mon Dec 05, 2011 6:04 pm

Re: DocBook XML to OOXML

Post by Costin »

Hello,

If you want to modify the docx files, you might not have write access to the "Program Files" directory. In that case, first copy the entire "tei" folder from C:\Program Files\Oxygen XML Editor 13\frameworks\ to another location where you do have write access, and make your customizations there. After that, you can copy the tei folder back into the oXygen frameworks folder in Program Files.

Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
amsimms
Posts: 4
Joined: Wed Jul 26, 2017 6:48 pm

Re: DocBook XML to OOXML

Post by amsimms »

This feature seems to have disappeared--there is no OOXML that I can see in the transformation scenarios. This is a key feature, otherwise we'll need to add something like XMLMind to our toolbox.
Radu
Posts: 9018
Joined: Fri Jul 09, 2004 5:18 pm

Re: DocBook XML to OOXML

Post by Radu »

Hi,

Oxygen just ships the default DocBook 1.79.2 XSLT stylesheets. We have no other special support for DocBook to Word.
The Docbook project seems to have a topic on that:

http://www.sagehill.net/docbookxsl/MSWord.html

It seems that particular XSLT stylesheet was removed from the Docbook XSLs at version "1.72" but you can download older Docbook XSL versions from their website:

https://sourceforge.net/projects/docboo ... cbook-xsl/

About the advice to convert the DocBook to XSL-FO and then create RTF from it, you do not necessarily need to use the XMLmind converter.
You can duplicate an existing "Docbook PDF" transformation scenario, in the "FO Processor" tab choose the output method to be "rtf" and in the Output tab set the "rtf" extension for the published file.
I'm not sure of the limitations that the Apache FOP processor has when converting to RTF so depending on your project's complexity you might need to try another processor.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com