Migrating WORD Styles over to XML

matrix
Posts: 5
Joined: Fri Sep 27, 2013 7:14 am

Post by matrix »

Fellow Forum Members,
I have an 800-page technical manual that I was required to write in Microsoft Word some four years ago. It now needs to be converted to XML that conforms to a MIL SPEC 40051 DTD the Army provides.

I would be very grateful if anyone out there could post the steps for mapping Microsoft WORD Styles to the Elements the MIL SPEC 40051 DTD uses, or a link to where such steps exist. My goal is to avoid having to manually tag the data in the WORD file with the Elements the MIL SPEC 40051 DTD requires.

Does OxygenXML Editor have a feature that makes this task possible? If OxygenXML Editor is not the solution, can anyone out there suggest an open source tool that can transform Microsoft WORD Styles into MIL SPEC 40051 Elements? Any info will be greatly appreciated. Thanks in advance.
Radu
Posts: 9018
Joined: Fri Jul 09, 2004 5:18 pm

Re: Migrating WORD Styles over to XML

Post by Radu »

Hi,

Here are some possible approaches:

1) Open the Word document in LibreOffice and save it as a DocBook XML document.

2) Or, using Oxygen, you could try to convert the Word document to DocBook or DITA XML: create, for example, a DocBook article, switch to the Author visual editing mode, select all the content in the Word document, and paste it into the article.

Then you would create an XSLT stylesheet that maps the DocBook elements to the target vocabulary. That is probably easier than mapping directly from OOXML.
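To make the mapping idea concrete, here is a minimal sketch of an element-for-element rename, written in Python with the standard library rather than XSLT, purely for illustration. The target element names on the right-hand side of `ELEMENT_MAP` are hypothetical placeholders and are not taken from the actual DTD; a real conversion would also need to handle attributes, nesting differences, and required elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: DocBook element name -> target element name.
# The right-hand names are placeholders, NOT real names from the target DTD.
ELEMENT_MAP = {
    "article": "target-doc",
    "sect1": "target-section",
    "title": "target-title",
    "para": "target-para",
}


def local_name(tag):
    """Return the tag without a leading '{namespace}' part, if present."""
    return tag.rsplit("}", 1)[-1]


def convert(docbook_xml):
    """Rename every element found in ELEMENT_MAP; keep other names as they are."""
    root = ET.fromstring(docbook_xml)
    for elem in root.iter():
        name = local_name(elem.tag)
        elem.tag = ELEMENT_MAP.get(name, name)
    return ET.tostring(root, encoding="unicode")


SAMPLE = "<article><sect1><title>Scope</title><para>Text.</para></sect1></article>"
print(convert(SAMPLE))
```

An XSLT stylesheet would express the same table as one template per source element, which is usually the better fit once the mapping stops being one-to-one.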

Conversions are never perfect; you will probably encounter problems with images, links, styles that are not properly mapped in the XML, and so on.
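One way to find such problem spots early is to list the elements in the converted DocBook file that your mapping does not yet cover. The following is a rough sketch in Python using only the standard library; the `HANDLED` set is a hypothetical example of what a stylesheet might already map, and the sample input is made up.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical set of DocBook elements the mapping already handles.
HANDLED = {"article", "sect1", "title", "para"}


def unhandled_elements(docbook_xml, handled=HANDLED):
    """Count occurrences of each element name that is not in `handled`."""
    root = ET.fromstring(docbook_xml)
    counts = Counter()
    for elem in root.iter():
        # Drop a leading '{namespace}' part so names compare by local name.
        name = elem.tag.rsplit("}", 1)[-1]
        if name not in handled:
            counts[name] += 1
    return counts


SAMPLE = (
    "<article><title>T</title>"
    "<para>See <ulink url='http://example.com'>link</ulink>.</para>"
    "<mediaobject><imageobject><imagedata fileref='a.png'/></imageobject></mediaobject>"
    "</article>"
)

for name, count in sorted(unhandled_elements(SAMPLE).items()):
    print(name, count)
```

Running a report like this against the full document shows which elements (typically links and image wrappers) still need a mapping rule or manual cleanup.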

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
matrix
Posts: 5
Joined: Fri Sep 27, 2013 7:14 am

Re: Migrating WORD Styles over to XML

Post by matrix »

Radu,
Thanks for your reply. Of the two approaches you propose, something tells me option #1 is the path of least resistance. From what I have Googled, LibreOffice is a free open source app that resembles Microsoft Office. Can you or anyone out there please elaborate on the steps that follow once the LibreOffice DocBook export file is created?

Let's say, for example's sake, that LibreOffice performs a very accurate DocBook conversion of my WORD document. What OxygenXML tool do I rely on to convert all of the DocBook elements to the MIL SPEC 40051 elements as an automated process? If OxygenXML does not offer an automated element conversion tool, do other apps such as FrameMaker or Arbortext, or a free open source option, provide element conversion capabilities? Any info on the steps that follow once my WORD file is converted to DocBook will be very much appreciated. Thank you in advance.
Radu
Posts: 9018
Joined: Fri Jul 09, 2004 5:18 pm

Re: Migrating WORD Styles over to XML

Post by Radu »

Hi,

We do not have any automated conversion in Oxygen for this. You would need to write an XSLT stylesheet and perform the conversion yourself.
If you want feedback on this from other Oxygen users, you could write to our users list:

http://www.oxygenxml.com/mailinglists.html#oxygen-user

The forum is static; other users usually do not register to receive updates on other users' discussions.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com