MS docx to XML

ditatizer
Posts: 28
Joined: Thu Jun 30, 2011 2:05 am

MS docx to XML

Post by ditatizer »

I have a template that I created in MS Word. My objective is to develop a stylesheet from that template.
Is there an easy way to open this in Oxygen XML and convert it to XML, so I can get a baseline for the required stylesheet? I eventually plan to use it to generate DITA outputs.
Any help is greatly appreciated.
Radu
Posts: 9434
Joined: Fri Jul 09, 2004 5:18 pm

Re: MS docx to XML

Post by Radu »

Hi,

A file with the .docx extension is actually a ZIP archive.
If you open such a file in Oxygen, it opens in the Archive Browser view. In that view you can expand the word directory and open the document.xml file, which contains the main body of the Word document (the style definitions are stored next to it in styles.xml).
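
If you want to pull that file out outside of Oxygen, here is a minimal sketch in Python using only the standard library. The function name read_document_xml and the file name template.docx are made up for illustration; the only thing it relies on is that a standard .docx package stores its main part at word/document.xml.

```python
import zipfile


def read_document_xml(docx_path):
    """Return the raw bytes of word/document.xml from a .docx archive."""
    # A .docx file is a ZIP archive, so zipfile can open it directly.
    with zipfile.ZipFile(docx_path) as archive:
        # word/document.xml is the main document part in a standard package.
        return archive.read("word/document.xml")


# Example usage (template.docx is a placeholder name):
#     with open("document.xml", "wb") as out:
#         out.write(read_document_xml("template.docx"))
```

The bytes are returned undecoded so that the encoding declared in the XML prolog is preserved when you save the file.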

We have some video demonstrations related to this which you may find interesting:

http://www.oxygenxml.com/videos.html#vt ... LDocuments

Also, the DITA For Publishers project:
http://sourceforge.net/projects/dita4publishers/
contains plugins which can be installed in the DITA OT bundled with Oxygen. One of them is the Word to DITA plugin.
It performs the transformation by applying XSLT 2.0 stylesheets to the document.xml file mentioned above.
So it may not be necessary for you to reinvent the wheel.
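
Before writing or configuring any stylesheet, it can help to see which paragraph styles the template actually uses. The sketch below is only an illustration and is not part of Oxygen or DITA For Publishers; it assumes your stylesheet will key off Word paragraph style IDs, and the function name count_paragraph_styles is hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# Main WordprocessingML namespace used by elements in document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_paragraph_styles(docx_path):
    """Count how often each paragraph style ID occurs in word/document.xml."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    counts = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        # Paragraphs without an explicit w:pStyle fall back to the default style.
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else "(default)"
        counts[style_id] += 1
    return counts


# Example usage (template.docx is a placeholder name):
#     for style_id, n in count_paragraph_styles("template.docx").most_common():
#         print(style_id, n)
```

The resulting list of style IDs gives a rough inventory of what a stylesheet would need to handle.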

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com