MS docx to XML

Posted: Fri Jun 08, 2012 11:53 pm
by ditatizer
I have a template that I created in MS Word. My objective is to develop a stylesheet from that template.
Is there an easy way to open it in Oxygen XML and convert it to XML so I can get a baseline for the required stylesheet? I eventually plan to use it to generate DITA outputs.
Any help is greatly appreciated.

Re: MS docx to XML

Posted: Mon Jun 11, 2012 11:13 am
by Radu
Hi,

A file with the .docx extension is actually a ZIP archive.
If you open such a file in Oxygen, it will be opened in the Archive Browser view. In that view you can expand the word directory and open the document.xml file from there. That file contains the entire content of the Word document.
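
If you want to look at the same file outside of Oxygen, here is a minimal Python sketch that reads word/document.xml straight from the ZIP archive and lists the paragraph style IDs it references, which may help as a starting point for a stylesheet. The function names and the template.docx file name are only illustrative assumptions, not part of Oxygen or of any plugin.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p and w:pStyle.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx):
    """Return the raw bytes of word/document.xml.

    `docx` may be a file path or a binary file-like object
    (hypothetical helper, named here for illustration only).
    """
    with zipfile.ZipFile(docx) as archive:
        return archive.read("word/document.xml")


def paragraph_styles(document_xml):
    """Return a sorted list of the paragraph style IDs used in document.xml."""
    root = ET.fromstring(document_xml)
    styles = set()
    for p_style in root.iter(f"{{{W_NS}}}pStyle"):
        val = p_style.get(f"{{{W_NS}}}val")
        if val:
            styles.add(val)
    return sorted(styles)
```

For example, paragraph_styles(read_document_xml("template.docx")) would return the style IDs that your template actually uses, assuming a file with that name exists in the current directory.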

We have some video demonstrations related to this which you may find interesting:

http://www.oxygenxml.com/videos.html#vt ... LDocuments

Also, the DITA For Publishers project:
http://sourceforge.net/projects/dita4publishers/
contains plugins which can be installed in the DITA OT bundled with Oxygen. One of them is called Word to DITA.
It performs the transformation by applying XSLT 2.0 stylesheets to the document.xml file mentioned above.
So it may not be necessary for you to reinvent the wheel.

Regards,
Radu