Using the Word to DITA transform

Miriam
Posts: 14

Using the Word to DITA transform

Fri Nov 01, 2013 8:40 pm

I'm trying to figure out if I can use oXygen Editor to convert Word documents to DITA directly, but I can't find information in the User Guide to explain how this transform works.

The background is that I work in an environment where I regularly use Mif2Go to convert Frame files to DITA. I am good with DITA markup, but I do not have any XSLT stylesheet skills (although I'm familiar with the very basic principles). I have a small, but lengthy, set of Word documents that I need to get into DITA. If I have no alternative, I can go Word - Frame - DITA. However, the Word docs have a lot of tables that do not import cleanly into Frame. If I can use oXygen's Word OOXML to DITA transform to skip the Frame step, that would save me a lot of time.

Do I need to be able to write a custom stylesheet to take advantage of this transform? Or am I supposed to be able to use some kind of configuration file to map my Word styles to their corresponding DITA elements? If the latter, is there a section of the user guide that explains the procedure for this?
Radu
Posts: 5585

Re: Using the Word to DITA transform

Mon Nov 04, 2013 10:18 am

Hi Miriam,

Oxygen 15.1 comes with the entire DITA For Publishers suite of plugins pre-installed in its bundled DITA OT distribution.
The DITA For Publishers suite includes a plugin that converts Word documents to DITA content.
If you open an OOXML document in Oxygen (it opens in the Archive view), you can open its document.xml, which holds the content of the document. With document.xml open, press the Configure Transformation Scenario toolbar button and you will see a preconfigured transformation scenario called DOCX DITA.
That scenario applies this ANT build file:

OXYGEN_INSTALL_DIR/frameworks/dita/DITA-OT/plugins/net.sourceforge.dita4publishers.word2dita/build-word2dita.xml

over the OOXML document.
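
For readers who want to inspect the file outside of Oxygen: a .docx file is a ZIP archive, and word/document.xml is the part that the Archive view exposes. The following is a minimal Python sketch, not part of Oxygen or DITA For Publishers, that counts the paragraph style IDs referenced in that part. It may help when deciding which Word styles need a mapping. The function name paragraph_style_ids is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_style_ids(docx_path):
    """Count the paragraph style IDs referenced in word/document.xml.

    Hypothetical helper for illustration only.
    """
    # A .docx file is a ZIP archive; read the main document part from it.
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    counts = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        # The style reference, when present, is <w:pPr><w:pStyle w:val="..."/>.
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        counts[style_id or "(no explicit style)"] += 1
    return counts
```

Calling paragraph_style_ids("my-document.docx") returns a Counter keyed by style ID. Note that these are style IDs; the display names shown in Word's style list are stored separately in word/styles.xml, so check the DITA For Publishers documentation for which of the two the style map expects.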

The documentation for the Word-to-DITA plugin can be found in the DITA For Publishers documentation:

http://dita4publishers.sourceforge.net/d4p-user-guide/user_docs/d4p-users-guide/word2dita/word2dita-intro.html#chapter-id

Another alternative would be to open the OOXML document in MS Word, select all, copy the content, and then paste it into a DITA topic opened in the visual Author editing mode. Oxygen will try to convert the HTML content from the clipboard to the DITA vocabulary.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Radu
Posts: 5585

Re: Using the Word to DITA transform

Mon Nov 04, 2013 10:54 am

Hi,

I also gave some more alternatives here:

http://www.oxygenxml.com/forum/viewtopic.php?f=8&t=10137&p=28188#p28188

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Miriam
Posts: 14

Re: Using the Word to DITA transform

Tue Nov 05, 2013 9:43 pm

Hi Radu,

Thanks for the link to the DITA 4 Publishers user guide. That's very helpful. However, I'm still unclear on where I do the style-to-tag mapping. The user guide indicates that there should be a configurable parameter in oXygen called styleMapUri, which I should point at my style-to-tag mapping file. However, when I create an editable duplicate of the transformation scenario, I only see parameters for w2d.out.dir, w2d.temp.dir, and word.doc. So I have the explanation of how to map my styles to tags, and confirmation that this was indeed the missing link, but I am still unclear on WHERE I do the mapping and how I point the oXygen transform to it.

I tried the copy-paste, and I am grateful to you for telling me about it. My needs are a little more complex than the simple copy-paste can handle, but in a worst case scenario where I can't get the OOXML to DITA transform to work, I might be able to work some sort of hybrid method where I use Word-Frame-DITA for everything but tables, and then copy-paste my tables in. I really, really don't want to have to deal with rebuilding the tables in Frame. :)
Radu
Posts: 5585

Re: Using the Word to DITA transform

Wed Nov 06, 2013 11:32 am

Hi Miriam,

You can look directly in the build file used for Word to DITA:

OXYGEN_INSTALL_DIR\frameworks\dita\DITA-OT\plugins\net.sourceforge.dita4publishers.word2dita\build-word2dita.xml

The build file looks at a default mapping XML file located in:

OXYGEN_INSTALL_DIR\frameworks\dita\DITA-OT\plugins\net.sourceforge.dita4publishers.word2dita\xsl\word-builtin-styles-style2tagmap.xml

So you can either modify that file, or edit the transformation scenario and set the w2d.style-to-tag-map parameter to point to your own XML file containing the style mapping.
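
To make the parameters concrete, here is a rough Python sketch that assembles an Ant command line carrying the parameter names mentioned in this thread (word.doc, w2d.out.dir, w2d.temp.dir, w2d.style-to-tag-map). The function name word2dita_ant_command is hypothetical. It is also an assumption that the build file can be run with a standalone Ant outside of Oxygen, and that each parameter accepts a plain file system path; inside Oxygen you would set the same values in the transformation scenario's Parameters tab instead.

```python
def word2dita_ant_command(build_file, word_doc, out_dir, temp_dir, style_map=None):
    """Build an argument list for running build-word2dita.xml with Ant.

    Hypothetical helper: parameter names come from the forum thread, but
    running the build outside Oxygen is an untested assumption.
    """
    args = [
        "ant",
        "-f", str(build_file),           # Ant's flag for choosing the build file
        f"-Dword.doc={word_doc}",        # input document parameter
        f"-Dw2d.out.dir={out_dir}",      # output directory parameter
        f"-Dw2d.temp.dir={temp_dir}",    # temporary directory parameter
    ]
    if style_map is not None:
        # Overrides the default word-builtin-styles-style2tagmap.xml.
        args.append(f"-Dw2d.style-to-tag-map={style_map}")
    return args
```

The returned list can be passed to subprocess.run(...) or simply printed to see which -D properties would be set.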

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Miriam
Posts: 14

Re: Using the Word to DITA transform

Wed Nov 06, 2013 9:00 pm

Thanks! I was able to add the required parameter to the transformation scenario, and the transform is using the correct file. I still haven't gotten the mapping to work successfully, but I'm a lot closer than I was and am optimistic I can debug the problem style.

If you can pass feedback on to your oXygen documentation team, it would be great to have a reference to the DITA 4 Publishers user guide, along with the name of the parameter to add to the transformation scenario, included in the oXygen documentation itself.

Miriam
Radu
Posts: 5585

Re: Using the Word to DITA transform

Thu Nov 07, 2013 9:52 am

Hi Miriam,

Will do, thanks for the feedback. You can also ask questions on the DITA Users List if you run into any problems with the mapping. Eliot Kimber, the author of the DITA 4 Publishers plugins, is registered there.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
DeiterF15
Posts: 1

Re: Using the Word to DITA transform

Thu Aug 11, 2016 6:50 pm

I wanted to add a lesson-learned for anyone else trying to do Word-to-DITA transforms using Oxygen XML Editor (OXE). I am using OXE 17.1 on Windows 10.

Disclaimer--I am NOT a DITA/XML expert. Far from it. I'm just trying to move a couple hundred pages of test procedures from Word to something more powerful (I'm still working out how I'll do that, currently thinking of something using the MEAN stack, though perhaps using a dedicated XML database rather than MongoDB).


Lesson-Learned

- Radu's post in this thread, from 2013-11-04, is GREAT for getting that first Word-to-DITA transform done, but the build kept failing for me yesterday. Why?
- It turns out that I had OXE looking in two different places for info...
---Under Options -> Preferences -> DITA -> DITA Open Toolkit I had "Built-in DITA-OT 2.x" selected.
---Under Document -> Transformation -> Configure Transformation Scenario(s), the OOXML -> DOCX DITA scenario had the Build File pointing to build-word2dita.xml in the plugins folder of the other built-in DITA OT version! Once I pointed the Build File at \Oxygen XML Editor 17\frameworks\dita\DITA-OT2.x\plugins\org.dita4publishers.word2dita\build-word2dita.xml I got a successful build.
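
A quick way to check for this kind of mismatch is to list every copy of build-word2dita.xml under the Oxygen installation and compare the result with the Build File set in the scenario. The Python sketch below assumes the frameworks\dita\DITA-OT* layout quoted in this thread and a plugin folder name ending in "word2dita"; the function name find_word2dita_builds is invented for illustration.

```python
from pathlib import Path


def find_word2dita_builds(oxygen_install_dir):
    """Return every build-word2dita.xml found under the bundled DITA-OT folders.

    Hypothetical helper; the directory layout is assumed from the paths
    quoted in this thread and may differ between Oxygen versions.
    """
    base = Path(oxygen_install_dir) / "frameworks" / "dita"
    # Matches both "DITA-OT" and "DITA-OT2.x", and either plugin ID
    # (net.sourceforge.dita4publishers.word2dita or org.dita4publishers.word2dita).
    pattern = "DITA-OT*/plugins/*word2dita/build-word2dita.xml"
    return sorted(base.glob(pattern))
```

If the list contains a build file under only one DITA-OT folder, that is the one the scenario's Build File should point to, and the DITA Open Toolkit preference should select the same version.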

Sorry if all the navigation detail is a bit much, but OXE is so powerful that the hardest time I'm having is finding everything in the menus that is referenced in the online help and in the forums. So, I left a lot of breadcrumbs for the next n00b to follow :)

Now I get to dig into tweaking the transformation itself. Good luck!
Radu
Posts: 5585

Re: Using the Word to DITA transform

Wed Aug 17, 2016 9:53 am

Hi,

Oxygen 17.1 and 18.0 come with both DITA OT 1.8 and DITA OT 2.x bundled.
But the org.dita4publishers plugins are not installed in either of them, so you probably installed them yourself.
Oxygen 18.1 (planned for autumn this year) will have the org.dita4publishers plugins installed in its bundled DITA OT 2.x distribution, so the DOCX to DITA transformation scenario will point to the DITA OT 2.x installation by default.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
