Word to DITA Conversion Batch Converter Add-On Style Mapping

Posted: Tue May 24, 2022 11:55 pm
by Aidan400
Hi,

Question: I am converting Word documents to DITA XML for work using the Batch Converter Add-On 4.0.0. Am I able to map Word styles to specific XML elements?

Additional Questions: Would there be a way to map a style to an element with a specific attribute, e.g. map a custom Word style "cool" --> <note type="cool"></note>? I also have my own XML document with mappings that I used previously with a different conversion system. Is there any way to import it?

What I have done so far: I know styles can be mapped to HTML elements, and I found the file containing the style mapping in oxygen-batch-converter-core-24.1-SNAPSHOT.jar by opening it with the Archive Browser. In a Word --> DITA conversion, is there a way for me to add a Word --> HTML mapping and then a second mapping (somewhere) that converts the HTML element produced by that intermediate step into an XML element of my choosing? I found some XSL files in the same JAR file that appear to convert HTML (or XHTML) to XML, but I am not well versed in XSL.

Thank you very much.

Re: Word to DITA Conversion Batch Converter Add-On Style Mapping

Posted: Wed May 25, 2022 1:54 pm
by Cosmin Duna
Hello,

The "wordStyleMap.xml" file that you found in the "oxygen-batch-converter-core" JAR contains the default configuration of the conversion. You don't need to modify it, because these style mappings can be set using the "Word styles mapping" option on the "Plugins / Batch Documents Converter" preferences page (you can open the preferences dialog by selecting "Options" > "Preferences...").
Here you can find more information about this option: https://www.oxygenxml.com/doc/versions/ ... w5_vw4_3rb

As you said, this option controls the first step of the conversion (Word to HTML), and the HTML element that you configure is then handled by the next step (HTML to DITA). If you want a specific DITA element, you can do the following without modifying the complex XSL file that you found:
  1. Configure the first step (Word to HTML) using the "Word styles mapping" option so that your Word style produces an element with a specific class attribute.
  2. In the second step, this 'class' attribute is converted to the DITA 'outputclass' attribute.
  3. Create a custom refactoring operation (https://www.oxygenxml.com/doc/versions/ ... _operation) based on a simple XSLT stylesheet that converts the element with the outputclass attribute to the element you want in the resulting DITA content.
For your example:
  1. Add the following row in the "Word styles mapping" table:
    | p | cool | p.cool:fresh |
  2. Unzip the attached archive, batchConverter-refactoring.zip, into the '{Oxygen_installation_directory}/refactoring' directory.
  3. Restart Oxygen.
  4. Convert the document.
  5. Execute the custom refactoring operation (named "Post-processing Batch Documents Converter") on the resulting DITA documents (https://www.oxygenxml.com/doc/versions/ ... tools.html).
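
To make the effect of step 5 more concrete, here is a minimal Python sketch of the kind of rewrite the post-processing performs. It is only an illustration, not the contents of batchConverter-refactoring.zip (the refactoring operation itself is XSLT-based). It assumes that the converted topic contains <p outputclass="cool"> and that the desired result is <note type="cool">, as in the question. The sample fragment and the function name promote_to_note are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA fragment, standing in for the converter's output
# after the 'class' attribute has become 'outputclass'.
SAMPLE = """<topic id="t1"><title>Demo</title><body>
<p outputclass="cool">Stay hydrated.</p>
<p>Ordinary paragraph.</p>
</body></topic>"""


def promote_to_note(root, outputclass="cool"):
    """Turn each <p outputclass="X"> into <note type="X">, in place.

    Only an exact attribute match is handled; returns the number of
    elements that were changed.
    """
    changed = 0
    # Materialize the list first so that renaming tags cannot
    # interfere with the traversal.
    for elem in list(root.iter("p")):
        if elem.get("outputclass") == outputclass:
            elem.tag = "note"
            del elem.attrib["outputclass"]
            elem.set("type", outputclass)
            changed += 1
    return changed


root = ET.fromstring(SAMPLE)
promote_to_note(root)
print(ET.tostring(root, encoding="unicode"))
```

Paragraphs without the matching outputclass value are left untouched, so the same pass can be run safely over a whole set of converted topics.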
As for your old mapping document, I'm afraid our converter doesn't support that format. However, the "Batch Documents Converter" preferences page contains actions for importing and exporting the "Word styles mapping" configuration in XML format. If that is easier for you than adding the configuration through the table in the user interface, you can convert your XML file to our format using a custom refactoring operation.

Best regards,
Cosmin

Re: Word to DITA Conversion Batch Converter Add-On Style Mapping

Posted: Wed May 25, 2022 6:49 pm
by Aidan400
Thank you, Cosmin, that works perfectly!
:D
Have a great day!