[oXygen-user] Feature request - Copy-and-paste from Word/Excel/HTML to DITA
Wendell Piez
Fri Jun 20 11:05:09 CDT 2008
Hi,
At 09:13 AM 6/20/2008, Sorin wrote:
>Version 9.3 which will be released in a couple of weeks will include
>an Archive Browser view that is able to open and browse Word and
>Excel documents saved in XML format, that is .docx files and .xlsx
>files. In the Archive Browser view the files that are included in
>such a Word or Excel document can be opened and edited in Oxygen so
>migrating the data to a DITA document will be easy: just apply an
>XSLT stylesheet to the XML file containing the data that must be imported.
I like this approach, as there are several other XML vocabularies
that might be wanted as targets for upconversion. Nothing against
DITA, of course, but it makes sense to consider requirements for
other tag sets as well.
However, those of us who have even glanced at the .docx format know
that it's a ravenous beast of unorthodox tagging practice, for which
it will be a challenge to write stylesheets.
One solution to this problem would entail a generic stylesheet that
will upconvert .docx into a more regular and proper sort of XML, in
which (just to mention the most glaring problem) mixed content is
actually mixed content. Such a plain vanilla word-processing XML
would make a much more tractable source format for conversion into
arbitrary targets such as DITA or what have you.
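To make the mixed-content point concrete, here is a minimal sketch in Python (standard library only) of the kind of normalization I have in mind. It is an illustration under assumptions, not a proposed implementation: the function names and the target tags (p, b, i) are made up for the example, it looks only at w:r runs that are direct children of a w:p, it treats any w:b or w:i in the run properties as "on" without checking w:val, and it picks only one style per run.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/document.xml
NS = {"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}


def run_style(run):
    """Return 'b', 'i', or None for a w:r element.

    Simplification: only the first matching property is used, and a
    w:val attribute that switches the property off is ignored.
    """
    props = run.find("w:rPr", NS)
    if props is None:
        return None
    if props.find("w:b", NS) is not None:
        return "b"
    if props.find("w:i", NS) is not None:
        return "i"
    return None


def upconvert_paragraph(wp):
    """Turn one w:p into a hypothetical <p> with real mixed content."""
    p = ET.Element("p")
    last = None        # last inline element appended to p
    last_style = None  # style of the immediately preceding run
    for run in wp.findall("w:r", NS):
        text = "".join(t.text or "" for t in run.findall("w:t", NS))
        if not text:
            continue
        style = run_style(run)
        if style is None:
            # Plain text goes into p.text or the tail of the last inline.
            if last is None:
                p.text = (p.text or "") + text
            else:
                last.tail = (last.tail or "") + text
            last_style = None
        elif style == last_style:
            # Adjacent runs with the same style are merged.
            last.text += text
        else:
            last = ET.SubElement(p, style)
            last.text = text
            last_style = style
    return p
```

Given a paragraph made of four runs ("A ", bold "ravenous", bold " beast", "."), this yields a single p whose text and inline b element are interleaved the way an XSLT author would expect, rather than four sibling runs each wrapping its own text.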
I dare say this stylesheet will be a devil to write, especially if it
aims to be comprehensive. All the more reason to solve this problem
once instead of making everyone solve it on their own.
An alternative (which might be more feasible) would be a library of
XSLT templates and functions that would help take care of the hard parts.
Cheers,
Wendell
======================================================================
Wendell Piez mailto:
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================