[oXygen-user] [oXygen XML Editor Blog] - Batch converting HTML to XHTML
oXygen XML Editor Blog
noreply+feedproxy at google.com
Mon Jun 12 23:51:04 CDT 2017
Batch converting HTML to XHTML
Posted: 12 Jun 2017 02:03 AM PDT
http://feedproxy.google.com/~r/AboutOxygenXmlEditor/~3/aF0B0Z1Zw1I/batch-converting-html-to-xhtml.html?utm_source=feedburner&utm_medium=email
Let's say you have a batch of existing HTML documents that may not be
well-formed, and you want to process them using XSLT. For example, you
may want to migrate the HTML to DITA using the predefined XHTML to DITA
Topic transformation scenario available in Oxygen. To do that, you first
need to create well-formed XHTML documents from the existing HTML
documents, and you need to do it in an automated batch process. Many
open source projects provide processors that can convert HTML to its
well-formed XHTML equivalent. For this blog post we'll use
HTML Tidy. Here are the steps to automate this
process:

1. Create a new folder on your hard drive (for example, I created one
   on my Desktop: C:\Users\radu_coravu\Desktop\tidy) and download into
   it the HTML Tidy executable for your platform:
   http://binaries.html-tidy.org/.

2. In the same folder as the Tidy executable, create an ANT build file
   called build.xml with the following content:

       <project basedir="." name="TidyUpHTMLtoXHTML" default="main">
         <basename property="filename" file="${file}"/>
         <target name="main">
           <exec command="tidy.exe -o ${output.dir}/${filename} ${file}"/>
         </target>
       </project>

3. In the Oxygen Project view, link the entire folder where the
   original HTML documents are located.

4. Right-click the folder, choose
   Transform->Configure Transformation Scenarios...
   and create a new transformation scenario of type ANT Scenario.
   Modify the following properties in the transformation scenario:

   - Change the scenario name to something relevant, like
     "HTML to XHTML".
   - Change the Working Directory to point to the folder where the ANT
     build file is located, in my case:
     C:\Users\radu_coravu\Desktop\tidy.
   - Change the Build file to point to your custom build.xml, in my
     case: C:\Users\radu_coravu\Desktop\tidy\build.xml.
   - In the Parameters tab, add a parameter called file with the value
     ${cf} and a parameter called output.dir whose value is the path to
     the output folder where the equivalent XHTML files will be stored,
     in my case: C:\Users\radu_coravu\Desktop\testOutputXHTML.

5. Apply the new transformation scenario to the entire folder
   containing the HTML documents.

When the transformation finishes, the output folder will contain the
XHTML equivalents of the original HTML files. These XHTML documents can
then be processed using XML technologies like XSLT or XQuery.
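
The build file in the post can also be written without the `command` attribute of `<exec>`, which the Ant manual marks as deprecated in favor of `executable` with nested `<arg>` elements. The following is only a sketch of such a variant, not the build file from the original post. It keeps the same `file` and `output.dir` parameters and makes three assumptions of its own: that tidy.exe sits next to build.xml (so it is resolved against `${basedir}`), that the output folder may not exist yet (hence the `<mkdir>`), and that you want to pass Tidy's `-asxhtml` option to request XHTML output explicitly.

```xml
<project basedir="." name="TidyUpHTMLtoXHTML" default="main">
  <!-- Keep only the file name part of the path passed in as ${file} -->
  <basename property="filename" file="${file}"/>
  <target name="main">
    <!-- Assumption: create the output folder if it is missing -->
    <mkdir dir="${output.dir}"/>
    <!-- Assumption: tidy.exe is located in the same folder as this build file -->
    <exec executable="${basedir}/tidy.exe">
      <!-- Assumption: ask Tidy explicitly for well-formed XHTML output -->
      <arg value="-asxhtml"/>
      <!-- -o names the output file; each <arg file="..."/> is passed as
           a single argument, so paths containing spaces stay intact -->
      <arg value="-o"/>
      <arg file="${output.dir}/${filename}"/>
      <arg file="${file}"/>
    </exec>
  </target>
</project>
```

As in the original build file, the output keeps the input file name, so a document called page.html is written to the output folder as page.html. Tidy reports warnings through a non-zero exit code, which is why this sketch leaves `failonerror` at its default so that warnings do not stop the batch.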