[oXygen-user] p:unescape-markup - missing htmlparser library ?

George Cristian Bina george at oxygenxml.com
Thu Jan 19 07:34:23 CST 2012


Hi Jostein,

On its main documentation page, Calabash lists only tagsoup as 
required for this step:
http://xmlcalabash.com/docs/

However, looking into this, it seems that Calabash now defaults to the 
HTML parser that you mentioned. There are two options:

1. Edit the engine.xml file from
[oXygen]/lib/xproc/calabash/engine.xml
and add a line
<system-property name="com.xmlcalabash.html-parser" value="tagsoup"/>
inside the runtime element.

2. Add htmlparser-1.3.1.jar to [oXygen]/lib/xproc/calabash/ and 
edit the same [oXygen]/lib/xproc/calabash/engine.xml file to add a 
library entry pointing to this jar inside the runtime element:
<library name="htmlparser-1.3.1.jar"/>

Option 1 is easier and is the one I tested, but option 2 should work as well.
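
As a rough sketch of where the two lines above would go: only the 
runtime element name and the two entries come from this message. The 
rest of engine.xml, including any attributes or existing children of 
the runtime element, is assumed and left out here, so treat this as an 
illustration rather than the actual file contents.

```xml
<!-- Sketch only: the surrounding engine.xml content is omitted, and the
     existing children of the runtime element are assumed, not shown. -->
<runtime>
  <!-- ... existing entries left unchanged ... -->

  <!-- Option 1: tell Calabash to use tagsoup for p:unescape-markup -->
  <system-property name="com.xmlcalabash.html-parser" value="tagsoup"/>

  <!-- Option 2 (an alternative to option 1): copy htmlparser-1.3.1.jar
       into [oXygen]/lib/xproc/calabash/ and enable this entry instead -->
  <!-- <library name="htmlparser-1.3.1.jar"/> -->
</runtime>
```

Only one of the two entries should be needed. oXygen probably has to be 
restarted for the change to engine.xml to be picked up, though I have 
not verified that.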

Best Regards,
George
--
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

On 1/18/12 2:23 PM, Jostein Austvik Jacobsen wrote:
> I'm trying to use the XProc step p:unescape-markup to parse some HTML,
> but I don't think tagsoup got replaced with htmlparser[1] with the newer
> versions of calabash[2]? At least I can find tagsoup-1.2.jar but not
> htmlparser-1.3.1.jar in my oXygen directory.
>
> Or have I just misconfigured something? (quite likely...)
>
> (using <oXygen/> XML Editor 13.2, build 2012011017)
>
> Regards
> Jostein
>
> [1] http://about.validator.nu/htmlparser/
> [2] http://lists.w3.org/Archives/Public/xproc-dev/2011Oct/0010.html

