Slow Transformation Scenarios in Projects

Posted: Thu Dec 07, 2017 6:35 pm
by bds
Hi all --

When we work with oXygen projects and transform large directories of files, we run into a noticeable slowdown compared with running saxon9he from the command line. This occurs on multiple OSs -- MacOS X, Ubuntu, and FreeBSD. I haven't had an opportunity to test this on Windows.

For example, a directory of 154 files takes almost a minute, while running from the command line takes about 4.5 seconds. It would be nice to have a better understanding of how to make the project execute the transform faster. The transform being applied in both cases is a simple, minimal identity transform.

Is there an option to apply a transformation against a directory, as with the saxon on the command line?

Here is the transform:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- basic identity transform -->
    <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
Here is the project file -- I have stored the transformation scenario in the project:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<project version="19.1">
<meta>
<filters directoryPatterns="" filePatterns="proj-test.xpr" positiveFilePatterns="" showHiddenFiles="false"/>
<options>
<serialized version="19.1" xml:space="preserve">
<serializableOrderedMap>
<entry>
<String>scenario.associations</String>
<scenarioAssociation-array>
<scenarioAssociation>
<field name="url">
<String>sample-mods-154/</String>
</field>
<field name="scenarioIds">
<list>
<String>sample-mods-154-proj</String>
</list>
</field>
<field name="scenarioTypes">
<list>
<String>XSL</String>
</list>
</field>
</scenarioAssociation>
</scenarioAssociation-array>
</entry>
<entry>
<String>scenarios</String>
<scenario-array>
<scenario>
<field name="advancedOptionsMap">
<null/>
</field>
<field name="name">
<String>sample-mods-154-proj</String>
</field>
<field name="baseURL">
<String></String>
</field>
<field name="footerURL">
<String></String>
</field>
<field name="fOPMethod">
<String>pdf</String>
</field>
<field name="fOProcessorName">
<String>Apache FOP</String>
</field>
<field name="headerURL">
<String></String>
</field>
<field name="inputXSLURL">
<String>file:/home/bridger/Documents/xslt-stuff/oxygen-project-test/basic-identity-transform.xsl</String>
</field>
<field name="inputXMLURL">
<String>${currentFileURL}</String>
</field>
<field name="defaultScenario">
<Boolean>false</Boolean>
</field>
<field name="isFOPPerforming">
<Boolean>false</Boolean>
</field>
<field name="type">
<String>XSL</String>
</field>
<field name="saveAs">
<Boolean>true</Boolean>
</field>
<field name="openInBrowser">
<Boolean>false</Boolean>
</field>
<field name="outputFile">
<File>${pd}/sample-mods-154-out-proj/${cfne}</File>
</field>
<field name="outputResource">
<String>${pd}/sample-mods-154-out-proj/${cfne}</String>
</field>
<field name="openOtherLocationInBrowser">
<Boolean>false</Boolean>
</field>
<field name="locationToOpenInBrowserURL">
<null/>
</field>
<field name="openInEditor">
<Boolean>false</Boolean>
</field>
<field name="showInHTMLPane">
<Boolean>false</Boolean>
</field>
<field name="showInXMLPane">
<Boolean>false</Boolean>
</field>
<field name="showInSVGPane">
<Boolean>false</Boolean>
</field>
<field name="showInResultSetPane">
<Boolean>false</Boolean>
</field>
<field name="useXSLTInput">
<Boolean>true</Boolean>
</field>
<field name="xsltParams">
<list/>
</field>
<field name="cascadingStylesheets">
<String-array/>
</field>
<field name="xslTransformer">
<String>Saxon-HE</String>
</field>
<field name="extensionURLs">
<String-array/>
</field>
</scenario>
</scenario-array>
</entry>
</serializableOrderedMap>
</serialized>
</options>
</meta>
<projectTree name="proj-test.xpr">
<folder path="."/>
</projectTree>
</project>
Here's the command line:

Code:

java -jar /home/bridger/src/saxonHE/saxon9he.jar -s:oxygen-project-test/sample-mods-154 -xsl:oxygen-project-test/basic-identity-transform.xsl -o:oxygen-project-test/sample-mods-154-out
I've checked the transformation scenario for the project, and it is not set to display any results, since I know that can slow things down. This example was ~150 files, but we will sometimes transform several thousand files.

Thanks for your time.

Re: Slow Transformation Scenarios in Projects

Posted: Thu Dec 07, 2017 7:06 pm
by adrian
Hello,

This could be a Saxon issue, but requires further investigation.
1. What version of Oxygen are you using (guessing v19.1 which means Saxon 9.7.0.19)?
2. What version of saxon9he are you using in the command line?
3. Do the XML files from the directory have any references to remote resources (DTDs, schemas, included files, etc)?
bds wrote: Is there an option to apply a transformation against a directory, as with the saxon on the command line?
You seem to have already associated the transformation scenario to the directory, so you're already running it that way. The difference is Oxygen runs a separate transformation for each individual file from the directory, but no, there is no support for running a single transformation on a directory as in the command line.

Regards,
Adrian

Re: Slow Transformation Scenarios in Projects

Posted: Thu Dec 07, 2017 7:44 pm
by bds
Hi Adrian - thanks for the reply.
adrian wrote: This could be a Saxon issue, but requires further investigation.
1. What version of Oxygen are you using (guessing v19.1 which means Saxon 9.7.0.19)?

XML Editor 19.1, build 2017102417

adrian wrote: 2. What version of saxon9he are you using in the command line?

Saxon-HE 9.7.0.15J from Saxonica

adrian wrote: 3. Do the XML files from the directory have any references to remote resources (DTDs, schemas, included files, etc)?

They do, but I have DTD and Schema validations turned off in Options > Preferences > XML > XSLT/XSLT-FO/XQuery > XSLT > Saxon > Saxon-HE/PE/EE. Is there another place those options can be set?

adrian wrote: You seem to have already associated the transformation scenario to the directory, so you're already running it that way. The difference is Oxygen runs a separate transformation for each individual file from the directory, but no, there is no support for running a single transformation on a directory as in the command line.

Hm, okay. Is that a feature request, or not on the roadmap at all? Alternately, (and I'll work up a test for this) do you think that executing the transform against a catalog file would speed things up at all? Thanks again for your help!

Best,
Bridger

Re: Slow Transformation Scenarios in Projects

Posted: Thu Dec 07, 2017 9:08 pm
by bds
bds wrote:..snip...
Hm, okay. Is that a feature request, or not on the roadmap at all? Alternately, (and I'll work up a test for this) do you think that executing the transform against a catalog file would speed things up at all? Thanks again for your help!
...snip...
That seems to be a good approach -- speed is comparable to CLI saxon -- we'll just need to work on the transform process.
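
The thread does not show the stylesheet used for this test, so the following is only a minimal sketch of one way to process a whole directory in a single transformation. It assumes Saxon's directory-style collection URIs (the `?select=` query) and a named initial template (for example `-it:main` on the Saxon command line). The parameter names `input-dir` and `output-dir` are hypothetical and not taken from the original posts.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

    <!-- Hypothetical parameter: URI of the input directory,
         e.g. file:///path/to/sample-mods-154 -->
    <xsl:param name="input-dir" required="yes"/>
    <!-- Hypothetical parameter: URI of the output directory, with a trailing slash,
         e.g. file:///path/to/sample-mods-154-out/ -->
    <xsl:param name="output-dir" required="yes"/>

    <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
    <xsl:strip-space elements="*"/>

    <!-- Entry point: load every *.xml file in the directory in one run
         and write one result document per input document. -->
    <xsl:template name="main">
        <xsl:for-each select="collection(concat($input-dir, '?select=*.xml'))">
            <!-- Reuse the input file name (last path segment) for the output file. -->
            <xsl:result-document
                href="{concat($output-dir, tokenize(base-uri(.), '/')[last()])}">
                <xsl:apply-templates select="."/>
            </xsl:result-document>
        </xsl:for-each>
    </xsl:template>

    <!-- Same identity template as in the original stylesheet. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

With this shape the processor starts once and compiles the stylesheet once, which is presumably where the time is saved compared with one transformation per file.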

Re: Slow Transformation Scenarios in Projects

Posted: Wed Dec 13, 2017 7:21 pm
by adrian
bds wrote: They do, but I have DTD and Schema validations turned off in Options > Preferences > XML > XSLT/XSLT-FO/XQuery > XSLT > Saxon > Saxon-HE/PE/EE. Is there another place those options can be set?
As you have observed, turning validation off is not enough, especially for a DTD, which is considered part of the XML document and is still read by the parser. Remote DTDs and schemas need to be resolved through XML catalogs.

Hopefully that clarifies the issue.

Regards,
Adrian
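
For readers who have not set one up before, resolving remote DTDs and schemas through an XML catalog could look roughly like the sketch below. It uses the standard OASIS XML Catalog format; every identifier and path in it (the `example.org` URLs and the `local-copies/` directory) is a hypothetical placeholder, since the thread does not say which remote resources the files reference.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- Hypothetical: map one remote DTD system identifier to a local copy. -->
    <system systemId="http://example.org/dtd/sample.dtd"
            uri="local-copies/sample.dtd"/>

    <!-- Hypothetical: redirect everything under a remote prefix to a local
         directory, both for system identifiers and for plain URI references
         such as schema locations. -->
    <rewriteSystem systemIdStartString="http://example.org/schemas/"
                   rewritePrefix="local-copies/schemas/"/>
    <rewriteURI uriStartString="http://example.org/schemas/"
                rewritePrefix="local-copies/schemas/"/>
</catalog>
```

Relative `uri` and `rewritePrefix` values are resolved against the location of the catalog file itself. The catalog then has to be registered with the tool doing the parsing; in Oxygen this is done in the XML Catalog preferences, though the exact menu path may differ between versions.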