XSLT transformation as batch operation - performance
Oxygen general issues.
-
- Posts: 14
- Joined: Tue Jan 19, 2016 5:55 pm
Hi all,
I have read the topics in the thread "XSLT in batch" and I was glad to learn something about XML Refactoring.
We are building a DTD migration XSLT that must ultimately be applied to thousands of XML documents distributed across a folder structure. Right-clicking the folder and using the Transform > Transform with command is an easy solution, at least for testing the XSLT on a certain number of files. (BTW, we cannot use ${cf} as a target, because we need all files intact for access during the processing of later files.)
I am not entirely satisfied with the performance of this batch transformation, as I am under the impression that Oxygen parses the XSLT anew for every file found. Is that correct?
Are there ways to cache the parsed XSLT?
Or, are there ways using Saxon EE to create and execute an SEF file?
Or, do you have other suggestions for increasing batch performance? (Apart from making sure the XSLT is well-written.)
Before I have one of our developers write a simple transformation runner Java application, I am wondering if I already get the maximum performance from Oxygen or if there are options to enhance this. We would not blame OxygenXML if this use case is not supported, I just don't want to miss a feature. There is so much to learn…
Thanks a lot for your time,
- Michael
-
- Posts: 9436
- Joined: Fri Jul 09, 2004 5:18 pm
Re: XSLT transformation as batch operation - performance
Hi Michael,
Indeed Oxygen does not reuse the parsed XSLT stylesheet between transformations. So the XSLT is parsed and compiled each time. We had some plans to reuse the parsed XSLT and I will try to increase the internal issue's priority based on your feedback.
About this remark:
"Or, do you have other suggestions for increasing batch performance? (Apart from making sure the XSLT is well-written.)"
Maybe your XSLT could itself read the XML documents from the folder using the XSLT 2.0 "collection" function and then use the "result-document" functionality to process and write them to disk. So the XSLT itself would do the batch processing operation.
Or you could indeed compile the XSLT to an "SEF" file (in Oxygen 19.0 you can find a specific action for this in the Tools menu).
You can then use the "sef" file as the "XSLT" file in the transformation scenario dialog; it should work when transforming with Saxon PE or EE. I'm not sure about the performance increase, but you could try this.
Or you could try to create an Ant build file and use the <xslt> task to convert each of the XML documents:
https://ant.apache.org/manual/Tasks/style.html
From what I remember the <xslt> task by default reuses the XSLT transformer between runs.
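For comparison, the "simple transformation runner" idea from the question could look roughly like the sketch below. It uses the standard JAXP API: the stylesheet is parsed and compiled once into a `Templates` object, and a cheap `Transformer` is created from it for each file. The class name `BatchTransform`, the method names, and the `.xml` filter are assumptions made for illustration only; this is not how Oxygen works internally. Note that the processor built into the JDK only supports XSLT 1.0, so for an XSLT 2.0 stylesheet Saxon would have to be on the classpath, where the factory lookup should normally pick it up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical runner: compile the stylesheet once, apply it to every XML file in a folder tree.
public class BatchTransform {

    // Parse and compile the stylesheet a single time; the Templates object can be reused.
    public static Templates compile(Path stylesheet) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(stylesheet.toFile()));
    }

    // Transform every *.xml file under inputDir into the same relative path under outputDir.
    // The input files are never overwritten, so later files can still read the originals.
    public static int transformAll(Templates templates, Path inputDir, Path outputDir)
            throws IOException, TransformerException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(inputDir)) {
            sources = walk.filter(Files::isRegularFile)
                          .filter(p -> p.getFileName().toString().endsWith(".xml"))
                          .collect(Collectors.toList());
        }
        for (Path source : sources) {
            Path target = outputDir.resolve(inputDir.relativize(source));
            Files.createDirectories(target.getParent());
            // Creating a Transformer from compiled Templates does not re-parse the XSLT.
            Transformer transformer = templates.newTransformer();
            transformer.transform(new StreamSource(source.toFile()),
                                  new StreamResult(target.toFile()));
        }
        return sources.size();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("Usage: BatchTransform <stylesheet> <inputDir> <outputDir>");
            return;
        }
        Templates templates = compile(Paths.get(args[0]));
        int count = transformAll(templates, Paths.get(args[1]), Paths.get(args[2]));
        System.out.println("Transformed " + count + " file(s)");
    }
}
```

Writing to a separate output folder mirrors the requirement that all source files stay intact while later files are processed. Whether this is actually faster than the Oxygen batch action would have to be measured on the real document set.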
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com