Batch validation minor annoyances
-
- Posts: 42
- Joined: Mon Aug 18, 2014 11:50 pm
Batch validation minor annoyances
After upgrading from oXygen 17.1 to 18.1, a few new validation checks flagged problems in our DITA content, so I started using the batch validation to clean up our content a bit. Here are a few very minor quirks about the batch validation I just wanted to point out.
Platform: Standalone - Windows 32-bit
Version : <oXygen/> XML Author 18.1, build 2017020917
Java Ver: Java SE 8u102
- I had opened our project xpr file via a lowercase URL path. This caused every single reference to produce a validation error of "...incorrect path capitalization.." or "File not found...". While minor, it resulted in over 160k validation errors that went away when I opened the xpr with the correctly cased path.
- There is no way of knowing the batch validation progress, and it takes my PC over an hour to run through all our files. I couldn't even detect any logic in the file order it was scanning to guesstimate how much longer it would take. It would be nice if it at least counted the files first and then used that count to set the progress bar position.
- After waiting over an hour, I tried to save the ~200k validation results to a text file and an XML file. Sadly, oXygen kept running out of memory when I tried to do that. After fixing the xpr path issue and getting the validation issues under control, it worked. Still, it was kind of annoying the first time to have no way to export results that took an hour to generate.
- There is no way to automatically filter and exclude some messages from the batch scan. We have a plug-in that has a ValidationProblemsFilter listener for the current editor, but there does not appear to be any way to attach that to the batch validation routine. There are a few errors, such as the ones complaining about "-dita-use-conref-target" attribute values, that are not real issues and should be filtered out.
- I added a ValidationProblemsFilter to the map editor to exclude a few messages. It correctly filters the problems, but the problems view panel still opens automatically and is empty when all messages are filtered.
- When renaming a colname in the colspec of a table in author view, the references to the old colname in the table are not updated. It would be nice if oXygen automatically updated the colname references in the table as well. I could probably incorporate this into our plug-in, but it seems like something oXygen would do automatically.
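As a rough illustration of the count-first idea from the progress item above, here is a minimal Java sketch. It is not how oXygen's batch validation works; the class `BatchProgressSketch` and its method names are hypothetical, and the "validator" is just a callback supplied by the caller.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.IntConsumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: count the files first, then report progress against that total.
class BatchProgressSketch {

    /** First pass: collect every regular file under the root so the total is known up front. */
    static List<Path> collectFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
    }

    /** Percentage of work done, clamped to 0..100; an empty batch counts as complete. */
    static int percentDone(int processed, int total) {
        if (total <= 0) {
            return 100;
        }
        return Math.max(0, Math.min(100, (int) ((100L * processed) / total)));
    }

    /** Second pass: hand each file to the validator and report the percentage after each one. */
    static void runBatch(List<Path> files, Consumer<Path> validator, IntConsumer progress) {
        int done = 0;
        for (Path file : files) {
            validator.accept(file);
            progress.accept(percentDone(++done, files.size()));
        }
    }
}
```

The trade-off is that the counting pass itself has to walk the whole tree before any validation starts, which can be slow when a project links a very large folder.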
-
- Posts: 9446
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Batch validation minor annoyances
Hi,
These are a lot of good observations, please see some answers below:
Quote:
> I had opened our project xpr file by lowercase URL path. This caused every single reference to result in a validation error of "...incorrect path capitalization.." or "File not found...". While minor, it results in over 160k validation errors that went away when I opened the xpr with correct case path.
I will try to reproduce this on my side. So the project references a folder which contains the DITA resources, right? Why did you not open the main DITA Map in the DITA Maps Manager and use the "Validate and check for completeness" action? It shows more problems than just validating each topic individually...
Quote:
> No way of knowing the batch validation progress, it takes my PC over an hour to run through all our files. I couldn't even detect any logic in the file order it was scanning to guesstimate how much longer it would take. It would be nice if it at least counted the files first and then used that count to indicate the progress bar position.
We also do not have a good way to know how much this will take. I think that at some point we counted all files and encountered a situation in which someone linked their entire hard drive in the project. So counting all the files lasted a long time... but I understand where you stand and I will add an internal issue, maybe we can better report how long this will take.
Quote:
> After waiting over an hour, I went to try and save the ~200k validation results to text and xml file. Sadly, oXygen kept running out of memory when I tried to do that. After fixing the xpr path issue though, and getting the validation issues under control, it worked. Kinda annoying though the first time not having any way to export the results that took an hour to generate.
I will add an internal issue to test if our batch validation properly releases memory in the end. So these were mostly DITA topics/maps/concepts and so on, right? Were they DTD, XML Schema, or RNG based?
Quote:
> No way to automatically filter and exclude some messages from batch scan. We have a plug-in that has a ValidationProblemsFilter listener for the current editor, but there does not appear to be any way to attach that to the batch validation routine. There are a few errors such as the ones complaining about "-dita-use-conref-target" attribute values that are not issues and should not be filtered.
I see below that you managed to get the filter API working. Could you tell us about those particular issues that Oxygen should not report anymore? Maybe we can avoid reporting them on our side if they are nonsense...
Quote:
> I added a ValidationProblemsFilter to the map editor to exclude a few messages. It correctly filters the probables, however the problem view panel automatically opens up and is empty when all messages are filtered.
I will add a filter on my side and test this.
Quote:
> When renaming a colname in the colspec of a table in author view, all the references to the old colname in the table are not updated. It would be nice if oXygen automatically updated the colname references in the table as well. I could probably incorporate this into our plug-in but seems like something oXygen would do automatically.
Yes, we do not yet have this feature, we have plans for this and I fully agree it would be a good improvement. I will add your contact details to the opened internal issue.
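Until such a feature exists, the colname rename described above could be approximated in a plug-in along these lines. This is only a sketch on a plain W3C DOM, not Oxygen's Author API; the class `ColnameRenameSketch` and its methods are hypothetical, and it assumes the references live in the `colname`, `namest` and `nameend` attributes of `colspec` and `entry` elements inside a single `tgroup`.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: rename a column and every reference to it within one tgroup.
class ColnameRenameSketch {

    // Attributes assumed to carry a column name (definition or reference).
    private static final String[] REF_ATTRS = {"colname", "namest", "nameend"};

    /** Parses an XML string into a DOM document; wraps checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Replaces oldName with newName on colspec and entry elements; returns the number of attributes changed. */
    static int renameColumn(Element tgroup, String oldName, String newName) {
        int changed = 0;
        NodeList all = tgroup.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            String tag = el.getTagName();
            if (!tag.equals("colspec") && !tag.equals("entry")) {
                continue;
            }
            for (String attr : REF_ATTRS) {
                if (el.hasAttribute(attr) && oldName.equals(el.getAttribute(attr))) {
                    el.setAttribute(attr, newName);
                    changed++;
                }
            }
        }
        return changed;
    }
}
```

A caller would pass the `tgroup` element that contains the renamed `colspec`, so that columns with the same name in other tables are left alone.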
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
-
- Posts: 42
- Joined: Mon Aug 18, 2014 11:50 pm
Re: Batch validation minor annoyances
Hi, thanks for the quick and detailed reply.
Radu wrote:
> I will try to reproduce this on my side. So the project references a folder which contains the DITA resources, right? Why did you not open the main DITA Map in the DITA Maps Manager and use the "Validate and check for completeness" action? It shows more problems than just validating each topic individually...
We don't have a main map and some files may be orphaned. I suppose I could have listed all our DITA files and made a master map and checked that.
Radu wrote:
> We also do not have a good way to know how much this will take. I think that at some point we counted all files and encountered a situation in which someone linked their entire harddrive in the project. So counting all the files lasted a long time... but I understand where you stand and I will add an internal issue, maybe we can better report how long this will take.
Makes sense now, although the check would eventually have to go through all the files and folders anyway, right?
Radu wrote:
> I will add an internal issue to test if our batch validation properly releases memory in the end. So these were mostly DITA topics/maps/concepts and so on right? Were they DTD, XML Schema, or RNG based?
Yeah, just topics and maps. I think it properly released the memory and everything; the memory usage indicator in the tray was normal. It just seemed to be the "save as text/xml" routines that were struggling with over 200k results the first time around (mainly due to the filename case issue).
Radu wrote:
> I see below that you managed to get the filter API working. Could you tell us about those particular issues that Oxygen should not report anymore? Maybe we can avoid reporting them on our side if they are nonsense...
The most frequent issue is "Element is not a content reference but has attribute "(cols|status)" with value "-dita-use-conref-target"." The DITA source looks like the following:
Code: Select all
<table conref="file2.dita#ref/table_abc">
<title></title>
<tgroup cols="-dita-use-conref-target">
....
I suppose we could just hardcode the col number, but I'm not sure what would happen if the source table's column count changed.
Thanks again.
-
- Posts: 9446
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Batch validation minor annoyances
Hi,
So:
Quote:
> We don't have a main map and some files may be orphaned. I suppose I could have listed all our DITA files and made a master map and checked that.
The "Validate and check for completeness" done on a DITA Map does a much faster validation of all the topics referenced in the map (or submaps) than batch validating each topic.
This happens because, when you batch validate topics, each topic is validated with its associated validation scenario, which also includes some Schematron checks, and Schematron checks are quite time-consuming. When you use "Validate and check for completeness" you can also specify a Schematron file to use for validation, but by default validation is done only against the DTD/Schema plus various other rules.
Actually, here is a list of what "Validate and check for completeness" does:
http://blog.oxygenxml.com/2015/12/dita- ... k-for.html
and it's very fast: on a fast computer it can validate about 1000-2000 topics and maps in 20 seconds.
Indeed it does not validate orphan resources. But I guess you don't really care much about those anyway.
Quote:
> Yeah, just topics and maps. i think it properly released the memory and everything, the memory usage indicator in the tray was normal. It just seemed to be the "save as text/xml" routines that were struggling with over 200k results the first time around (mainly due to filename case issue)
I think that the out of memory occurred in your case more because of the many errors that were reported. But we will do more tests on our side.
Quote:
> Most frequent issue is "Element is not a content reference but has attribute "(cols|status)" with value "-dita-use-conref-target"."
The specs:
http://docs.oasis-open.org/dita/v1.2/os ... argetvalue
seem to state that this special "-dita-use-conref-target" value can be used only on the element which has the @conref attribute, so I think that Oxygen is correct in signaling this as a problem (although the problem is benign).
Quote:
> I supose we could just hardcode the col number but not sure what would happen if the source table co num changed.
If you use Oxygen's "Reuse Content" action to insert a conref to a table, Oxygen generates this DITA content:
Code: Select all
<table conref="#introduction/tableID">
<tgroup cols="1">
<tbody>
<row>
<entry/>
</row>
</tbody>
</tgroup>
</table>
So indeed Oxygen hard codes the value "1" for the @cols attribute. That "tgroup" basically needs to be added there because it is required by the DTDs, but it will not be used for anything as long as the @conref resolves to a proper table. So it is just a fallback element in case the conref does not resolve, and it is also required by validation.
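To locate the affected elements in existing content before changing them, something like the following sketch could help. It only mirrors the reading of the spec given above (the special value is accepted only on the element that itself carries @conref); it is not Oxygen's actual check. The class `ConrefTargetCheckSketch` and its methods are hypothetical, and only @conref is considered, not other reuse attributes.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: list attributes set to "-dita-use-conref-target" on elements without @conref.
class ConrefTargetCheckSketch {

    static final String MARKER = "-dita-use-conref-target";

    /** Parses an XML string into a DOM document; wraps checked exceptions for brevity. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    /** Returns "element/@attribute" for every misplaced marker value at or below root. */
    static List<String> findMisplacedMarkers(Element root) {
        List<String> hits = new ArrayList<>();
        collect(root, hits);
        NodeList all = root.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            collect((Element) all.item(i), hits);
        }
        return hits;
    }

    private static void collect(Element el, List<String> hits) {
        if (el.hasAttribute("conref")) {
            return; // the marker is accepted on the conref element itself
        }
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            if (MARKER.equals(attr.getNodeValue())) {
                hits.add(el.getTagName() + "/@" + attr.getNodeName());
            }
        }
    }
}
```

Each hit could then be replaced by a concrete fallback value, as in the generated content above.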
Regards,
Radu
-
- Posts: 9446
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Batch validation minor annoyances
Hi,
I had some more time to look into this:
Quote:
> I added a ValidationProblemsFilter to the map editor to exclude a few messages. It correctly filters the probables, however the problem view panel automatically opens up and is empty when all messages are filtered.
So you are using Oxygen 18.1, right?
In the Oxygen Help menu->About dialog there is also a "Build ID" value, something with a pattern like "yyyymmddhh". Could you tell me what that value is on your side?
I constructed a validation problems filter which removes all reported problems but I cannot reproduce this problem on my side, could you give me some Java sample code with what your plugin does?
Regards,
Radu
-
- Posts: 42
- Joined: Mon Aug 18, 2014 11:50 pm
Re: Batch validation minor annoyances
Radu wrote:
> So you are using Oxygen 18.1, right?
> In the Oxygen Help menu->About dialog there is also a "Build ID" value, something with a pattern like "yyyymmddhh". Could you tell me what that value is on your side?
> I constructed a validation problems filter which removes all reported problems but I cannot reproduce this problem on my side, could you give me some Java sample code with what your plugin does?
Sure, I think I mentioned that info in my first post. Is this what you are looking for?
Platform: Standalone - Windows 32-bit
Version : <oXygen/> XML Author 18.1, build 2017020917
Java Ver: Java SE 8u102
It's a very minor thing, so I wouldn't spend a lot of time on it. It looks like I can prevent the panel from popping up by turning off automatic DITA validation for DITA maps.
Here is an example map:
Code: Select all
<?xml version="1.0"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
<topicref href="example.xml#ref/section_2"/>
</map>
Code: Select all
...
// Drop the problem when its message matches the one we want to suppress.
if (msg.indexOf("Topic references should only be made to topic IDs") > -1) {
    iterator.remove();
}
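For readers without the plug-in source, the filtering logic in the fragment above can be sketched as a standalone class. Plain strings stand in for the problem objects that the real `ValidationProblemsFilter.filterValidationProblems(ValidationProblems)` callback receives, so no oXygen SDK classes are used here; the class `ProblemFilterSketch`, its methods and the second suppressed substring are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: remove validation messages that contain a suppressed substring.
class ProblemFilterSketch {

    // First entry is taken from the fragment above; the second is an assumed addition
    // based on the "-dita-use-conref-target" messages discussed earlier in the thread.
    static final List<String> SUPPRESSED = Arrays.asList(
            "Topic references should only be made to topic IDs",
            "-dita-use-conref-target");

    /** True when the message contains any of the suppressed substrings. */
    static boolean isSuppressed(String msg) {
        if (msg == null) {
            return false;
        }
        for (String needle : SUPPRESSED) {
            if (msg.contains(needle)) {
                return true;
            }
        }
        return false;
    }

    /** Removes suppressed messages from the list in place and returns how many were removed. */
    static int filter(List<String> messages) {
        int removed = 0;
        for (Iterator<String> it = messages.iterator(); it.hasNext();) {
            if (isSuppressed(it.next())) {
                it.remove(); // safe removal while iterating
                removed++;
            }
        }
        return removed;
    }
}
```

The list passed to `filter` must be mutable, since removal goes through the iterator.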
-
- Posts: 9446
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Batch validation minor annoyances
Hi,
Does this problem with the empty problems list occur when the DITA Map is opened in the DITA Maps Manager view or in the main editor?
If possible could you post your entire code from the implementation of "ValidationProblemsFilter.filterValidationProblems(ValidationProblems)"?
Regards,
Radu