Hi,
I'd like to be able to update documents with XQuery Update and keep the DTD declaration.
The DTD declaration (including entities, etc.) gets stripped when I apply an update. I notice that this happens even when I just read the document into a variable and return it, i.e. fn:doc('my_DTD_disappears.xml').
Any help is greatly appreciated. I've heard that the parser doesn't consider the doctype declaration to be part of the document, but if it interprets it, then surely it must be able to "reinstate" it?
I'd rather not resort to brute-force options like wrapping the whole thing in an element, updating, then unwrapping.
Thanks,
Colin
xquery options for handling / persisting DTD declarations?
Re: xquery options for handling / persisting DTD declarations?
Hi,
During XML parsing, the DTD "melds" with the XML document to form a single XML model. That model does not record which parts came from the DTD file and which from the XML file (the model treats this information as irrelevant), so the reverse procedure, regenerating the XML plus DTD from the model, is impractical.
Things can actually get worse than just a missing DOCTYPE declaration. If you have attributes with default values declared in the DTD, these attributes will appear in the resulting XML even though they weren't in the XML source file.
e.g. attribute contr for element person has the default value false:
DTD:
<!ATTLIST person contr (true|false) 'false'>
Source XML:
<person id="one.worker">
Result XML after XQuery update:
<person id="one.worker" contr="false">
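The same effect can be reproduced outside XQuery. Below is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree with its default expat settings; the staff wrapper element and the round_trip helper are made up for illustration. The parser applies the default from the internal DTD subset, and the serializer writes no DOCTYPE, because the tree it builds has no place to store one.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the ATTLIST is the one from this post,
# the <staff> wrapper is an assumption made for the example.
SOURCE = """<!DOCTYPE staff [
<!ATTLIST person contr (true|false) 'false'>
]>
<staff><person id="one.worker"/></staff>"""


def round_trip(xml_text):
    """Parse and re-serialize without changing anything in the tree."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


result = round_trip(SOURCE)
# The DOCTYPE should be missing and contr="false" should now be explicit.
print(result)
```

No update was applied here at all, which matches what Colin saw with a plain fn:doc() call: the loss happens at parse time, not in the update step.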
Solutions:
1. Don't use XQuery update. Instead, use a text search tool that is also XML aware, e.g. Oxygen's Find/Replace tools with XPath.
2. If you must use XQuery update (or want things automated), add a post-processing step (apply an XSLT stylesheet) that puts things back in place and cleans up the result. This means putting the DOCTYPE back and removing any redundant attributes.
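The post-processing idea in option 2 can be sketched roughly as follows. This uses Python rather than XSLT, purely as an illustration. The function names, the defaults table and the regular expression are all assumptions, not part of any tool. The regex is naive: it would break on an internal subset that contains a "]" inside a quoted value. The cleanup also cannot tell a defaulted attribute from one the author typed explicitly with the same value, so it removes both.

```python
import re
import xml.etree.ElementTree as ET

# Naive pattern: DOCTYPE with an optional [internal subset].
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^\[>]*(?:\[.*?\]\s*)?>", re.DOTALL)


def extract_doctype(xml_text):
    """Return the DOCTYPE declaration as text, or None if there is none."""
    match = DOCTYPE_RE.search(xml_text)
    return match.group(0) if match else None


def strip_defaults(root, defaults):
    """Remove attributes whose value equals the DTD default.

    defaults maps (element name, attribute name) to the default value.
    """
    for elem in root.iter():
        for (tag, attr), value in defaults.items():
            if elem.tag == tag and elem.get(attr) == value:
                del elem.attrib[attr]


def postprocess(original_text, updated_text, defaults):
    """Put the original DOCTYPE back on top of the cleaned-up update result."""
    doctype = extract_doctype(original_text)
    root = ET.fromstring(updated_text)
    strip_defaults(root, defaults)
    body = ET.tostring(root, encoding="unicode")
    return f"{doctype}\n{body}" if doctype else body


original = """<!DOCTYPE staff [
<!ATTLIST person contr (true|false) 'false'>
]>
<staff><person id="one.worker">x</person></staff>"""
# What an update step might hand back: no DOCTYPE, default made explicit.
updated = '<staff><person id="one.worker" contr="false">x</person></staff>'

fixed = postprocess(original, updated, {("person", "contr"): "false"})
print(fixed)
```

Entity references are not restored by this sketch; if the documents rely on them, the cleanup step would need to handle that separately.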
Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Re: xquery options for handling / persisting DTD declarations?
Thanks very much for the explanation!
The task I'm looking at right now involves a long list of unique updates, so I don't think Oxygen's search/replace can be used this way, i.e. scripted. It will probably end up being done with bash/perl/python etc., but it would be nice not to treat the XML like text...
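For what it's worth, one way to script a list of updates without treating the XML as text might look like the sketch below. It assumes Python's standard xml.dom.minidom, which keeps the DOCTYPE as a node in the document and writes it back out on serialization. The apply_updates function, the staff.dtd name, the role attribute and the (id, attribute, value) update format are all made up for the example. The DOCTYPE comes back with different quoting and spacing than the source, and this has not been checked against documents that use entities.

```python
from xml.dom import minidom


def apply_updates(xml_text, updates):
    """Apply (person id, attribute name, new value) updates through the DOM."""
    doc = minidom.parseString(xml_text)
    for person_id, attr, value in updates:
        for person in doc.getElementsByTagName("person"):
            if person.getAttribute("id") == person_id:
                person.setAttribute(attr, value)
    # toxml() serializes the whole document, including the doctype node.
    return doc.toxml()


# Hypothetical input; staff.dtd is never opened by this code.
source = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE staff SYSTEM "staff.dtd">\n'
    '<staff><person id="one.worker"/><person id="two.worker"/></staff>'
)

out = apply_updates(source, [("one.worker", "role", "editor")])
print(out)
```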
Thanks again,
Colin