xquery options for handling / persisting DTD declarations?

Posted: Tue Jul 10, 2012 11:10 pm
by cmcenearney

I'd like to be able to update documents with XQuery Update and keep the DTD declaration.

The DTD declaration (including entities, etc.) gets stripped when I apply an update. I notice that this happens even when I just read the document into a variable and return it, i.e. fn:doc('my_DTD_disappears.xml').

Any help is greatly appreciated. I've heard that the parser doesn't consider the doctype declaration to be part of the document, but if it interprets it, then surely it must be able to "reinstate" it?

I'd rather not resort to brute-force options like wrapping the whole thing in an element, updating, then unwrapping.


Re: xquery options for handling / persisting DTD declarations?

Posted: Wed Jul 11, 2012 3:15 pm
by adrian

During XML parsing, the DTD "melds" with the XML document to form a single XML model. The model does not record which parts came from the DTD and which from the XML file (that information is considered irrelevant to the model), so the reverse procedure, regenerating the XML+DTD from the model, is impractical.

Things can actually get worse than just a missing DOCTYPE declaration. If you have attributes with default values declared in the DTD, these attributes will appear in the resulting XML even though they weren't in the XML source file.

For example, suppose the DTD declares the attribute contr on element person with the default value false:

<!ATTLIST person contr (true|false) 'false'>

Source XML:

<person id="one.worker">

Result XML after XQuery update:

<person id="one.worker" contr="false">
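The same effect can be reproduced outside an XQuery processor. The following is only a minimal sketch using Python's standard-library ElementTree parser (an assumption for illustration, not the tool discussed in this thread); the DOCTYPE wrapper around the ATTLIST and the self-closed person element are added here to make the fragment parseable.

```python
import xml.etree.ElementTree as ET

# The ATTLIST from the example above, placed in an internal DTD subset.
SOURCE = """<!DOCTYPE person [
<!ATTLIST person contr (true|false) 'false'>
]>
<person id="one.worker"/>"""

# The parser reads the internal subset and merges the default into the element.
root = ET.fromstring(SOURCE)

# Serializing the parsed model: the DOCTYPE is not part of it, the default is.
result = ET.tostring(root, encoding="unicode")
print(result)
```

The printed element should carry contr="false" even though the source never specified it, and there should be no DOCTYPE in the output, because the parsed model keeps neither the declaration nor the origin of the attribute.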

There are two ways around this:

1. Don't use XQuery update. Use a text search tool that is also XML-aware, e.g. Oxygen's Find/Replace tools with XPath.

2. If you must use XQuery update (or want things automated), add a post-processing step (apply an XSLT stylesheet) that puts things back in place and cleans up the result. This means putting the DOCTYPE back and removing any redundant attributes.
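As a rough illustration of option 2, here is a sketch of the "put the DOCTYPE back and drop redundant attributes" idea. It is written in Python with ElementTree rather than as an XSLT stylesheet, so it is a stand-in for the approach, not the stylesheet itself. The regular expression, the hand-written DTD_DEFAULTS table, the rename update and the name child element are all hypothetical and would need adapting to real documents.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the DOCTYPE has at most one internal subset and no "]" inside it.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+[^\[>]*(?:\[.*?\]\s*)?>", re.DOTALL)

# Hypothetical table of DTD defaults, written by hand: {element: {attribute: default}}.
DTD_DEFAULTS = {"person": {"contr": "false"}}


def update_keeping_doctype(source_text, update):
    """Apply update(root) and return the XML with the original DOCTYPE restored."""
    # Copy the DOCTYPE verbatim from the source text before parsing discards it.
    match = DOCTYPE_RE.search(source_text)
    doctype = match.group(0) + "\n" if match else ""

    root = ET.fromstring(source_text)
    update(root)

    # Drop attributes whose value equals the DTD default; a parser that reads
    # the DTD will supply them again.
    for elem in root.iter():
        for name, default in DTD_DEFAULTS.get(elem.tag, {}).items():
            if elem.get(name) == default:
                del elem.attrib[name]

    return doctype + ET.tostring(root, encoding="unicode")


def rename(root):
    # Example update (hypothetical): change the text of the <name> child.
    root.find("name").text = "New"


SOURCE = """<!DOCTYPE person [
<!ATTLIST person contr (true|false) 'false'>
]>
<person id="one.worker"><name>Old</name></person>"""

restored = update_keeping_doctype(SOURCE, rename)
print(restored)
```

Note that this also removes an attribute that was written explicitly with its default value; that is harmless for a DTD-aware parser but does change the text. Anything else ElementTree does not keep (the XML declaration, comments, entity references) would need similar handling.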


Re: xquery options for handling / persisting DTD declarations?

Posted: Wed Jul 11, 2012 7:00 pm
by cmcenearney

Thanks very much for the explanation!

The task I'm looking at right now involves a long list of unique updates, so I don't think Oxygen's search/replace can be used this way, i.e. scripted. Using bash/perl/python etc. is what will happen, but it would be nice not to treat the XML like text...

Thanks again,