xquery options for handling / persisting DTD declarations?

Issues related to W3C XQuery.
cmcenearney
Posts: 6
Joined: Thu Dec 01, 2011 6:44 pm

Post by cmcenearney »

Hi,

I'd like to be able to update documents with XQuery Update and keep the DTD declaration.

The DTD declaration (including entities, etc.) gets stripped when I apply an update, and I notice that this happens even when I just read the document into a variable and return it, i.e. fn:doc('my_DTD_disappears.xml').

Any help is greatly appreciated. I've heard that the parser doesn't consider the DOCTYPE declaration to be part of the document, but if it interprets the declaration, then surely it must be able to "reinstate" it?

I'd rather not resort to brute-force options like wrapping the whole thing in an element, updating, then unwrapping.

Thanks,
Colin
adrian
Posts: 2855
Joined: Tue May 17, 2005 4:01 pm

Re: xquery options for handling / persisting DTD declarations?

Post by adrian »

Hi,

During XML parsing, the DTD "melds" with the XML document to form a single XML model. The problem is that this model does not record which parts came from the DTD and which from the XML file (the distinction is considered irrelevant to the model), which makes the reverse procedure, regenerating the XML plus DTD from the model, impractical.

Things can actually get worse than just a missing DOCTYPE declaration. If you have attributes with default values declared in the DTD, these attributes will appear in the resulting XML even though they weren't in the XML source file.

e.g. attribute contr for element person has the default value false:

<!ATTLIST person contr (true|false) 'false'>

Source XML:

<person id="one.worker">

Result XML after XQuery update:

<person id="one.worker" contr="false">
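
The same merge can be observed outside XQuery. Below is a minimal sketch using Python's standard expat binding, not the XQuery processor itself; the <people> wrapper element and the extra declarations in the internal subset are made up for illustration, and only the ATTLIST line for contr comes from the example above. By default expat reports attributes defaulted from the DTD alongside the ones actually written in the start tag; setting specified_attributes switches that off.

```python
import xml.parsers.expat

# Hypothetical sample document: the <people> wrapper and the ELEMENT/id
# declarations are assumptions; the contr ATTLIST is the one from the post.
SOURCE = """<!DOCTYPE people [
<!ELEMENT people (person*)>
<!ELEMENT person EMPTY>
<!ATTLIST person id CDATA #REQUIRED>
<!ATTLIST person contr (true|false) 'false'>
]>
<people><person id="one.worker"/></people>"""


def parse_attributes(xml_text, specified_only):
    """Return {element name: attribute dict} as reported by expat."""
    seen = {}
    parser = xml.parsers.expat.ParserCreate()
    # When true, expat reports only attributes written in the document
    # instance, not those derived from ATTLIST defaults.
    parser.specified_attributes = specified_only

    def start(name, attrs):
        seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return seen


# What a model built after DTD processing sees for <person>.
merged = parse_attributes(SOURCE, specified_only=False)
# What was literally written in the source start tag.
as_written = parse_attributes(SOURCE, specified_only=True)

print(merged["person"])
print(as_written["person"])
```

The first dictionary includes contr, the second does not. Once the defaulted attribute is in the model, a serializer that writes the model back out has no reason to leave it off.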

Solutions:
1. Don't use XQuery update; use a text search tool that is also XML-aware, e.g. Oxygen's Find/Replace tools with XPath.

2. If you must use XQuery update (or want things automated), add a post-processing step (apply an XSLT stylesheet) that puts things back in place and cleans up the result. This means putting the DOCTYPE back and removing any redundant attributes.
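
As a rough illustration of option 2, here is a sketch of such a clean-up step written in Python as a stand-in for the XSLT stylesheet. All function names are hypothetical. It assumes the DOCTYPE can be located in the original text with a regular expression (no "]" followed by ">" inside comments or quoted values in the internal subset), that the list of DTD defaults is supplied by hand, and that the updated output contains no unexpanded entity references.

```python
import re
import xml.etree.ElementTree as ET

# DOCTYPE declaration with an optional internal subset.
# Assumption: the first "]" followed by ">" really closes the subset.
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^\[>]*(?:\[.*?\]\s*)?>", re.DOTALL)


def strip_default_attributes(updated_text, defaults):
    """Drop attributes whose value equals the DTD default.

    defaults maps (element name, attribute name) -> default value.
    Note: ElementTree keeps only the root element's subtree, so an XML
    declaration or comments outside the root are not preserved here.
    """
    root = ET.fromstring(updated_text)
    for element in root.iter():
        for (tag, attr), value in defaults.items():
            if element.tag == tag and element.get(attr) == value:
                del element.attrib[attr]
    return ET.tostring(root, encoding="unicode")


def restore_doctype(original_text, updated_text):
    """Copy the DOCTYPE from original_text in front of updated_text's root."""
    match = DOCTYPE_RE.search(original_text)
    if match is None or DOCTYPE_RE.search(updated_text):
        return updated_text  # nothing to restore, or already present
    # An XML declaration, if present, must stay ahead of the DOCTYPE.
    decl = re.match(r"\s*<\?xml[^>]*\?>\s*", updated_text)
    head = decl.group(0) if decl else ""
    return head + match.group(0) + "\n" + updated_text[len(head):]


# Hypothetical input: the source file as it was before the update.
ORIGINAL = """<!DOCTYPE people [
<!ATTLIST person contr (true|false) 'false'>
]>
<people><person id="one.worker"/></people>"""

# What an update step might hand back: no DOCTYPE, default made explicit.
UPDATED = '<people><person id="one.worker" contr="false"/></people>'

cleaned = strip_default_attributes(UPDATED, {("person", "contr"): "false"})
result = restore_doctype(ORIGINAL, cleaned)
print(result)
```

Note that this also removes contr="false" where it was written explicitly in the source, which is equivalent as far as the DTD is concerned but not byte-for-byte identical to the original.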

Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
cmcenearney

Re: xquery options for handling / persisting DTD declarations?

Post by cmcenearney »

Thanks very much for the explanation!

The task I'm looking at right now involves a long list of unique updates, so I don't think Oxygen's search/replace can be used this way, i.e. scripted. Using bash/perl/python etc. is what will happen, but it would be nice not to treat the XML like text...

Thanks again,
Colin