[oXygen-user] Custom action to process a document and update an element?

Mark Baker mbaker at analecta.com
Mon Dec 2 08:38:30 CST 2013


One way to preserve entities during a process is to pre-process the file to
double-escape the entity references. That is, transform &lt; into &amp;lt;
before processing and transform them back again afterwards. This was a
standard pattern in OmniMark, where you could easily stream the file through
a filter on the way into and out of the parser. It should be doable in
XSLT 2.0, perhaps a little less elegantly, using the unparsed-text() function. In
*NIX you could do it with pipes on the command line.
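
A minimal sketch of that pre/post filter, in Python only for illustration.
The function names and the regular expression are my own assumptions, not
anything from OmniMark or oXygen, and the "processing" step here is just a
parse and re-serialize with the standard library. It does not special-case
comments, CDATA sections, or entity declarations in an internal DOCTYPE
subset, which a real filter would have to handle.

```python
import re
import xml.etree.ElementTree as ET

# Body of a reference: a character reference (#160, #xA0) or an entity name.
_REF_BODY = r"(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*)"
_REF = re.compile(r"&" + _REF_BODY + r";")
_PROTECTED_REF = re.compile(r"&amp;" + _REF_BODY + r";")


def protect_entities(text):
    """Double-escape every reference: &name; becomes &amp;name;."""
    return _REF.sub(r"&amp;\1;", text)


def restore_entities(text):
    """Undo protect_entities on serialized output: &amp;name; becomes &name;."""
    return _PROTECTED_REF.sub(r"&\1;", text)


def roundtrip(source):
    """Parse and re-serialize a document while keeping its references intact."""
    # After protection the parser sees only &amp;, so undeclared entities
    # such as &nbsp; no longer cause a parse error and nothing is expanded.
    root = ET.fromstring(protect_entities(source))
    # ... a real transformation would modify the tree here ...
    return restore_entities(ET.tostring(root, encoding="unicode"))


if __name__ == "__main__":
    print(roundtrip("<p>a &lt; b &nbsp; c</p>"))
```

Because the serializer writes a literal ampersand back out as &amp;, the
restore step runs on the serialized text, not on the parsed tree.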

Mark

> -----Original Message-----
> From: oxygen-user-bounces at oxygenxml.com [mailto:oxygen-user-
> bounces at oxygenxml.com] On Behalf Of Wendell Piez
> Sent: December 2, 2013 9:22 AM
> To: Alex Jitianu
> Cc: oxygen-user at oxygenxml.com
> Subject: Re: [oXygen-user] Custom action to process a document and update
> an element?
> 
> Hi,
> 
> On Thu, Nov 28, 2013 at 1:45 AM, Alex Jitianu <alex_jitianu at sync.ro>
> wrote:
> > Hello,
> >
> > One aspect of using XQuery Update is that it will discard the DOCTYPE
> > and will also expand entities. So you might not end up with what you've
> > expected. We're investigating ways on our part to compensate for these
> > inherent losses.
> 
> Yes. The same issue arises of course using XSLT.
> 
> In the case of XSLT, I can imagine a post-process that would work by
> performing a (non-XML) parse of the source document to scan for a
> DOCTYPE declaration and for entity references, find the declarations
> for the latter, and generate an XSLT near-identity transformation that
> would serve as a post-process to restore them.
> 
> It wouldn't be perfect, as it would effectively normalize entity
> references, and it would only work for entities that expand to single
> characters. On the other hand it might be good enough for daily wear.
> 
> (I guess it would have to map all declared entities, not only those
> used, since the results could conceivably include content not in the
> source. It would also have to be optional, since there would be cases
> when switching a document type or cleaning up entities might be the
> point of the process.)
> 
> I would be sympathetic to those who felt this sort of complication
> demonstrates that the functionality isn't a good idea in the
> architecture. Certainly, it would be a good way to shoot yourself in
> the foot.
> 
> Cheers, Wendell
> _______________________________________________
> oXygen-user mailing list
> oXygen-user at oxygenxml.com
> http://www.oxygenxml.com/mailman/listinfo/oxygen-user


