[oXygen-user] Bug in Oxygen XPath evaluator for leaf elements containing a comment or a CDATA section

Thu Feb 10 06:51:44 CST 2022

Hi Roger,

I see you also started a discussion on the xml-dev list:

http://lists.xml.org/archives/xml-dev/202202/msg00016.html

Oxygen uses the Saxon XSLT processor to run the XPath, but it builds the 
node structure itself using the Xerces parser, with various features set 
on it. So running an XPath in Oxygen is most of the time, but not always, 
similar to running it inside an XSLT stylesheet.

1) About the first example, two text nodes separated by an XML comment.

> <root>
>     abc<!-- aaa -->def
> </root>
Even running an XSLT stylesheet returns two text nodes:

> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     exclude-result-prefixes="xs"
>     version="2.0">
>     <xsl:template match="root">
>         <xsl:message>cnt <xsl:value-of select="count(text())"/></xsl:message>
>     </xsl:template>
> </xsl:stylesheet>
So our XPath results are consistent with the XSLT stylesheet's.

The XPath result is a sequence of two text nodes. To always get a single 
item back (a string rather than a node), I would probably use this XPath 
instead: "/root/string-join(text(), '')".
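
As a rough illustration of the same idea outside Oxygen, here is a 
minimal Python sketch. It uses the standard library's xml.dom.minidom 
only as a stand-in for a DOM-style tree (it is not what Oxygen or Saxon 
use), the helper name text_children is made up for this example, and the 
sample document is written without the surrounding whitespace. It shows 
the comment splitting the text into two text nodes, and a plain join 
playing the role of string-join(text(), ''):

```python
from xml.dom import minidom

# Same shape as the first example, with the indentation whitespace removed.
XML = "<root>abc<!-- aaa -->def</root>"


def text_children(xml):
    """Return the data of each text-node child of the document element."""
    root = minidom.parseString(xml).documentElement
    return [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]


parts = text_children(XML)   # the comment separates two text nodes
joined = "".join(parts)      # analogous to string-join(text(), '')

print(len(parts), joined)    # prints: 2 abcdef
```

The join produces one string, not one node, which is also what the 
string-join XPath gives you.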

2) About the text and CDATA example:

> <root>
>     abc<![CDATA[cdata]]>def
> </root>
Indeed, the same XSLT stylesheet returns a single text node here, so in 
this case our XPath evaluation (which returns 3 separate nodes) differs 
from how the XPath would be evaluated inside an XSLT engine. It is 
probably different because, to run the XPath and be able to precisely 
locate the nodes, we create the node structure ourselves instead of 
delegating its creation to the Saxon XSLT processor. And in that node 
structure there is a separate node for the CDATA section.
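
To make the difference concrete, here is a small Python sketch, again 
only an analogy: xml.dom.minidom stands in for a DOM-style node 
structure, and the helpers child_kinds and coalesce are hypothetical 
names invented for this example, not Oxygen or Saxon code. The DOM-style 
tree keeps the CDATA section as its own node, while merging adjacent 
text and CDATA children gives the single text node that the XPath data 
model expects:

```python
from xml.dom import Node, minidom

# Same shape as the second example, with the indentation whitespace removed.
XML = "<root>abc<![CDATA[cdata]]>def</root>"


def child_kinds(xml):
    """Return (nodeType, data) for every child of the document element."""
    root = minidom.parseString(xml).documentElement
    return [(n.nodeType, n.data) for n in root.childNodes]


def coalesce(xml):
    """Merge runs of adjacent text and CDATA children into single strings."""
    root = minidom.parseString(xml).documentElement
    merged = []
    prev_textual = False
    for n in root.childNodes:
        textual = n.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
        if textual and prev_textual:
            merged[-1] += n.data      # extend the current run of text
        elif textual:
            merged.append(n.data)     # start a new run of text
        prev_textual = textual        # a comment or element ends the run
    return merged


print(len(child_kinds(XML)))  # prints: 3
print(coalesce(XML))          # prints: ['abccdatadef']
```

Note that a comment still ends a run, so the first example would give 
two strings even after coalescing, which matches the count of 2 reported 
for it.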

Regards,

Radu

Radu Coravu
Oxygen XML Editor

On 2/10/22 00:53, Roger L Costello wrote:
>
> Hello Oxygen Team,
>
> The XPath evaluator returns a series of text nodes for a leaf element 
> containing a comment or a CDATA section. That is not correct. The 
> content is a single text node.
>
> Here is a <Test> element whose content is some text, a comment, and 
> more text. The XPath evaluator for /Test/text() returns 2 items, which 
> is not correct. The content of <Test> is a single text node containing 
> all of the text concatenated.
>
> The XPath evaluator erroneously returns 3 items when <Test> contains a 
> CDATA section. Further, there is no such thing as cdata-section node.
>
>
> _______________________________________________
> oXygen-user mailing list
> oXygen-user at oxygenxml.com
> https://www.oxygenxml.com/mailman/listinfo/oxygen-user