Xpath evaluation normalize

Post by **nickebowen** » Fri Apr 21, 2006 7:38 pm

I'd expect the following XPath statement not to return any results, but it does. There are no text nodes after <status>. I assume the parser is treating spaces and linefeeds as text nodes. How can I configure oXygen to ignore these, or to normalize the XML before evaluating the XPath?

Xpath statement evaluated when highlighting the <output> element: status/text()

<output>
<status event-id="pwd-subscribe" level="error" type="password-set-operation">
<description>Error Set Password failed</description>
<operation-data>
<password-subscribe-status>
<association>\VLABDOM\VSINGH</association>
</password-subscribe-status>
</operation-data>
</status>
</output>

Post by **sorin_ristache** » Tue Apr 25, 2006 12:02 pm

Hello,

The parser sees white spaces between <status> and <description>, </description> and <operation-data>, </operation-data> and </status> as text nodes. This behavior is correct because the spaces form a text node. The result is the same as an XPath query invoked from an XSLT stylesheet and I think the same XPath query invoked from the XPath toolbar or from the XPath Builder view should give the same result.

Normalizing text nodes does not remove white spaces, only replaces them with a single space, the character with the decimal code 32.

Regards,
Sorin
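
The point about whitespace-only text nodes can be checked outside oXygen. Below is a minimal sketch using Python's standard `xml.dom.minidom` parser on the document from the first post. It is not how oXygen evaluates XPath; the helper name `child_text_nodes` is made up for this illustration and only mimics what `status/text()` selects.

```python
from xml.dom import minidom

# The document from the first post (raw string so the backslashes stay literal).
XML = r"""<output>
<status event-id="pwd-subscribe" level="error" type="password-set-operation">
<description>Error Set Password failed</description>
<operation-data>
<password-subscribe-status>
<association>\VLABDOM\VSINGH</association>
</password-subscribe-status>
</operation-data>
</status>
</output>"""


def child_text_nodes(element):
    """Return the data of every text node that is a direct child of element.

    Hypothetical helper: roughly what the XPath step text() selects.
    """
    return [n.data for n in element.childNodes if n.nodeType == n.TEXT_NODE]


doc = minidom.parseString(XML)
status = doc.getElementsByTagName("status")[0]
texts = child_text_nodes(status)

# Each entry is the line break that sits between two child elements.
print(len(texts), texts)
```

The linefeeds between `<status>`, `<description>`, `<operation-data>` and `</status>` show up as separate text nodes, which is why `status/text()` returns results.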

Post by **nickebowen** » Sat May 06, 2006 7:45 pm

Is there a way to configure oXygen to automatically remove the spaces, CR/LF, tabs and other "presentation" related characters between elements before evaluating the Xpath statement? The reason I ask is that I've used XMLSpy for a few years and have switched to oXygen. When evaluating this same Xpath expression against the XML document, no results are returned. XMLSpy must remove the spaces, CR/LF, etc. prior to evaluating the Xpath expression.

Post by **sorin_ristache** » Mon May 08, 2006 10:46 am

Hello,

<oXygen/> returns the result of evaluating the XPath expression with a conformant XPath processor. Even if some text nodes contain only whitespace, removing them before the XPath expression is evaluated, or ignoring them while evaluating it, is not conformant with the XPath specification. If you want to match text nodes but ignore those containing only whitespace, you should do that in the application that uses the expression, for example an XSLT stylesheet. If you post more details about the application where you need this, you could receive a more specific answer.

Regards,
Sorin

Post by **george** » Mon May 08, 2006 12:54 pm

Dear Nick,

We cannot implement incorrect behavior in oXygen in order to make it work similarly to other applications. In your case the correct XPath expression is

status/text()[translate(normalize-space(.), ' ', '')!='']

The normalize-space function replaces every sequence of whitespace with a single space, and the translate call, which maps the space character to nothing, then removes the remaining spaces.

Best Regards,
George
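
To see what the predicate above does to individual text nodes, here is a rough Python emulation of the two XPath 1.0 functions involved. The function names `normalize_space` and `passes_filter` are invented for this sketch. It assumes the XPath 1.0 definition of whitespace (space, tab, CR, LF) and the fact that `normalize-space` also strips leading and trailing whitespace.

```python
import re

# The four characters XPath 1.0 treats as whitespace.
XPATH_WS = " \t\r\n"


def normalize_space(s):
    """Emulate XPath normalize-space(): trim both ends, collapse inner runs."""
    return re.sub(r"[ \t\r\n]+", " ", s.strip(XPATH_WS))


def passes_filter(s):
    """Emulate the predicate translate(normalize-space(.), ' ', '') != ''."""
    return normalize_space(s).replace(" ", "") != ""


samples = ["\n", "  \t\r\n ", "  Error   Set\nPassword  "]
for s in samples:
    # Show the normalized value and whether the text node would be kept.
    print(repr(s), "->", repr(normalize_space(s)), passes_filter(s))
```

In this emulation a whitespace-only string already normalizes to the empty string, so the shorter predicate `[normalize-space(.) != '']` should select the same nodes; the `translate` step is harmless but not strictly needed.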

Post by **nickebowen** » Mon May 08, 2006 7:25 pm

Here is a little background behind my question. I do a lot of work with Novell's IDM (Identity Management) product. Basically, it functions by sending XML documents between different directories and databases with information on events that have happened. My job is to apply business logic to these XML documents as they flow through the system. The IDM engine normalizes the XML documents that pass through the system before applying any of my business logic, which is in the form of Xpath expressions or XSLT. Therefore, the XML documents that I am dealing with are already in a "normalized" state and I don't need to compensate for this in my Xpath expressions or XSLT. When I copy these documents into oXygen, they are put into a more human-readable form.

Believe me, I'm very glad that you adhere to the standards. I think it will help make me a better developer. My only question was if there was a way to have oXygen "clean up" the XML document prior to applying my Xpath expressions so I don't have to build the normalization into my Xpath expressions. Therefore, I could use my Xpath expressions developed in oXygen unaltered in my business logic.

Thanks for your assistance.
....Nick

Post by **george** » Tue May 09, 2006 9:11 am

Hi,

That is not a usual use case and oXygen does not have support for something like that. What you can do is pass your document through a transformation like

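<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="1.0">

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Copy a text node only if it contains something other than whitespace.
       The explicit priority avoids a conflict with the identity template,
       which also matches text nodes at the same default priority. -->
  <xsl:template match="text()" priority="1">
    <xsl:if test="translate(normalize-space(.), ' ', '')!=''">
      <xsl:value-of select="."/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>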

and apply the XPath to the document that results from the transformation instead of to your initial document.

Hope that helps,
George
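
For readers who want to try the same clean-up step without an XSLT processor, here is a minimal sketch in Python using only the standard library. It is an illustration of the idea, not part of oXygen or IDM; the function name `strip_whitespace_nodes` and the shortened sample document are assumptions made for this example.

```python
from xml.dom import minidom


def strip_whitespace_nodes(node):
    """Recursively remove text nodes that contain only whitespace, in place."""
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE and child.data.strip(" \t\r\n") == "":
            node.removeChild(child)
        else:
            strip_whitespace_nodes(child)


# Shortened version of the document from the first post.
XML = """<output>
<status level="error">
<description>Error Set Password failed</description>
</status>
</output>"""

doc = minidom.parseString(XML)
strip_whitespace_nodes(doc)

status = doc.getElementsByTagName("status")[0]
# After the clean-up, <status> has no direct text-node children left.
remaining = [n for n in status.childNodes if n.nodeType == n.TEXT_NODE]
description_text = doc.getElementsByTagName("description")[0].firstChild.data

print(len(remaining))
print(description_text)
```

Text that carries real content, such as the description, is left untouched; only the whitespace-only nodes between elements are dropped, so an expression like `status/text()` would find nothing on the cleaned document.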

Post by **nickebowen** » Tue May 09, 2006 7:31 pm

That will work perfectly. Thanks for the follow-up.
...Nick