Diff Files XPath Intelligence?

JimiB
Posts: 1
Joined: Mon Oct 11, 2021 8:20 pm

Post by JimiB »

Hi,

I am using Oxygen XML Author version 18 to compare two XML files. I'd like to ignore a number of elements whose names are common to both files. However, the syntax I can apply seems to look for fixed values rather than comparing the content held by an attribute across the two files.

Here's an example:

File 1
<object id="73526" name="Relay 15A">
</object>
<object id="73891" name="Relay 30A">
</object>

File 2
<object id="85412" name="Relay 70A">
</object>
<object id="84232" name="Relay 15A">
</object>

So I'd like an XPath expression that looks at all objects, inspects the name attribute, and ignores instances where the name is found in both files. In the example, Relay 15A should be ignored.

I thought something like "//object/@name" would be sufficient, but the tool wants me to specify a name rather than looking for matches among all names. If I use //object[@name], it ignores all elements that have a name attribute, regardless of the content.

Am I expecting too much of the tool, or am I lacking in XPath knowledge?

Thanks,
-Jimi
teo
Posts: 81
Joined: Wed Aug 30, 2017 3:56 pm

Re: Diff Files XPath Intelligence?

Post by teo »

Hi Jimi,

An XPath expression like the first one you used, //object/@name, makes the comparison tool ignore the value of the 'name' attribute when comparing two 'object' elements.
It still takes the value of the 'id' attribute into account, however, and reports a difference if the 'id' values differ.
The second XPath you used, //object[@name], makes the comparison tool exclude from the comparison all elements that have a 'name' attribute.
Therefore, in your simple example no differences are reported.

So 'exclusion by XPath' in our diff tool works as if the attributes or elements selected by the expression were removed from both documents before they are compared.
Your use case is different, and I cannot think of an XPath expression that would cover it.
Exclusion only works if you specify concrete values for the attributes to be considered,
as with: //object[contains(@name, '15A')] (with the Ignore Whitespaces option checked).

So, given your use case, I must admit that our diff tool has some limitations.
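
One possible workaround, offered only as a sketch and not as a feature of the diff tool: since the exclusion needs concrete values, a small script could collect the names that occur in both files and generate an expression listing them. The Python below assumes each file has a single root element (the <root> wrapper is added for illustration) and that the names contain no mix of single and double quotes; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample data from the question. The <root> wrapper is an assumption,
# added because an XML document needs a single root element.
FILE_1 = """<root>
<object id="73526" name="Relay 15A"></object>
<object id="73891" name="Relay 30A"></object>
</root>"""

FILE_2 = """<root>
<object id="85412" name="Relay 70A"></object>
<object id="84232" name="Relay 15A"></object>
</root>"""


def object_names(xml_text):
    """Return the set of 'name' attribute values found on <object> elements."""
    root = ET.fromstring(xml_text)
    return {
        el.get("name")
        for el in root.iter("object")
        if el.get("name") is not None
    }


def xpath_literal(value):
    """Quote a value as an XPath string literal (simple cases only)."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    raise ValueError("values containing both quote types are not handled here")


def build_ignore_xpath(xml_a, xml_b):
    """Build an XPath that selects <object> elements whose name is in both files."""
    common = sorted(object_names(xml_a) & object_names(xml_b))
    if not common:
        return ""
    predicate = " or ".join(f"@name={xpath_literal(n)}" for n in common)
    return f"//object[{predicate}]"


print(build_ignore_xpath(FILE_1, FILE_2))
# prints: //object[@name='Relay 15A']
```

The generated expression could then be pasted into the XPath exclusion option. It has to be regenerated whenever the files change.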

Best regards,
Teo
Teodor Timplaru
<oXygen/> XML Editor
http://www.oxygenxml.com
patjporter
Posts: 53
Joined: Sat May 22, 2021 6:04 pm

Re: Diff Files XPath Intelligence?

Post by patjporter »

Hi Teo,
I have a question based on your reply. In my case I have a set of files I want to compare, but I want to ignore what is in the <prolog> element, and I also want to ignore any element <data> that has an @rev attribute.

I have little skill with XPath. Is there a way to ignore <prolog> and <data rev="x">, where "x" could be any integer value?

Thank you!
Pat
teo
Posts: 81
Joined: Wed Aug 30, 2017 3:56 pm

Re: Diff Files XPath Intelligence?

Post by teo »

Hi Pat,

If I understand your use case correctly, I suggest trying an expression like: //prolog | //data[matches(@rev,'^\d+$')]
With this expression:
- all <prolog> elements are excluded from the comparison
- all <data> elements whose 'rev' attribute consists only of digits are also excluded from the comparison
Note that <data> elements whose 'rev' attribute does not consist only of digits are not excluded from the comparison.
For example, when <data rev="123"> is compared with <data rev="123B">, the second element is not excluded, so a difference is signaled.
If the content of the 'rev' attribute is not important in your case, you can omit the matches() function and the regexp: //prolog | //data[@rev]
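
To make the effect of the regular expression concrete, here is a minimal Python sketch that emulates the predicate matches(@rev,'^\d+$') for a few sample values. It only illustrates the pattern and is not how the diff tool evaluates XPath; the function name rev_is_excluded is hypothetical.

```python
import re


def rev_is_excluded(rev):
    """Emulate matches(@rev, '^\\d+$'): true only for a non-empty, all-digit value.

    A missing attribute is modelled as None and is never excluded.
    """
    return rev is not None and re.fullmatch(r"\d+", rev) is not None


# Sample values: two from the example above, plus an empty and a missing attribute.
for rev in ("123", "123B", "", None):
    print(repr(rev), rev_is_excluded(rev))
# prints:
# '123' True
# '123B' False
# '' False
# None False
```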

Kind regards,
Teo
patjporter
Posts: 53
Joined: Sat May 22, 2021 6:04 pm

Re: Diff Files XPath Intelligence?

Post by patjporter »

Thank you!

It may be operator error on my part, but the only place to put this expression is under the Diff / Files Comparison preferences, in 'Ignore Nodes by XPath'. When I do a Directories comparison, it does not exclude these elements. Am I doing something wrong? How do I get the XPath expression to apply to all files in a directory?

Thanks!
Pat
teo
Posts: 81
Joined: Wed Aug 30, 2017 3:56 pm

Re: Diff Files XPath Intelligence?

Post by teo »

Hi Pat,

Unfortunately, you cannot use XPath exclusion for directory comparison.
We have an issue logged on this topic. It sparked some discussion, because from the Diff Directories dialog one can get to Preferences / Diff / Files Comparison, where the option 'Exclude nodes by XPath' appears to be available.
And yet it is not: it has no effect on directory comparison.

I will not go into detail here. I will just point out that the current implementation of Diff Directories is different (it is based on streams) and does not yet allow exclusion by XPath expressions. Fixing this requires serious code refactoring, and the implementation is quite complex.

You have probably experienced the situation in the simplified example below, derived from your use case.
Suppose we have 2 folders, each containing a single XML file:

Folder-1/file.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
<prolog>boo</prolog>
<data rev="123">
<a></a>
</data>
</root>

Folder-2/file.xml
<?xml version="1.0" encoding="UTF-8"?>
<root>
<prolog>foo</prolog>
<data rev="123B">
<b></b>
</data>
</root>

and we have the 'Ignore Nodes by XPath' option activated, containing the following expression: //prolog | //data[@rev]

When invoking Diff Directories for Folder-1 and Folder-2, one difference is reported, but double-clicking that difference does not actually reveal any difference in the Diff Files dialog. That is, exclusion by XPath takes effect only when files are compared directly.
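
A possible workaround, offered only as a sketch and not as a feature of the tool: pre-process copies of the files so that the nodes to be ignored are removed, then compare the cleaned copies. The Python below applies this to the two sample files above (without the XML declaration, for brevity). The function name strip_ignored is hypothetical, and the removal rule is hard-coded to mirror //prolog | //data[@rev].

```python
import xml.etree.ElementTree as ET

# Contents of the two sample files from the example above.
FOLDER_1_FILE = """<root>
<prolog>boo</prolog>
<data rev="123">
<a></a>
</data>
</root>"""

FOLDER_2_FILE = """<root>
<prolog>foo</prolog>
<data rev="123B">
<b></b>
</data>
</root>"""


def strip_ignored(xml_text):
    """Remove <prolog> elements and <data> elements that carry a 'rev' attribute.

    This mirrors the expression //prolog | //data[@rev]; the root element
    itself is never removed.
    """
    root = ET.fromstring(xml_text)
    # Snapshot the element list first, because the tree is modified in the loop.
    for parent in list(root.iter()):
        for child in list(parent):
            is_prolog = child.tag == "prolog"
            is_data_with_rev = child.tag == "data" and "rev" in child.attrib
            if is_prolog or is_data_with_rev:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# After stripping, the two sample files no longer differ.
print(strip_ignored(FOLDER_1_FILE) == strip_ignored(FOLDER_2_FILE))
# prints: True
```

In practice the same function would be run over every XML file in copies of both folders before comparing the copies.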

I will add a comment to the issue mentioned above, noting your explicit request, in order to raise its priority.

Best regards,
Teo