Diff Tool and White Spaces

Oxygen general issues.
ajuan
Posts: 18
Joined: Fri Mar 27, 2015 8:17 pm

Diff Tool and White Spaces

Post by ajuan »

One of our writers was using the diff tool in oXygen to compare old and new XML files, but it reported a lot of differences because of white spaces.

Specifically, instead of showing a middot (·) to indicate white space, one of the files showed a tab character. The strange thing is that if you open the files in Author mode, both files show middots. Lowering the algorithm strength made no difference.

We managed to get around this issue by selecting the "Formatting" button, which formatted both files the same way, but is this a bug with the diff tool?

Thanks for any help/response in advance.
Costin
Posts: 829
Joined: Mon Dec 05, 2011 6:04 pm

Re: Diff Tool and White Spaces

Post by Costin »

Hello,

Indeed, white spaces are normalized whenever document editing is switched to Author mode. This is the intended behavior, meant to ease visual editing of documents.

However, please note that when comparing the files using the Diff Files tool, the white spaces can be ignored, so they won't trigger any differences while comparing the files.
To ignore the white spaces, go to the Diff settings, either by using the corresponding toolbar shortcut or through the menu Options > Preferences > Diff / Files Comparison, and enable the "Ignore Whitespaces" option.

This does exactly what Author mode does implicitly: it causes sequences of white space to be normalized into a single space character.
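For readers who want to see what this kind of normalization means in practice, here is a minimal Python sketch. It is only an illustration of collapsing white space runs into a single space before comparing two strings; the function name `normalize_whitespace` is hypothetical and this is not oXygen's actual implementation.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse every run of white space into one space and trim both ends.

    Hypothetical helper for illustration; not part of oXygen.
    """
    return re.sub(r"\s+", " ", text).strip()


old = "<p>Hello\tworld</p>"    # words separated by a tab
new = "<p>Hello   world</p>"   # words separated by three spaces

# The raw strings differ, but the normalized forms are identical.
print(old == new)
print(normalize_whitespace(old) == normalize_whitespace(new))
```

A plain character-by-character comparison sees the tab and the spaces as different, while the normalized comparison does not.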

Feel free to let us know whenever you need any other information.

Best Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
ajuan
Posts: 18
Joined: Fri Mar 27, 2015 8:17 pm

Re: Diff Tool and White Spaces

Post by ajuan »

Hi,

Sorry, I should have added in my original post that Ignore Whitespaces was on. The problem was that the white spaces were shown as different symbols, and I'm guessing that the encoding was being read differently in each file. Even with this setting on, it was reporting the changes.
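One way to check whether two files really contain different white space characters is to count them by code point outside the editor. The sketch below is a hedged example in Python; the function name `whitespace_profile` is hypothetical, and the sample strings stand in for file contents you would read yourself.

```python
import unicodedata
from collections import Counter


def whitespace_profile(text: str) -> Counter:
    """Count each white space character in text, labelled by code point.

    Control characters such as tab have no Unicode name, so a
    placeholder label is used for them.
    """
    counts = Counter()
    for ch in text:
        if ch.isspace():
            name = unicodedata.name(ch, "CONTROL CHARACTER")
            counts[f"U+{ord(ch):04X} {name}"] += 1
    return counts


# Sample strings standing in for the contents of the two files.
file_a = "<p>one two</p>"
file_b = "<p>one\ttwo</p>"

print(whitespace_profile(file_a))
print(whitespace_profile(file_b))
```

If the two profiles differ (for example, U+0020 in one file and U+0009 in the other), the files contain different characters, which is a different situation from the same bytes being decoded with different encodings.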

I did not know that white spaces are normalized in Author mode. This is good information to have on hand. Thanks!
Costin
Posts: 829
Joined: Mon Dec 05, 2011 6:04 pm

Re: Diff Tool and White Spaces

Post by Costin »

Hello,

Thank you for the additional information.

We have also tested this, but could not reproduce it on our side, using the Diff tool in oXygen XML Editor v16.1. It might be a particular situation which triggers this behavior. What version of oXygen XML are you working with?

If possible, please send us a sample document for which the issue is reproducible and we will investigate it further.
Please send the sample document to our support email address: support@oxygenxml.com

Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
ajuan
Posts: 18
Joined: Fri Mar 27, 2015 8:17 pm

Re: Diff Tool and White Spaces

Post by ajuan »

Hi,

I have sent the two files that were originally causing problems (before selecting the "Format and Indent Both Files" option).

The files seem to be working now, but there is a jpeg attached to the email that shows that this was happening previously.

Cheers,
Anne
Costin
Posts: 829
Joined: Mon Dec 05, 2011 6:04 pm

Re: Diff Tool and White Spaces

Post by Costin »

Hi,

We have received the files you sent to our support email address and replied there.

We also reproduced this when using the "Words" algorithm on your specific XML documents, and suggested using a different, XML-aware algorithm (either "XML Fast" or "Auto") for XML documents.

This was logged in our internal tracking system for further investigation.

Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
Costin
Posts: 829
Joined: Mon Dec 05, 2011 6:04 pm

Re: Diff Tool and White Spaces

Post by Costin »

Hello,

We have discussed this situation internally with our developers and reached the conclusion that the current behavior (the diff tool reporting differences for files compared using the "Words" algorithm) is in fact the intended one.

This is because words separated by white spaces are considered two different entities by the Diff tool when using the "Words" level comparison algorithm, even if the white spaces are set to be ignored.

For this reason I suggested that you use an XML-aware algorithm instead, which ignores the white spaces between the words in your documents.
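To make the general idea of a word-level comparison more concrete, here is a small Python sketch using the standard `difflib` module. It is purely illustrative and does not reproduce how oXygen's "Words" algorithm handles white space internally; the helper names `tokens` and `changed` are hypothetical. It only shows that the result of a token-based diff depends on whether the separators between words are kept as tokens of their own.

```python
import difflib
import re


def tokens(text: str, keep_whitespace: bool) -> list:
    """Split text into word tokens, optionally keeping white space runs as tokens."""
    pattern = r"\s+|\S+" if keep_whitespace else r"\S+"
    return re.findall(pattern, text)


def changed(a: str, b: str, keep_whitespace: bool) -> list:
    """Return the non-equal opcodes of a token-level diff between a and b."""
    matcher = difflib.SequenceMatcher(
        None, tokens(a, keep_whitespace), tokens(b, keep_whitespace), autojunk=False
    )
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


old = "one\ttwo"   # tab between the words
new = "one two"    # space between the words

print(changed(old, new, keep_whitespace=True))   # separator tokens are compared
print(changed(old, new, keep_whitespace=False))  # only the words are compared
```

When the separators are kept as tokens, the tab and the space show up as a change between otherwise identical words; when they are dropped, no change is reported.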

Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support
ajuan
Posts: 18
Joined: Fri Mar 27, 2015 8:17 pm

Re: Diff Tool and White Spaces

Post by ajuan »

Thanks again Costin for your help with this. I'll report back to my writer.

Cheers,
Anne
xinelo
Posts: 33
Joined: Wed Oct 04, 2006 6:25 pm

Re: Diff Tool and White Spaces

Post by xinelo »

I am also getting this issue when comparing by Characters in Diff Files 18.0, when comparing PHP/HTML files.

A bit annoying.

Cheers, Manuel
xinelo
Posts: 33
Joined: Wed Oct 04, 2006 6:25 pm

Re: Diff Tool and White Spaces

Post by xinelo »

I forgot to mention: if I use Characters mode, I get the whitespace highlighted as a difference (often even though I don't see any difference, and even accepting the merge produces dissimilar documents). If I use Auto mode, I get the whole line highlighted, which is even more annoying because I can't see what is different. However, I don't get the issue if I use Words granularity.
Costin
Posts: 829
Joined: Mon Dec 05, 2011 6:04 pm

Re: Diff Tool and White Spaces

Post by Costin »

Hi xinelo,

As I also mentioned in my previous reply in this older thread, it is intended behavior to report differences when comparing files with a non-XML-aware algorithm (like Characters, Words, or Lines), depending on the specific algorithm and on where the whitespaces are placed in the document. If you need Diff to ignore such characters, you should try using an XML-aware algorithm (like XML Fast or XML Accurate).

In your case, however, it seems you did not set the "Ignore Whitespaces" option. It should be enabled even when using XML-aware algorithms.
Therefore, please double check that "Ignore Whitespaces" is enabled in Options > Preferences > Diff > Files Comparison and apply the changes.
If the Diff Files tool still reports differences even after you set that option and use an XML-aware algorithm, please send some sample files on which the behavior is reproducible to our support address (the official support email is supportAToxygenxmlDOTcom) so we can investigate further.
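As a rough illustration of what an XML-aware comparison that ignores white space can look like, here is a Python sketch built on the standard `xml.etree.ElementTree` module. It is an assumption-laden example, not oXygen's XML Fast or XML Accurate algorithm: the helper names are hypothetical, and it simply compares parsed element trees after collapsing white space in text content.

```python
import xml.etree.ElementTree as ET


def _norm(text):
    """Collapse white space runs into single spaces; treat None as empty."""
    return " ".join((text or "").split())


def canonical(elem):
    """Build a comparable structure from an element, ignoring white space layout."""
    return (
        elem.tag,
        sorted(elem.attrib.items()),
        _norm(elem.text),
        _norm(elem.tail),
        [canonical(child) for child in elem],
    )


def xml_equal_ignoring_whitespace(a: str, b: str) -> bool:
    """Compare two XML strings by structure and normalized text."""
    return canonical(ET.fromstring(a)) == canonical(ET.fromstring(b))


indented = "<doc>\n\t<p>Hello\tworld</p>\n</doc>"
compact = "<doc><p>Hello world</p></doc>"

print(xml_equal_ignoring_whitespace(indented, compact))
```

Because the comparison works on the parsed tree rather than on raw characters or lines, indentation and the choice between tabs and spaces do not count as differences, while a real change in an element's text still does.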

Regards,
Costin
Costin Sandoi
oXygen XML Editor and Author Support