Regular expression case sensitivity

Frank Ralf
Posts: 485
Joined: Thu Jan 23, 2014 2:29 pm
Location: Hamburg

Post by Frank Ralf »

Hi,

I am using the regex [a-z]-[a-z] to find unnecessary hyphens in text that was copied and pasted. The usual regex behavior would be to match only lower-case letters. However, the Oxygen search also matches "Baden-Baden" unless I also check the "Case sensitive" option. Is that the intended behavior or a bug?

Best regards,
Frank
Frank Ralf
parson AG
www.parson-europe.com
adrian
Posts: 2881
Joined: Tue May 17, 2005 4:01 pm

Re: Regular expression case sensitivity

Post by adrian »

Hi,

This is going to get a little technical, but, in short, it is as intended. I know it seems a bit off, but I'll explain...

The Java regex pattern compiler is case sensitive by default, but the "Case sensitive" option in the UI is binary: a search is either case sensitive or case insensitive, and there is no neutral position. As long as the box is cleared, the pattern is compiled as case insensitive (including Unicode case-insensitive matching), so it does not matter whether a character range in the regex uses uppercase or lowercase letters. The behavior that you are expecting only works with "Case sensitive" selected.
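The effect can be reproduced in plain Java. This is only a minimal sketch, assuming the cleared check box corresponds to compiling the pattern with `Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE`; the class name `CaseFlagDemo`, the helper `finds`, and the sample strings are made up for illustration and are not Oxygen's actual code.

```java
import java.util.regex.Pattern;

public class CaseFlagDemo {
    // Hypothetical helper: true if the regex matches anywhere in the text
    // when compiled with the given flags (0 means Java's default, case sensitive).
    static boolean finds(String regex, int flags, String text) {
        return Pattern.compile(regex, flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String regex = "[a-z]-[a-z]";
        // Assumed equivalent of leaving "Case sensitive" cleared.
        int insensitive = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        // Default flags: "n-B" is not a match because 'B' is outside [a-z].
        System.out.println(finds(regex, 0, "Baden-Baden"));            // false
        // Case-insensitive flags: [a-z] now also accepts 'B', so "n-B" matches.
        System.out.println(finds(regex, insensitive, "Baden-Baden"));  // true
        // A genuinely lower-case hyphenation matches either way.
        System.out.println(finds(regex, 0, "copy-pasted"));            // true
    }
}
```

With the flags set, the range `[a-z]` is matched without regard to case, which is why "Baden-Baden" shows up in the results while the box is cleared.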

Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com

Re: Regular expression case sensitivity

Post by Frank Ralf »

Hi Adrian,

Many thanks for your quick reply and the explanation. So the Java regex engine is the culprit. Good to be reminded that not all regex engines are created equal ;-)

Best regards,
Frank

Re: Regular expression case sensitivity

Post by adrian »

We can't quite blame the regex engine. The engine and the UI work hand in hand here.
The UI is designed so that the average user can just search for something and have the best chance of getting a result, so our default search is meant to be case insensitive (box cleared). If you want to fine-tune it, you have to opt in to a more precise search.

Regards,
Adrian

Re: Regular expression case sensitivity

Post by Frank Ralf »

Point taken ;-)