non-greedy regexp in Oxygen
Having trouble installing Oxygen? Got a bug to report? Post it all here.
-
- Posts: 33
- Joined: Wed Oct 04, 2006 6:25 pm
non-greedy regexp in Oxygen
Hi there,
Is it possible to use lazy/non-greedy regexp match in Oxygen?
I'm trying to match elements "trans-unit" which contain a sub-element "target" which contains the text "REMOVE", for example in:
Code: Select all
<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">
<source>15.7 x 16.5 x 6.7in</source><target state="translated" state-qualifier="exact-match">"REMOVE"</target>
</trans-unit>
For that I have successfully used the expression:
Code: Select all
<trans-unit translate="yes".+?\n.+?"REMOVE"
In the sample code, you can see that there is one newline between the trans-unit opening tag and the source opening tag, hence the \n in my expression.
Code: Select all
... xml:space="default"><-- NEWLINE HERE
<source>...
However, there might be more newlines (in an undetermined quantity) somewhere else in the text I want to match, and to avoid having to use more \n in my regexp I would like to use the expression:
Code: Select all
<trans-unit translate="yes".+?"REMOVE"
with option "Dot matches all" checked. The ? in the expression should make it non-greedy, however I'm getting many other full trans-unit elements matched by my .+ bit.
Hence my question: Is it possible to use lazy/non-greedy regular expressions in Oxygen? If it is, what am I doing wrong?
Thank you very much!
Cheers, Manuel
-
- Posts: 2879
- Joined: Tue May 17, 2005 4:01 pm
Re: non-greedy regexp in Oxygen
Hi,
What version of Oxygen are you using (look in Help > About)?
I've tested in the Find/Replace dialog from v14.1 with the expression you have provided (with Dot matches all) and it seems to work as expected:
Code: Select all
<trans-unit translate="yes".+?"REMOVE"
This is non-greedy and matches only one "trans-unit" at a time.
If I use
Code: Select all
<trans-unit translate="yes".+"REMOVE"
without the "?", then it gets greedy and the match spans from the first "trans-unit" to the last "REMOVE" text in the document.
Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
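The greedy versus non-greedy difference described above can be reproduced outside Oxygen. The following is a minimal sketch in Python, assuming that Python's `re.DOTALL` flag behaves like the "Dot matches all" option for these patterns; the shortened two-unit sample is made up for illustration and is not from the thread.

```python
import re

# Hypothetical shortened sample: two trans-unit elements, both containing "REMOVE".
sample = (
    '<trans-unit translate="yes" id="1">\n'
    '<source>a</source><target>"REMOVE"</target>\n'
    '</trans-unit>\n'
    '<trans-unit translate="yes" id="2">\n'
    '<source>b</source><target>"REMOVE"</target>\n'
    '</trans-unit>\n'
)

# re.DOTALL lets "." match newlines, standing in for "Dot matches all".
lazy = re.compile(r'<trans-unit translate="yes".+?"REMOVE"', re.DOTALL)
greedy = re.compile(r'<trans-unit translate="yes".+"REMOVE"', re.DOTALL)

lazy_matches = lazy.findall(sample)
greedy_matches = greedy.findall(sample)

print(len(lazy_matches))    # 2: one match per trans-unit
print(len(greedy_matches))  # 1: a single match from the first tag to the last "REMOVE"
```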
-
- Posts: 33
- Joined: Wed Oct 04, 2006 6:25 pm
Re: non-greedy regexp in Oxygen
Thank you very much for your answer, Adrian.
I am using version 14.1, build 201212121012 <- I could have waited a couple of hours...
If you paste this code in a document:
Code: Select all
<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">
<source>15.7 x 16.5 x 6.7in</source><target state="translated" state-qualifier="exact-match">"REMOVE"</target>
</trans-unit>
<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">
<source>15.7 x 16.5 x 6.7in</source><target state="translated" state-qualifier="exact-match">asdfa</target>
</trans-unit>
<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">
<source>15.7 x 16.5 x 6.7in</source><target state="translated" state-qualifier="exact-match">"adf"</target>
</trans-unit>
<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">
<source>15.7 x 16.5 x 6.7in</source><target state="translated" state-qualifier="exact-match">"REMOVE"</target>
</trans-unit>
What do you get matched with the non-greedy expression?
Code: Select all
<trans-unit translate="yes".+?"REMOVE"
I get two matches: (1) the first trans-unit element, (2) the second, third and fourth trans-unit elements together. The expected result is two matches: (1) the first trans-unit element, (2) the fourth trans-unit element.
Am I missing something?
Cheers,
-
- Posts: 2879
- Joined: Tue May 17, 2005 4:01 pm
Re: non-greedy regexp in Oxygen
Hi,
Yes, that's exactly what happens for this particular searched content. This is normal considering the expression you are using and the content you have. My test had a REMOVE string in each trans-unit so it found each one in turn.
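A non-greedy expression does not search for the shortest possible match regardless of where the match starts (I believe that's what you're expecting). The search always takes the first position at which the expression can match, counting from the current search position (or from the end of the previous match), and the non-greedy quantifier only makes the match end as early as possible from that start. In your sample, the next possible start after the first match is the second trans-unit, and the first "REMOVE" after it is in the fourth trans-unit, so the match spans all three elements.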
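The behaviour on the four-element sample can be checked with a short script. This is a sketch in Python rather than Oxygen's own engine, assuming `re.DOTALL` corresponds to "Dot matches all"; the `unit` helper is a hypothetical convenience that rebuilds the sample from the earlier post.

```python
import re

def unit(target_text):
    """Build one trans-unit element from the sample, with the given target text."""
    return (
        '<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">\n'
        '<source>15.7 x 16.5 x 6.7in</source>'
        '<target state="translated" state-qualifier="exact-match">'
        + target_text +
        '</target>\n'
        '</trans-unit>\n'
    )

# Same order as in the sample: only the first and fourth units contain "REMOVE".
sample = unit('"REMOVE"') + unit('asdfa') + unit('"adf"') + unit('"REMOVE"')

pattern = re.compile(r'<trans-unit translate="yes".+?"REMOVE"', re.DOTALL)

# Count how many opening trans-unit tags each match contains.
units_per_match = [m.group(0).count('<trans-unit') for m in pattern.finditer(sample)]
print(units_per_match)  # [1, 3]
```

The second match starts at the second trans-unit because that is the first position after the previous match where the pattern can begin, and it ends at the nearest "REMOVE" from there, which is in the fourth unit.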
You can fine-tune this with XPath if you don't want the result to span across several trans-unit elements.
Use in the XPath field: //trans-unit
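The effect of restricting the search scope to each trans-unit can be approximated as follows. This is only an analogy in Python, not Oxygen's implementation: it assumes the scope restriction amounts to running the expression separately inside each element, and `find_units_with_marker` is a hypothetical helper name.

```python
import re

def unit(target_text):
    """Build one trans-unit element from the sample, with the given target text."""
    return (
        '<trans-unit translate="yes" id="114" reformat="yes" xml:space="default">\n'
        '<source>15.7 x 16.5 x 6.7in</source>'
        '<target state="translated" state-qualifier="exact-match">'
        + target_text +
        '</target>\n'
        '</trans-unit>\n'
    )

sample = unit('"REMOVE"') + unit('asdfa') + unit('"adf"') + unit('"REMOVE"')

# Step 1 (stands in for the //trans-unit scope): isolate each element on its own.
UNIT_RE = re.compile(r'<trans-unit\b.*?</trans-unit>', re.DOTALL)
# Step 2: the original non-greedy expression, applied inside one element at a time.
MARKER_RE = re.compile(r'<trans-unit translate="yes".+?"REMOVE"', re.DOTALL)

def find_units_with_marker(text):
    """Return the 1-based positions of trans-unit elements that contain the marker."""
    return [
        index
        for index, m in enumerate(UNIT_RE.finditer(text), start=1)
        if MARKER_RE.search(m.group(0))
    ]

print(find_units_with_marker(sample))  # [1, 4]
```

Because each search is confined to a single element, a match can no longer run across element boundaries.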
Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
-
- Posts: 33
- Joined: Wed Oct 04, 2006 6:25 pm
Re: non-greedy regexp in Oxygen
Thank you very much, Adrian, for that lucid explanation. It seems my expectations were based on a misunderstanding of what non-greedy matching does.
Restricting the scope in the XPath field works fine. Thanks a lot!
Cheers, Manuel