Regex not working properly in Find and Replace
-
- Posts: 5
- Joined: Thu Nov 21, 2019 4:29 pm
Regex not working properly in Find and Replace
I'm trying to find and replace a simple HTML snippet in an EPUB archive where the text inside the <p> container differs each time. Using Regex I can do this easily in two other editors (UltraEdit and Visual Studio Code), but Oxygen has issues that make this impossible.
What I'm trying to do is add <b> around the image description text for each image (a few hundred images in multiple files inside an EPUB archive):
Original code:
<p>Image Description, different in each case</p>
<figure class="image"><img alt="image" src="image.jpg"/>
</figure>
Desired result:
<p><b>Image Description, different in each case</b></p>
<figure class="image"><img alt="image" src="image.jpg"/>
</figure>
I'm not a Regex guru in any way but the code below works, as said above, in UltraEdit and Visual Studio Code.
The Regex I used is this:
Find:
<p>(.*)</p>
<figure class="image">
Replace:
<p><b>$1</b></p>
<figure class="image">
What happens in Oxygen is that the match starts at the first <p> in the code and ends at the last <figure class="image">, which of course makes this totally unworkable. Screenshot from Oxygen: https://imgur.com/7HB2okk
This behaves predictably (to me) in UltraEdit, see screenshot: https://imgur.com/hbfnKH0
Is this a bug in Oxygen or do I somehow need to tailor my Regex towards the Oxygen implementation of Regex? Or should I be using some other (Regex?) method?
P.S: I know that Regex generally is not good for HTML but I've used it in the past for simple tasks like this with good results.
-
- Posts: 2879
- Joined: Tue May 17, 2005 4:01 pm
Re: Regex not working properly in Find and Replace
Hello,
That happens because you're using a greedy pattern, (.*). This matches as much as possible, and you've probably also checked the option "Dot matches all", which lets the dot match line breaks so that the greedy match extends across multiple lines.
So, use the lazy pattern (.*?) instead of (.*), or clear the box for "Dot matches all". I would recommend the former, but also be aware of the latter.
Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
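To illustrate the greedy versus lazy behaviour described in the reply above, here is a minimal sketch in Python. It is not Oxygen itself: it assumes Python's re module treats (.*), (.*?) and the "dot matches line breaks" flag (re.DOTALL) the same way as the engine behind Oxygen's Find/Replace dialog, and the sample markup and file names a.jpg and b.jpg are made up for the example.

```python
import re

# Hypothetical sample: two image descriptions, each followed by a figure.
html = (
    '<p>First description</p>\n'
    '<figure class="image"><img alt="image" src="a.jpg"/>\n'
    '</figure>\n'
    '<p>Second description</p>\n'
    '<figure class="image"><img alt="image" src="b.jpg"/>\n'
    '</figure>\n'
)

# re.DOTALL stands in for the "Dot matches all" checkbox (an assumption).
greedy = re.compile(r'<p>(.*)</p>\n<figure class="image">', re.DOTALL)
lazy = re.compile(r'<p>(.*?)</p>\n<figure class="image">', re.DOTALL)

greedy_matches = greedy.findall(html)
lazy_matches = lazy.findall(html)

# Greedy: a single match running from the first <p> to the last figure.
print(len(greedy_matches))
# Lazy: one match per description.
print(lazy_matches)
```

With the greedy pattern the single captured group swallows everything between the first <p> and the last </p> that is followed by a figure, which matches the behaviour in the Oxygen screenshot. The lazy pattern stops at the first possible end point and captures each description separately.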
-
- Posts: 5
- Joined: Thu Nov 21, 2019 4:29 pm
Re: Regex not working properly in Find and Replace
Hi Adrian
Thank you, this works perfectly!
I had to do both: use this pattern
Find:
<p>(.*?)</p>
<figure class="image">
and also uncheck the "Dot matches all" option.
My best
Gunnar
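A possible reason why both changes were needed, sketched in Python under the same assumption as before (re.DOTALL standing in for "Dot matches all", and \1 used where Oxygen uses $1): when an ordinary paragraph precedes the image description, a lazy pattern with the dot matching line breaks can still start at the earlier <p> and stretch until the first figure. The sample markup is hypothetical.

```python
import re

# Hypothetical sample: a normal paragraph, then a description and its figure.
html = (
    '<p>An ordinary paragraph.</p>\n'
    '<p>Image description</p>\n'
    '<figure class="image"><img alt="image" src="image.jpg"/>\n'
    '</figure>\n'
)

find = r'<p>(.*?)</p>\n<figure class="image">'
# Python writes the first group as \1; the thread's replace field uses $1.
replace = r'<p><b>\1</b></p>\n<figure class="image">'

# Lazy pattern, dot matches line breaks: the match starts at the first <p>.
with_dotall = re.sub(find, replace, html, flags=re.DOTALL)

# Lazy pattern, dot does not match line breaks: only the description matches.
without_dotall = re.sub(find, replace, html)

print(with_dotall)
print(without_dotall)
```

In the first result the <b> opens inside the ordinary paragraph, because the lazy group only minimises where the match ends, not where it begins. In the second, the dot cannot cross the line break, so only the paragraph directly above the figure is wrapped.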