Searching for Missing Spaces in Content

Posted: Tue Jan 31, 2023 9:33 pm
by tpopp
Is there a way to globally search for missing spaces in Author or Text mode following a closing DITA element and insert the missing space?
For example, find <ph>some text</ph>X and replace it with <ph>some text</ph> X
Basically, I want to search for any character that follows a closing element tag and is not punctuation (a comma, period, question mark, etc.).
It seems easy to insert DITA elements in Author mode and forget to put a blank space after them if you are showing Full tags.

Re: Searching for Missing Spaces in Content

Posted: Wed Feb 01, 2023 5:09 am
by chrispitude
Hi tpopp,

You could open Window > Show View > XPath/XQuery Builder, then search for the following XPath expression:

//(ph|b|u|i|codeph)[matches(., '[A-Za-z]$')][following-sibling::node()[1][matches(., '^[A-Za-z]')]]
This searches for any <ph>, <b>, <u>, <i>, or <codeph> element that ends in a letter and is immediately followed by plain text or another element that begins with a letter. You can add more elements to the list as needed. If you know regular expressions, you can adjust the matches() patterns to include additional character types.
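For readers who want to see the logic of that expression spelled out step by step, here is a minimal Python sketch of the same adjacency check. It is only an approximation, not what Oxygen runs: it uses the standard library's ElementTree, whose default parser drops comments and processing instructions, and the function name `missing_space_after` and the sample paragraph are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Inline elements to check (same list as the XPath expression; extend as needed).
INLINE = {"ph", "b", "u", "i", "codeph"}

def missing_space_after(xml_text):
    """Return (tag, content, following) for inline elements that end in a
    letter and are directly followed by something that begins with a letter."""
    root = ET.fromstring(xml_text)
    hits = []
    for parent in root.iter():
        children = list(parent)
        for idx, child in enumerate(children):
            if child.tag not in INLINE:
                continue
            content = "".join(child.itertext())
            # \Z anchors at the very end of the string, like "$" in XPath.
            if not re.search(r"[A-Za-z]\Z", content):
                continue
            if child.tail:                      # text right after the element
                following = child.tail
            elif idx + 1 < len(children):       # otherwise the next sibling element
                following = "".join(children[idx + 1].itertext())
            else:
                following = ""
            if re.match(r"[A-Za-z]", following):
                hits.append((child.tag, content, following))
    return hits

sample = "<p>See <ph>Widget</ph>settings, <b>bold</b> text, <i>one</i><u>two</u>.</p>"
for tag, content, following in missing_space_after(sample):
    print(f"<{tag}> ending in {content!r} runs into {following!r}")
```

In the sample, `<ph>` and `<i>` are reported, while `<b>` (followed by a space) and `<u>` (followed by a period) are not.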

To consider elements that might have variable text, you can include elements that have @keyref:

//(ph|b|u|i|codeph)[matches(., '[A-Za-z]$') or @keyref][following-sibling::node()[1][matches(., '^[A-Za-z]')]]
In regular expressions, "^" matches the beginning of a string and "$" matches the end of the string. You can flip the XPath expression around as follows to search the beginning of elements too:

//(ph|b|u|i|codeph)[matches(., '^[A-Za-z]')][preceding-sibling::node()[1][matches(., '[A-Za-z]$')]]
You could even turn these into Schematron checks so they would be interactively highlighted in the editing window in real-time.

Re: Searching for Missing Spaces in Content

Posted: Wed Feb 01, 2023 8:52 am
by Radu
Hi,

An alternative is to use Oxygen's "Find/Replace in Files" dialog: search for something like </(ph|b|u|i|codeph)>[a-zA-Z]+ with the "Regular expression" checkbox checked.
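To see what that pattern matches, and how a replacement could insert the space, here is a small Python sketch. It uses Python's `re` module rather than Oxygen's dialog, so the replacement syntax shown (`\1`) is Python's and may differ from what Oxygen's replace field expects. The `FIX` pattern, the function names, and the sample sentence are illustrative assumptions, not part of the original suggestion.

```python
import re

# The search pattern from the post, unchanged. The trailing "+" means
# "one or more" of the preceding letter class, so the match extends over
# the whole word that follows the closing tag.
PATTERN = re.compile(r"</(ph|b|u|i|codeph)>[a-zA-Z]+")

# Hypothetical variant for replacing: capture the closing tag and use a
# lookahead so the following letter is checked but not consumed.
FIX = re.compile(r"(</(?:ph|b|u|i|codeph)>)(?=[A-Za-z])")

def find_missing_spaces(text):
    """Return every 'closing tag + word' run with no space in between."""
    return [m.group(0) for m in PATTERN.finditer(text)]

def insert_missing_spaces(text):
    """Insert one space after each closing tag that runs into a letter."""
    return FIX.sub(r"\1 ", text)

sample = "Open the <ph>Settings</ph>dialog, click <b>OK</b>, then <i>wait</i> here."
print(find_missing_spaces(sample))    # ['</ph>dialog']
print(insert_missing_spaces(sample))
```

Closing tags followed by punctuation or by an existing space are left alone, which is the behavior asked for in the original question.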

Regards,
Radu

Re: Searching for Missing Spaces in Content

Posted: Wed Feb 01, 2023 8:54 am
by Radu
About this useful remark from Chris:
"You could even turn these into Schematron checks so they would be interactively highlighted in the editing window in real-time."
There is this article on the Oxygen XML blog about adding your own Schematron schema to a DITA framework configuration:
https://blog.oxygenxml.com/topics/shari ... rules.html

Regards,
Radu

Re: Searching for Missing Spaces in Content

Posted: Wed Feb 01, 2023 4:36 pm
by tpopp
Radu wrote (Wed Feb 01, 2023 8:52 am):
"An alternative is to use Oxygen's "Find/Replace in Files" dialog: search for something like </(ph|b|u|i|codeph)>[a-zA-Z]+ with the "Regular expression" checkbox checked."

Radu,
This seems to work, thank you! What is the + for at the end? If I want to do this as a global search in O2, can I do that? What about if I want to insert the space in an individual occurrence or globally? Can you save regular expression find/replace strings in O2 for others to use?

Re: Searching for Missing Spaces in Content

Posted: Wed Feb 01, 2023 5:05 pm
by chrispitude
And for either method (regular-expression search or XPath search), you can create a test topic with example constructs that should be found or not found, then experiment on that single file with the scope set to Current File. This can save time and build confidence in the technique before you apply it to a larger scope.
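The same "known cases first" idea can also be sketched outside Oxygen. The Python snippet below runs the regular expression from this thread against a handful of made-up snippets, each labeled with whether it should be found. The case list and the helper name `check_cases` are hypothetical, and Python's `re` engine stands in for Oxygen's, so treat it as a sanity check of the pattern rather than of the editor.

```python
import re

# Regular expression suggested earlier in the thread.
PATTERN = re.compile(r"</(ph|b|u|i|codeph)>[a-zA-Z]+")

# (snippet, should_be_found) pairs, playing the role of the test topic.
CASES = [
    ("<p>Press <b>Enter</b>now.</p>", True),            # missing space
    ("<p>Press <b>Enter</b> now.</p>", False),          # space present
    ("<p>Press <b>Enter</b>.</p>", False),              # punctuation is fine
    ("<p>Run <codeph>make</codeph>twice.</p>", True),   # missing space
    # xref is not in the element list, so the pattern does not report it.
    ("<p>See <xref href='a.dita'>here</xref>next.</p>", False),
]

def check_cases(pattern, cases):
    """Return the cases whose actual result differs from the expected one."""
    return [
        (snippet, expected)
        for snippet, expected in cases
        if bool(pattern.search(snippet)) != expected
    ]

failures = check_cases(PATTERN, CASES)
print("all cases behave as expected" if not failures else failures)
```

The last case records a known gap: elements missing from the alternation are silently skipped, so the list needs to be extended before the search is run on a larger scope.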