Detecting a zero length characters in Oxygen Editor!

Posted: Wed Dec 11, 2019 6:24 pm
by mu258770
Hi team,

We are using Oxygen XML Author 20.1 Eclipse plugin version.

We need to identify whether any zero-length (zero-width) characters are present in the opened topic/map.

For example, when we copy content from another editor into Oxygen, a zero-length character (e.g. U+200B, the zero-width space) sometimes gets inserted. It is not visible in the editor because the character has no width, but it shows up if we convert the content to ANSI in Notepad++.

Is there a way to detect these characters in Oxygen and either remove them or at least show a warning to the user when they are detected? Maybe a Schematron warning?

Please suggest.

Regards,
Shabeer

Re: Detecting a zero length characters in Oxygen Editor!

Posted: Thu Dec 12, 2019 11:38 am
by tavy
Hello Shabeer,

You can create a Schematron that detects zero-length characters and shows a warning when one is found. You can also add quick fix actions that, for example, replace the character with a space or delete it. The Schematron can look something like this:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt3"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
 <sch:pattern>
     <!-- Check every text node in the document -->
     <sch:rule context="text()">
         <!-- &#8203; is U+200B ZERO WIDTH SPACE -->
         <sch:report test="matches(., '&#8203;')" sqf:fix="replace delete" role="warn">
            The text contains zero-length characters
         </sch:report>
         
         <!-- Quick fix 1: replace each zero-length character with a space -->
         <sqf:fix id="replace">
             <sqf:description>
                 <sqf:title>Replace the zero-length characters with a space</sqf:title>
             </sqf:description>
             <sqf:stringReplace regex="&#8203;" select="' '"/>
         </sqf:fix>
         
         <!-- Quick fix 2: no replacement is given, so each match is removed -->
         <sqf:fix id="delete">
             <sqf:description>
                 <sqf:title>Delete the zero-length characters</sqf:title>
             </sqf:description>
             <sqf:stringReplace regex="&#8203;"/>
         </sqf:fix>
     </sch:rule>
 </sch:pattern>
</sch:schema>
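
Content pasted from other tools may carry other invisible characters besides U+200B. As a hedged sketch (not something tested in this thread), the same rule can use a character class instead of a single character. The list of code points (U+200B, U+200C, U+200D, U+2060, U+FEFF) and the fix id `deleteInvisible` are assumptions for illustration only:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt3"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
    <sch:pattern>
        <sch:rule context="text()">
            <!-- Assumed list: U+200B, U+200C, U+200D, U+2060, U+FEFF.
                 Adjust it to the characters you actually want to flag. -->
            <sch:report test="matches(., '[&#x200B;&#x200C;&#x200D;&#x2060;&#xFEFF;]')"
                sqf:fix="deleteInvisible" role="warn">
                The text contains invisible zero-width characters
            </sch:report>

            <!-- Hypothetical fix id; removes every character in the class -->
            <sqf:fix id="deleteInvisible">
                <sqf:description>
                    <sqf:title>Delete the zero-width characters</sqf:title>
                </sqf:description>
                <sqf:stringReplace regex="[&#x200B;&#x200C;&#x200D;&#x2060;&#xFEFF;]"/>
            </sqf:fix>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Keep in mind that U+200C and U+200D are legitimate in some scripts and in emoji sequences, so you may want to drop them from the list if your content uses them.
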
Best Regards,
Octavian

Re: Detecting a zero length characters in Oxygen Editor!

Posted: Fri Dec 13, 2019 2:38 pm
by mu258770
Hi Octavian.

That works well! Thank you.

One more thing: is it possible to check for any characters above 7F (basically accepting only US-ASCII) and remove all occurrences of them? Or do we need to specify each exact character to match, as in your response?

Regards,
Shabeer

Re: Detecting a zero length characters in Oxygen Editor!

Posted: Fri Dec 13, 2019 5:51 pm
by tavy
Hello Shabeer,

You can also define a range of characters that are not allowed, and you can add a quick fix that removes all disallowed characters from the document. In that case I think the Schematron could look something like this:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt3"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
 <sch:pattern>
     <sch:rule context="text()">
         <!-- The range &#127;-&#65533; covers U+007F through U+FFFD -->
         <sch:report test="matches(., '[&#127;-&#65533;]')" sqf:fix="delete deleteAll" role="warn">
            The text contains characters that are not allowed
         </sch:report>
         
         <!-- Quick fix 1: clean only the text node that triggered the warning -->
         <sqf:fix id="delete">
             <sqf:description>
                 <sqf:title>Delete the disallowed characters from the current text</sqf:title>
             </sqf:description>
             <sqf:stringReplace regex="[&#127;-&#65533;]"/>
         </sqf:fix>
         
         <!-- Quick fix 2: match="//text()" applies the replacement to every text node -->
         <sqf:fix id="deleteAll">
             <sqf:description>
                 <sqf:title>Delete all disallowed characters from the document</sqf:title>
             </sqf:description>
             <sqf:stringReplace match="//text()" regex="[&#127;-&#65533;]"/>
         </sqf:fix>
     </sch:rule>
 </sch:pattern>
</sch:schema>
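
Note that the range above ends at U+FFFD, so characters beyond the Basic Multilingual Plane (most emoji, for example) would not be reported. A possible variation, offered as a sketch rather than a tested recipe, is to negate the set of characters you do allow. The allowed set below (tab, line feed, carriage return, and printable ASCII from U+0020 to U+007E) and the fix id `deleteNonAscii` are assumptions:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt3"
    xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
    <sch:pattern>
        <sch:rule context="text()">
            <!-- Assumption: only tab, LF, CR and U+0020 to U+007E are allowed;
                 the negated class matches every other character -->
            <sch:report test="matches(., '[^\t\n\r&#x20;-&#x7E;]')"
                sqf:fix="deleteNonAscii" role="warn">
                The text contains characters outside the allowed ASCII set
            </sch:report>

            <!-- Hypothetical fix id; removes every character outside the allowed set -->
            <sqf:fix id="deleteNonAscii">
                <sqf:description>
                    <sqf:title>Delete the characters outside the allowed ASCII set</sqf:title>
                </sqf:description>
                <sqf:stringReplace regex="[^\t\n\r&#x20;-&#x7E;]"/>
            </sqf:fix>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Tab, line feed and carriage return stay in the allowed set because indentation and line breaks inside elements are text nodes too; without them the rule would flag nearly every element.
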
Best Regards,
Octavian