How to find special characters?

fschmitt
Posts: 9
Joined: Mon Jul 20, 2015 3:31 pm


Post by fschmitt »

One of the documents I'm editing in Oxygen 21.1 triggers the "Special characters detected" warning message. I would like to search for those characters to check whether they are "legitimate" document content or the result of a transformation problem (it's a converted docx, so it can contain a variety of ugly content... :roll: ).

So, is there a way (e.g. a regex search) to detect all characters / Unicode control codes that may trigger the "Special characters detected" message?

The only "foreign" characters I was able to identify in the document were some Greek letters, but since Greek doesn't require bidirectional text layout, I doubt they are responsible for triggering the message.
Radu
Posts: 9055
Joined: Fri Jul 09, 2004 5:18 pm

Re: How to find special characters?

Post by Radu »

Hi,

I'm afraid we do not yet have a way in the application to signal what those complex characters are. Usually this issue is triggered when characters combine (the font may render one symbol for multiple characters). This means, for example, that when you move the cursor with the arrow keys, special code is triggered to jump over the combining characters as if they were one symbol.
Enabling the support for complex characters is usually associated with a slowdown when opening and editing the document.
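
For anyone who wants to scan a file outside of Oxygen, here is a minimal Python sketch of the general idea. It is not the code Oxygen or the plugin uses, and the exact set of characters that triggers the warning is an assumption here: the sketch simply flags combining marks, characters with a right-to-left or explicit bidirectional class, and invisible format controls. The function name `find_suspect_chars` and the `BIDI_CLASSES` set are made up for this illustration.

```python
import unicodedata

# Assumption: these Unicode bidirectional classes are treated as "suspect".
# This is an illustrative choice, not Oxygen's actual rule.
BIDI_CLASSES = {
    "R", "AL", "AN",                    # right-to-left letters and Arabic numbers
    "RLE", "RLO", "LRE", "LRO", "PDF",  # explicit embeddings and overrides
    "RLI", "LRI", "FSI", "PDI",         # explicit isolates
}


def find_suspect_chars(text):
    """Return (line, column, code point, name, reason) for each suspect character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            category = unicodedata.category(ch)
            reason = None
            if unicodedata.combining(ch) != 0 or category in ("Mn", "Mc", "Me"):
                reason = "combining mark"
            elif unicodedata.bidirectional(ch) in BIDI_CLASSES:
                reason = "bidirectional"
            elif category == "Cf":
                reason = "format control"
            if reason:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col, f"U+{ord(ch):04X}", name, reason))
    return hits


if __name__ == "__main__":
    # "e" followed by a combining acute accent, then plain Greek letters.
    sample = "cafe\u0301\n\u03b1\u03b2\u03b3"
    for hit in find_suspect_chars(sample):
        print(hit)
```

Plain Greek letters are not reported by this sketch, since they are ordinary left-to-right letters without combining behaviour.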

There is an Oxygen GitHub project containing lots of sample plugins which you can download as a zip:

https://github.com/oxygenxml/wsaccess-j ... le-plugins

I just uploaded there a plugin folder called "determineComplexLayoutChars" which can be copied to the "OXYGEN_INSTALL_DIR\plugins" folder. After you restart Oxygen, the plugin adds a new contextual menu action that is available when an XML document is opened in the Text editing mode. This new "Determine Complex Layout Chars" action should run a detection and then report all detected characters in the results view.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
fschmitt
Posts: 9
Joined: Mon Jul 20, 2015 3:31 pm

Re: How to find special characters?

Post by fschmitt »

Thanks a lot @Radu - the plugin works great and I was able to solve the issue with your help :D
patjporter
Posts: 53
Joined: Sat May 22, 2021 6:04 pm

Re: How to find special characters?

Post by patjporter »

Hello, can you please provide specific instructions on how to download these files and install them on a Mac?
Thank you,
Patrick
Radu
Posts: 9055
Joined: Fri Jul 09, 2004 5:18 pm

Re: How to find special characters?

Post by Radu »

Hi Patrick,

Download a zip containing the entire project contents:
https://github.com/oxygenxml/wsaccess-j ... master.zip

Inside the zip there are several folders; each folder is an Oxygen plugin.
Copy, for example, the "determineComplexLayoutChars" folder to the "OXYGEN_INSTALL_DIR\plugins" folder and then restart Oxygen.

Open an XML document in the Text editing mode and right click inside it; the contextual menu now contains a new item, "Determine Complex Layout Chars".
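
If you prefer to script the copy step, a small Python sketch could look like the one below. It assumes the zip has already been downloaded and extracted; both paths in the usage part are placeholders that you need to replace, and the helper name `install_plugin` is made up for this example. On a Mac the target is written with forward slashes, i.e. OXYGEN_INSTALL_DIR/plugins.

```python
import shutil
from pathlib import Path


def install_plugin(extracted_project_dir, oxygen_install_dir,
                   plugin_name="determineComplexLayoutChars"):
    """Copy one plugin folder from the extracted project into Oxygen's plugins folder."""
    source = Path(extracted_project_dir) / plugin_name
    target = Path(oxygen_install_dir) / "plugins" / plugin_name
    if not source.is_dir():
        raise FileNotFoundError(f"Plugin folder not found: {source}")
    # dirs_exist_ok (Python 3.8+) allows re-running the copy over an existing folder.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Placeholder paths, replace them with your own locations.
    installed = install_plugin("/path/to/extracted/project", "/path/to/OXYGEN_INSTALL_DIR")
    print(f"Copied plugin to {installed}; restart Oxygen to load it.")
```

Depending on where Oxygen is installed, you may need administrator rights to write into the plugins folder.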

Regards,
Radu