Preventing non-unicode symbols

<oXygen/> general issues.
kim
Posts: 11
Joined: Fri Sep 09, 2016 9:38 pm

Post by kim » Tue Aug 13, 2019 7:46 pm

Is it possible to prevent people from selecting non-unicode symbols from the oXygen character map?

There seems to be a discrepancy between what the character map in oXygen shows and what it inserts when non-unicode symbols are involved. For example, if I insert the eta symbol from the Symbol font in the character map, it shows correctly in the character map but renders in oXygen as a couple of waves. (In HTML it renders as the blank square used for unsupported characters.) At least for the Symbol font, we've found that the characters on this list behave this way: https://www.fileformat.info/info/unicod ... nicode.htm.

I'm wondering how we can keep users from doing this. For example, can we remove fonts or characters from the character map so that it only displays those that are unicode? Or can oXygen warn/error for non-unicode characters? Or maybe there is another solution aside from just communicating a preferred font?

Thanks,
Kim

Radu
Posts: 6541
Joined: Fri Jul 09, 2004 5:18 pm

Re: Preventing non-unicode symbols

Post by Radu » Wed Aug 14, 2019 7:34 am

Hi Kim,

All the symbols we display in the Oxygen Character Map are Unicode symbols. The Unicode standard maps each character from the world's languages to a unique integer value.
The problem is that most fonts cannot render the entire range of Unicode characters. This is why the Character Map dialog has a "Font" combo box: users can select the font that will be used (for editing and maybe also for publishing) and see which characters are available in it and can be rendered by that particular font.

About this remark:
"I'm wondering how we can keep users from doing this. For example, can we remove fonts or characters from the character map so that it only displays those that are unicode? Or can oXygen warn/error for non-unicode characters? Or maybe there is another solution aside from just communicating a preferred font?"
I'm not sure what you mean by non-Unicode. Do you mean non-ASCII? ASCII is quite a small character range and most of its symbols are already available on the keyboard. How about instructing your users to select a certain font whenever they use the Character Map dialog, so that they only see the characters available in that font?
There is also the possibility of creating custom validation rules using Schematron to check whether any characters used fall above a certain range, but what range would that be?
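To help answer the "what range" question, one option is to first list which non-ASCII characters a document actually contains. The following is a minimal Python sketch, not an Oxygen feature: the function name `describe_non_ascii` and the sample string are hypothetical, and the sample's U+F068 is only an assumed stand-in for the kind of code point a symbol-encoded font might insert. It prints each non-ASCII character's code point and Unicode name, so unnamed or unexpected code points stand out.

```python
import unicodedata


def describe_non_ascii(text):
    """Return (character, code point, Unicode name) for each non-ASCII character."""
    report = []
    for ch in text:
        if ord(ch) > 0x7F:  # outside the 7-bit ASCII range
            # Code points without an assigned name (for example private-use
            # ones) fall back to the placeholder string.
            name = unicodedata.name(ch, "<no name>")
            report.append((ch, f"U+{ord(ch):04X}", name))
    return report


if __name__ == "__main__":
    # Hypothetical sample: a standard Greek eta and an assumed private-use code point.
    sample = "eta: \u03b7, other: \uf068"
    for _, code_point, name in describe_non_ascii(sample):
        print(code_point, name)
```

Characters reported with a proper name (such as GREEK SMALL LETTER ETA) are the ones most fonts are likely to cover; the ones reported without a name are candidates for a prohibited range.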

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com

kim
Posts: 11
Joined: Fri Sep 09, 2016 9:38 pm

Re: Preventing non-unicode symbols

Post by kim » Wed Aug 14, 2019 11:05 pm

Hi, Radu.

If you select "Symbol" from the Font combo box, you get, as options, the non-unicode characters that the Symbol font supports. So you are seeing characters (with a correct preview in the Character Map) that are not available in Oxygen for editing (or publishing) because they are not valid unicode.

Sure, we can instruct users to select a certain (fully unicode) font (and hope they remember), but I'm wondering if, since Oxygen uses unicode, you could suppress non-unicode characters in the Character Map. For example, if, in the Character Map, you select Symbol in the font combo box, and insert Kappa, it shows up in Oxygen as a flat-mouthed face (but has the correct preview in the Character Map). I presume this is because it is a non-unicode character in the Symbol font.

Thanks,
Kim

Radu
Posts: 6541
Joined: Fri Jul 09, 2004 5:18 pm

Re: Preventing non-unicode symbols

Post by Radu » Fri Aug 16, 2019 8:09 am

Hi Kim,

As I stated before, the Unicode standard contains all the letters, symbols and characters in the world. So any symbol (even the ones rendered explicitly with the "Symbol" font) is also Unicode. There is no "non-unicode" symbol: any symbol, including the ones not properly rendered by various fonts, is part of the Unicode standard.
In the Oxygen Preferences->"Appearance / Fonts" page there is an "Author default font" setting. This is the font Oxygen uses by default to render the content in the Author visual editing mode. Separate fonts are used to render the published content, in either the HTML or PDF-based outputs. So yes, it is possible that your end users select a certain font in the "Character Map" and insert a character which that font renders properly, while the fonts used by Oxygen for editing and by the web browser or PDF reader for the published output do not support that character and show an empty square instead.
Ideally you should know exactly what font you are using for the published output, because the rendering of the characters in the published output is the most important aspect to get right. Then go to the Oxygen Preferences->"Appearance / Fonts" page, set the same font as the "Author default font", and instruct the end users to select the same font in the "Character Map" when browsing for characters to use.
You can also try to create custom Schematron validation rules and share them with the team:

https://blog.oxygenxml.com/2017/02/shar ... rules.html

Such a validation rule could for example prohibit the use of certain character ranges.
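As an illustration of the range check such a rule would perform, here is a minimal Python sketch. It is not Schematron and not an Oxygen API. The choice of range is an assumption that the thread does not confirm: the Private Use Area (U+E000 to U+F8FF), where symbol-encoded fonts are commonly mapped on some platforms. The names `PROHIBITED_RANGES` and `find_prohibited` are hypothetical.

```python
# Assumed policy: flag the Basic Multilingual Plane Private Use Area.
# Adjust or extend this list to match the ranges your team wants to prohibit.
PROHIBITED_RANGES = [(0xE000, 0xF8FF)]


def find_prohibited(text, ranges=PROHIBITED_RANGES):
    """Return (offset, code point) for every character inside a prohibited range."""
    hits = []
    for offset, ch in enumerate(text):
        cp = ord(ch)
        if any(low <= cp <= high for low, high in ranges):
            hits.append((offset, f"U+{cp:04X}"))
    return hits


if __name__ == "__main__":
    # Hypothetical content: a standard Greek kappa followed by a private-use code point.
    content = "kappa \u03ba and \uf06b"
    for offset, code_point in find_prohibited(content):
        print(f"offset {offset}: prohibited character {code_point}")
```

The same comparison, expressed as a Schematron assertion over text nodes, would let Oxygen report the problem during validation instead of in a separate script.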

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
