Special font for Unicode blocks

Posted: Sat Dec 24, 2011 8:21 am
by mcswell
We've been writing grammars in DocBook (v5), and I just downloaded the trial version of Oxygen Author to see whether it would work for us. I've hit a snag regarding fonts for one of our languages, Dhivehi. This language uses the Thaana script, which is in the Unicode Thaana block (U+0780 - U+07BF). We have a font which contains the characters in this block, but it does not contain ASCII or other characters we need for our Roman text. In other words, there is no one font that works for both ordinary text and Dhivehi text.

We embed Dhivehi text in one of two ways. The simplest is to assign an xml:lang='div' attribute to an element, such as the <para> or <phrase> element. A more complex method uses some tags which our DocBook customization defines, also in conjunction with the xml:lang='div' attribute. I'm assuming there's a way to tell OA to use our customized DocBook schema, but how do I tell it to *render* elements that have the xml:lang='div' attribute using our Thaana font? Do I have to modify a CSS?

BTW, the Thaana script is (like Arabic) displayed right-to-left. It would be nice if whatever solution there is to the font issue would automatically render the text right-to-left, although if need be we can use the dir='rtl' attr on elements containing Thaana script. We do also have grammars of languages that use Perso-Arabic script, and I suppose there are similar issues with those scripts, i.e. there isn't a single font that displays all the characters of the Unicode Arabic block plus all the Roman characters we use (some of which are in the extended Latin and IPA blocks).

I guess I should add that this is about on-screen display. We run our XML documents through dblatex and use XeTeX to format them, so there's no issue there (we don't use XSL-FO).

Mike Maxwell

Re: Special font for Unicode blocks

Posted: Tue Dec 27, 2011 6:36 pm
by sorin_ristache
Hello,

Set an appropriate font in the CSS stylesheet for each language: the default font (or another Latin font) for the ASCII fragments, the Thaana font for the Dhivehi text, and so on. Something like:

Code:

/* Bind the xml prefix so that xml|lang matches the xml:lang attribute. */
@namespace xml "http://www.w3.org/XML/1998/namespace";

*[xml|lang='div'] {
    font-family: Thaana;
}

Note, however, that right-to-left text is not supported yet in Author editing mode. In the current version, right-to-left editing is supported only in Text mode. We plan to add this feature to Author mode in a future version.


Regards,
Sorin

Re: Special font for Unicode blocks

Posted: Wed Dec 28, 2011 5:09 am
by mcswell
Thanks, I now have the CSS working so that my Thaana font is used for Dhivehi/Thaana text. However, it's displaying Dhivehi/Thaana text left-to-right, while it displays Arabic text right-to-left just fine. I didn't need to do anything to the CSS for Arabic; I don't know how this correct behavior for Arabic is wired in. (Cursor movement is wrong in Arabic, but at least it displays correctly. I assume the cursor movement is what you're referring to when you say "right-to-left text is not supported yet in Author editing mode.")

I've tried setting the Dhivehi <para>'s dir attr to 'rtl' (right-to-left) or 'rlo' (right-to-left override), and I've also tried setting the CSS direction property for Dhivehi text to 'rtl'. None of this seems to have any effect.

I also notice that when I edit the same file in text mode, for Dhivehi (but not for Arabic) the visible cursor is different from the text insertion point--the visible cursor will be at one end of the line, but when I insert characters, they show up at the other end of the line. I'm not too worried about that, because I don't see our folks using the text-based editor, but it does point up the fact that Dhivehi/Thaana script is treated very differently from Arabic script, despite the fact that both are right-to-left scripts.

Coming back to the Author editing mode, why does it display Arabic text right-to-left, but not Dhivehi/Thaana text? Is this perhaps a bug in Java? And why is it ignoring the XML @dir attribute?

I can upload my CSS file and the Dhivehi+Arabic DocBook file, if that would help.

Mike Maxwell

Re: Special font for Unicode blocks

Posted: Wed Dec 28, 2011 5:04 pm
by sorin_ristache
mcswell wrote:
> I've tried setting the Dhivehi <para>'s dir attr to 'rtl' (right-to-left) or 'rlo' (right-to-left override), and I've also tried setting the CSS direction property for Dhivehi text to 'rtl'. None of this seems to have any effect.

In Author editing mode only partial support is available for bidirectional text and only for drawing the characters, not editing them. The Thaana Unicode range is not supported yet. We plan to enhance this support in a future version and we will update this forum topic at that time.
mcswell wrote:
> I also notice that when I edit the same file in text mode, for Dhivehi (but not for Arabic) the visible cursor is different from the text insertion point--the visible cursor will be at one end of the line, but when I insert characters, they show up at the other end of the line. I'm not too worried about that, because I don't see our folks using the text-based editor, but it does point up the fact that Dhivehi/Thaana script is treated very differently from Arabic script, despite the fact that both are right-to-left scripts.

The Thaana Unicode range (U+0780–U+07BF) is not the same as the Arabic ones (U+0600–U+06FF, U+0750–U+077F, U+FB50–U+FDFF, etc.). The difference may also come from the different fonts used for these ranges. Rendering depends also on the correct character sizes reported by the font for each character. We will look into this problem in Text mode too.
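
For anyone who wants to check which of these blocks a given character falls in, here is a minimal Java sketch. It relies only on the standard `Character.UnicodeBlock.of` lookup; the class name `ScriptBlocks` and the helper names are made up for illustration, and only the three Arabic blocks listed above are checked (the "etc." is not covered).

```java
import java.lang.Character.UnicodeBlock;

// Illustrative helper (hypothetical names): classifies code points by Unicode block.
final class ScriptBlocks {

    /** True if the code point lies in the Thaana block (U+0780 to U+07BF). */
    static boolean isThaana(int codePoint) {
        return UnicodeBlock.of(codePoint) == UnicodeBlock.THAANA;
    }

    /** True if the code point lies in one of the three Arabic blocks named above. */
    static boolean isArabic(int codePoint) {
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        return block == UnicodeBlock.ARABIC
                || block == UnicodeBlock.ARABIC_SUPPLEMENT
                || block == UnicodeBlock.ARABIC_PRESENTATION_FORMS_A;
    }

    public static void main(String[] args) {
        // One sample from each range mentioned in the post.
        int[] samples = {0x0627, 0x0750, 0xFB50, 0x0780};
        for (int cp : samples) {
            System.out.printf("U+%04X arabic=%b thaana=%b%n", cp, isArabic(cp), isThaana(cp));
        }
    }
}
```

Block membership only says which script a character belongs to; it says nothing about how a particular Java version or font renders it.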

What font do you use? Please try also with the Arial Unicode MS font which has good support for the Arabic Unicode ranges.
mcswell wrote:
> Coming back to the Author editing mode, why does it display Arabic text right-to-left, but not Dhivehi/Thaana text? Is this perhaps a bug in Java?

The built-in Java support for bidirectional text is far from perfect but, as I said above, it also depends on the font metrics reported by the font itself. We will look into the problem.
mcswell wrote:
> I can upload my CSS file and the Dhivehi+Arabic DocBook file, if that would help.

Please send us some test files (CSS + DocBook XML with Dhivehi+Arabic).


Thank you,
Sorin

Re: Special font for Unicode blocks

Posted: Wed Dec 28, 2011 9:18 pm
by mcswell
> Rendering depends also on the correct character sizes
> reported by the font for each character. We will look
> into this problem in Text mode too.

I don't think that's the issue with Thaana script; the font size looks fine. It's just not being displayed right-to-left.

> What font do you use?

Mv Elaaf Normal, which I got here:
http://sites.google.com/site/iheckersit ... vehi-fonts
or here:
http://mvlinux.blogspot.com/2010/02/tha ... x-deb.html

> Please try also with the Arial Unicode MS font
> which has good support for the Arabic Unicode ranges.

Arabic works fine with lots of fonts. I use the SIL Scheherazade font, which has complete coverage of code points needed for e.g. Pashto. But Arabic is not the problem, Dhivehi is, and that's in the Thaana block of Unicode, which neither Arial Unicode nor the Scheherazade font covers.

> Please send us some test files (CSS + DocBook XML with
> Dhivehi+Arabic).

Will do.

Re: Special font for Unicode blocks

Posted: Fri Dec 30, 2011 5:08 am
by mcswell
Sorin wrote:
>> Please send us some test files (CSS + DocBook XML with
>> Dhivehi+Arabic).
to which I replied:
> Will do.

I uploaded the two files.

Re: Special font for Unicode blocks

Posted: Fri Dec 30, 2011 11:23 am
by adrian
Hi,

Unfortunately the technical support form from our website is broken and we did not receive the files.

Could you please send us the sample files to our support email address:
support@oxygenxml.com

Apologies for the inconvenience.

Regards,
Adrian

Re: Special font for Unicode blocks

Posted: Fri Dec 30, 2011 1:10 pm
by adrian
On another note,

We have performed a few tests with Dhivehi/Thaana text and it seems to be a Java problem. The Thaana Unicode range (0780–07BF) does not trigger the BIDI/RTL mode in Java, so Java usually seems to treat Thaana script as LTR.

However, we have discovered that if BIDI/RTL mode is triggered by some other means (e.g. Arabic characters) in the edited document (Text mode), the Dhivehi/Thaana text is also rendered correctly (RTL). So there is some inconsistency in this behavior.
The Author mode inherits this behavior, but only at element level (not document level), which is why the Arabic text has the correct orientation but the Thaana text does not. However, as mentioned before, there is no explicit RTL support in Author mode (hence the caret navigation/editing issues).

We have logged this to our issue tracking tool and we will continue investigating. Maybe we can somehow force RTL for this character range. We'll have to check if it's possible and also if it's safe to do so.
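
As a rough illustration of what "forcing RTL for this character range" could mean at the application level, here is a minimal Java sketch. It is not Oxygen code: the class `ThaanaBidiCheck` and its methods are hypothetical, and it assumes the application decides whether to enable bidi handling by calling `java.text.Bidi.requiresBidi`, which may not match how the editor actually works.

```java
import java.text.Bidi;

// Hypothetical helper, not Oxygen code: decides whether text should get bidi handling.
final class ThaanaBidiCheck {
    private static final int THAANA_START = 0x0780;
    private static final int THAANA_END = 0x07BF;

    /** True if any code point of the text falls in the Thaana block. */
    static boolean containsThaana(String text) {
        return text.codePoints().anyMatch(cp -> cp >= THAANA_START && cp <= THAANA_END);
    }

    /** Asks Java first, then falls back to an explicit Thaana range check. */
    static boolean needsBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length) || containsThaana(text);
    }

    public static void main(String[] args) {
        String thaana = "\u078B\u07A8\u0788\u07AC\u0780\u07A8"; // sample Thaana letters
        System.out.println("Thaana needs bidi: " + needsBidi(thaana));
        System.out.println("ASCII needs bidi:  " + needsBidi("plain text"));
    }
}
```

The explicit range check makes the result independent of what a given Java version reports for Thaana, which also makes it a blunt instrument: it does nothing about how the text is actually laid out once bidi handling is enabled.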

Regards,
Adrian

Re: Special font for Unicode blocks

Posted: Tue Jan 03, 2012 2:57 am
by mcswell
adrian wrote:
> Hi,
>
> Unfortunately the technical support form from our website is broken and we did not receive the files.
>
> Could you please send us the sample files to our support email address:
> support@oxygenxml.com

I emailed the files--somehow I didn't get an email notification when you posted this reply, so I'm only now sending them.

Re: Special font for Unicode blocks

Posted: Tue Jan 03, 2012 3:00 am
by mcswell
adrian wrote:
> We have performed a few tests with Dhivehi/Thaana text and it seems to be a Java problem.

I figured. I don't suppose Oracle is likely to fix this... Thanks for looking at it!

Re: Special font for Unicode blocks

Posted: Tue Jan 03, 2012 1:27 pm
by adrian
We will have to double-check on our side, and if the bug is indeed on their side (Oracle/Java), we will forward it to them.

Either way, we will attempt to find a workaround for this on our side.

Regards,
Adrian

Re: Special font for Unicode blocks

Posted: Thu Jan 12, 2012 3:33 pm
by mcswell
Is this by chance one of the "BIDI-text-related issues [that] have been fixed" in today's release?

Re: Special font for Unicode blocks

Posted: Thu Jan 12, 2012 4:01 pm
by adrian
Hi,

Today's release, v13.2, fixes some generic problems that Oxygen had with RTL/BIDI regarding orientation and drawing of text (in Text mode).
This bugfix is not related to the Dhivehi issue.

Regards,
Adrian

Re: Special font for Unicode blocks

Posted: Tue Jan 24, 2012 6:44 am
by mcswell
Any success tracking down this bug with Dhivehi/Thaana? Do you know for sure whether it's a bug in Java itself?

Re: Special font for Unicode blocks

Posted: Tue Jan 24, 2012 5:56 pm
by sorin_ristache
Hello,

We double-checked and can confirm that the Java support for bidirectional text is not triggered by Dhivehi/Thaana text alone, as described by Adrian (see the quote below). Bidirectional support is triggered for text that includes Dhivehi/Thaana only if the same text also includes code points from some other right-to-left Unicode range, for example Arabic. Hopefully Oracle will address this problem in a future release of their Java virtual machine, as implementing a fix or workaround for this limitation in the Oxygen code would most probably take us longer than the Oracle Java team would need to fix it in their standard libraries.
adrian wrote:
> We have performed a few tests with Dhivehi/Thaana text and it seems to be a Java problem. The Thaana Unicode range (0780–07BF) does not trigger the BIDI/RTL mode in Java, so Java usually seems to treat Thaana script as LTR.
>
> However, we have discovered that if BIDI/RTL mode is triggered by some other means (e.g. Arabic characters) in the edited document (Text mode), the Dhivehi/Thaana text is also rendered correctly (RTL). So there is some inconsistency in this behavior.
> The Author mode inherits this behavior, but only at element level (not document level), which is why the Arabic text has the correct orientation but the Thaana text does not.
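
To see how a particular Java installation behaves, a small probe along the following lines could be run outside Oxygen. This is only a diagnostic sketch: the class name `BidiProbe` is made up, the sample strings are arbitrary Thaana and Arabic letters, and `Bidi.requiresBidi` is assumed to be a reasonable stand-in for the check that triggers BIDI/RTL mode. The results may differ between Java versions, so none are listed here.

```java
import java.text.Bidi;

// Diagnostic sketch (hypothetical class name): reports what the running JVM does.
final class BidiProbe {

    /** Wraps Bidi.requiresBidi for a whole string. */
    static boolean requiresBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    /** Directionality class that Character reports for the first code point. */
    static byte firstDirectionality(String text) {
        return Character.getDirectionality(text.codePointAt(0));
    }

    public static void main(String[] args) {
        String thaana = "\u078B\u07A8\u0788\u07AC\u0780\u07A8"; // Thaana letters only
        String arabic = "\u0639\u0631\u0628\u064A";             // Arabic letters only
        System.out.println("java.version          = " + System.getProperty("java.version"));
        System.out.println("Thaana only           = " + requiresBidi(thaana));
        System.out.println("Arabic only           = " + requiresBidi(arabic));
        System.out.println("Thaana + Arabic       = " + requiresBidi(thaana + " " + arabic));
        System.out.println("Thaana directionality = " + firstDirectionality(thaana));
    }
}
```

The last line prints the numeric directionality constant from `java.lang.Character`, which can be compared against `Character.DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC` and `Character.DIRECTIONALITY_LEFT_TO_RIGHT`.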

Regards,
Sorin

Re: Special font for Unicode blocks

Posted: Tue Jan 24, 2012 10:34 pm
by mcswell
Thanks! I can imagine that this is not something you'd want to create a work-around for; I don't imagine Dhivehi is something most of your users work with.

Has the Java bug been reported to Oracle? (I presume that's who controls the bug list for Java) Do they have an open bug tracking list, i.e. is there a way I can track the bug so I know when it's been fixed in Java? (like maybe here: http://bugs.sun.com/bugdatabase/)

Re: Special font for Unicode blocks

Posted: Wed Jan 25, 2012 4:18 pm
by sorin_ristache
Dear Mike Maxwell,

I filed a bug report in the Oracle bug database. They should make the report public on their website within a few days, once they have reviewed, evaluated and reproduced the problem on their test machines. I will update this discussion with the URL of the public bug.


Regards,
Sorin

Re: Special font for Unicode blocks

Posted: Wed Jan 25, 2012 5:07 pm
by mcswell
Thank you!!!!!!!!

Re: Special font for Unicode blocks

Posted: Tue Feb 21, 2012 10:55 am
by sorin_ristache
Hello,
mcswell wrote:
> Has the Java bug been reported to Oracle? (I presume that's who controls the bug list for Java) Do they have an open bug tracking list, i.e. is there a way I can track the bug so I know when it's been fixed in Java? (like maybe here: http://bugs.sun.com/bugdatabase/)

Oracle has not responded in any way to our bug report in the month since we filed it. They have not confirmed the bug, they have not said it is not a bug, and they have not requested additional data.


Regards,
Sorin

Re: Special font for Unicode blocks

Posted: Wed Feb 22, 2012 6:33 am
by mcswell
Thanks for the update!

Is it the case that they don't post submitted bugs until they confirm them? I don't see anything in their database about Thaana or Dhivehi (or other spellings of these names).

Re: Special font for Unicode blocks

Posted: Wed Feb 22, 2012 9:47 am
by sorin_ristache
I don't know their current policy, but in the past they usually either confirmed a bug or requested more data when a bug report was sent to them.

Other bugs about Thaana have already been confirmed.


Regards,
Sorin