
Re: [xsl] BIDI problem in XSL-FO

Subject: Re: [xsl] BIDI problem in XSL-FO
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 3 May 2016 15:58:52 -0000

As it happens I just implemented some code to generate text-level analysis
based on configured character ranges.

The generated template looks like this:

<xsl:template match="text()" mode="epub:textToCharSet-ja_jp">
      <xsl:param name="doDebug" as="xs:boolean" tunnel="yes" select="false()"/>
      <!-- Handle language ja_jp-->
      <xsl:if test="$doDebug">
         <xsl:message>+ [DEBUG] epub:textToCharSet-ja_jp:
text="<xsl:value-of select="."/>"</xsl:message>
      </xsl:if>
      <xsl:analyze-string select="." regex="([c-o>]+)">
         <xsl:matching-substring>
            <xsl:sequence select="."/>
         </xsl:matching-substring>
         <xsl:non-matching-substring>
            <span class="non-native-text">
               <xsl:sequence select="."/>
            </span>
         </xsl:non-matching-substring>
      </xsl:analyze-string>
</xsl:template>

In this case I'm identifying text *not* in the national language in
question, but the same approach can be applied to other business logic.

In an earlier version of this code I had multiple groups in the regular
expression and used a choice group to determine which group had matched by
checking each group to see if it was empty and using the one that was not.
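
The multi-group variant described above might look roughly like the
following. This is only a minimal sketch, not the earlier code itself: the
mode name "classify-text", the class names, and the two character ranges
(Basic Latin letters, and Hiragana plus Katakana) are assumptions chosen
for illustration. Because the two alternatives cannot both match, exactly
one regex-group() is non-empty for each match.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name; the character ranges are illustrative only -->
  <xsl:template match="text()" mode="classify-text">
    <!-- Group 1: Basic Latin letters; group 2: Hiragana and Katakana -->
    <xsl:analyze-string select="."
        regex="([A-Za-z]+)|([&#x3040;-&#x30FF;]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- A group that did not take part in the match is the empty string -->
          <xsl:when test="regex-group(1) != ''">
            <span class="latin"><xsl:value-of select="regex-group(1)"/></span>
          </xsl:when>
          <xsl:when test="regex-group(2) != ''">
            <span class="kana"><xsl:value-of select="regex-group(2)"/></span>
          </xsl:when>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text outside both ranges passes through unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The template only fires when text nodes are processed in that mode, for
example via <xsl:apply-templates select="text()" mode="classify-text"/>.
Generating one single-group template per language, as in the code above,
avoids the xsl:choose entirely.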



Eliot Kimber, Owner
Contrext, LLC

On 5/3/16, 10:42 AM, "Michael Müller-Hillebrand mmh@xxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>Hi Tony,
>Wow, what an interesting tool this is:
>Unfortunately, in my case the parentheses are likely to be just regular
>text and I have no direct way of knowing whether they surround Arabic or
>Western text (other than trying to find some all-purpose magic XPath
>analyzing basically every text() node). But the content inside the
>parentheses is tagged as non-translatable and I can take advantage of that:
><p>ARABIC <nt>Brand name</nt> (<nt>Former name</nt>) TEXT.</p>
>By playing around with the tool (and without proper understanding of the
>rules) I find some options that would make the parentheses correct, but
>the preceding or following Arabic text will be ordered in the wrong way.
>I have the impression that direction control characters in this situation
>do not work as well as <fo:bidi-override> would. Unfortunately I have not
>heard back whether the presentation as
>.TXET (Former name) Brand name CIBARA
>is accepted by the client.
>- Michael
>BTW: I hope this is still on topic enough. That's why I mentioned XPath.
>> On 03.05.2016 at 14:21, Tony Graham tgraham@xxxxxxxxxxxxx wrote:
>> tldr: Put &#x200E; after the ')'.
>> As Michael notes below, some characters, such as Latin letters, have a
>> 'strong' directionality, and some have a 'weak' or 'neutral'
>> directionality. The closing ')' is a 'neutral', and because it's at the
>> end of the string, it takes the 'embedding direction' [5], which is RTL
>> in Michael's example. You can see this with the bidi utility at
