[xsl] Support For Automatic Thai Word Breaking In XSL-FO
Subject: [xsl] Support For Automatic Thai Word Breaking In XSL-FO
From: "W. Eliot Kimber" <eliot@xxxxxxxxxx>
Date: Tue, 29 Jan 2002 09:36:32 -0600
I have to eventually support the composition of Thai documents. The primary challenge I see there is doing automatic word breaking of Thai. As I understand it, the Thai language does not have a well-defined notion of word and therefore Thai as normally written may not have enough break points to allow lines to be properly flowed.

In my research into the issue I've found some software (written for TeX) that does automatic line breaking but I didn't find anything that had been integrated with any XSLT or XSL-FO processors. As far as I can discover, MS Word is the main non-TeX tool that provides acceptable Thai word breaking.

My question: has anybody integrated any Thai word breaking algorithms into an XSL context?

In looking at the free code that's out there, it looks like it wouldn't be too hard to extend Saxon, for example, to apply the word breaking algorithm to text nodes when xml:lang="th".

It's not enough to do a pre-process on the XML document using the existing code because the Thai characters may be represented as numeric character references or character entities and the existing code expects some form of Unicode or Thai code page encoding. Thus, the algorithms would need to be applied post-parse.

Thanks,

Eliot Kimber
ISOGEN International, LLC

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
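
To illustrate the post-parse approach described in the message, here is a minimal Python sketch. It is not the TeX software or a Saxon extension mentioned above. It assumes that a zero-width space (U+200B) is an acceptable way to mark a break opportunity for the downstream formatter, and it stands in a toy greedy longest-match segmenter with a two-word dictionary for a real Thai word breaking algorithm. The names `TOY_WORDS`, `segment`, `mark_breaks`, and `add_breaks` are hypothetical. Because the XML parser resolves numeric character references before the code sees the text, the segmenter only ever deals with decoded Unicode strings.

```python
import xml.etree.ElementTree as ET

ZWSP = "\u200b"  # zero-width space, assumed here to signal a break opportunity
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Toy dictionary for illustration only: the Thai words for "language" and "Thai".
# A real segmenter would need a full lexicon or a different algorithm.
TOY_WORDS = {"\u0e20\u0e32\u0e29\u0e32", "\u0e44\u0e17\u0e22"}


def segment(text, words=TOY_WORDS):
    """Greedy longest-match split; runs of unmatched characters stay together."""
    pieces, unknown, i = [], "", 0
    longest = max(len(w) for w in words)
    while i < len(text):
        match = None
        for n in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + n] in words:
                match = text[i:i + n]
                break
        if match is None:
            unknown += text[i]
            i += 1
        else:
            if unknown:
                pieces.append(unknown)
                unknown = ""
            pieces.append(match)
            i += len(match)
    if unknown:
        pieces.append(unknown)
    return pieces


def mark_breaks(text):
    """Join the segmented pieces with zero-width spaces."""
    return ZWSP.join(segment(text))


def add_breaks(elem, lang=None):
    """Walk the parsed tree and mark breaks in text whose inherited xml:lang is 'th'."""
    lang = elem.get(XML_LANG, lang)
    if lang == "th" and elem.text:
        elem.text = mark_breaks(elem.text)
    for child in elem:
        add_breaks(child, lang)
        # Tail text follows the child but belongs to this element's language.
        if lang == "th" and child.tail:
            child.tail = mark_breaks(child.tail)


# The Thai text is written as numeric character references on purpose:
# the parser decodes them, so the segmenter never sees the references.
SAMPLE = (
    '<doc><p xml:lang="th">&#3616;&#3634;&#3625;&#3634;'
    '&#3652;&#3607;&#3618;</p><p>plain</p></doc>'
)
root = ET.fromstring(SAMPLE)
add_breaks(root)
```

In an XSLT setting the same idea would presumably live in a processor extension function applied to text nodes, which is the Saxon route suggested in the message; whether a particular XSL-FO formatter honors U+200B as a break point would need to be checked against that formatter.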