[oXygen-user] Search for Characters in Unicode *range*

Tobias Fischer | pagina GmbH tobias.fischer at pagina-tuebingen.de
Fri Jun 24 03:17:40 CDT 2016


Hi Andreas,

Sure, this can be done with a basic regex query: [\u00D8-\u00F6]

And for your example: [\u0100-\u1F9FF]

Unfortunately, oXygen 18 seems to have a bug with this query (more
precisely, with 5-digit hex codes), as it also matches characters below
\u0100 (which is the code point following \u00FF).

However, you can also work with negation: [^\u0000-\u00FF]

This seems to work fine :)

Regards,
Tobias
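
The behaviour described above can be reproduced outside oXygen. The sketch below uses Python's re module, not oXygen's own engine, so it only illustrates the idea; the names find_chars and sample are made up for this example. In Python, as in Java's java.util.regex, \u consumes exactly four hex digits, so [\u0100-\u1F9FF] is read as the range U+0100..U+1F9F plus a literal "F". If oXygen's Find box is backed by Java's regex engine (an assumption here), that may be why characters below \u0100 appear to match, and \x{1F9FF} would be the Java spelling for a code point beyond U+FFFF.

```python
import re

# Closed range U+00D8..U+00F6, the first query from the reply.
LATIN1_RANGE = re.compile(r"[\u00D8-\u00F6]")

# Negated class: every character above U+00FF, including code points
# beyond U+FFFF.
ABOVE_LATIN1 = re.compile(r"[^\u0000-\u00FF]")

# \u takes exactly four hex digits, so this class is really the range
# U+0100..U+1F9F plus a literal "F".
FIVE_DIGIT_ATTEMPT = re.compile(r"[\u0100-\u1F9FF]")

# Python spells code points beyond U+FFFF as \U plus eight hex digits.
EXPLICIT_RANGE = re.compile(r"[\u0100-\U0001F9FF]")


def find_chars(pattern, text):
    """Return (offset, code point label) for each matching character."""
    return [(m.start(), f"U+{ord(m.group()):04X}") for m in pattern.finditer(text)]


# "F", e-acute (U+00E9), a-macron (U+0101) and an emoji (U+1F600).
sample = "F\u00E9 \u0101 \U0001F600"

if __name__ == "__main__":
    for name, pattern in [
        ("range", LATIN1_RANGE),
        ("negation", ABOVE_LATIN1),
        ("five-digit attempt", FIVE_DIGIT_ATTEMPT),
        ("explicit range", EXPLICIT_RANGE),
    ]:
        print(name, find_chars(pattern, sample))
```

With this sample, the five-digit attempt reports the plain "F" at offset 0 and misses the emoji, while the negated class and the explicit \U range both report only the two characters above U+00FF.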

Tobias Fischer
XML and E-Book Development

Phone: +49 (0)7071 9876-44 · Fax: -22
Mail: tobias.fischer at pagina-tuebingen.de

pagina GmbH - Publication Technologies
Herrenberger Straße 51 | D-72070 Tübingen
www.pagina-online.de | www.parsx.de

Commercial Register Stuttgart - HRB 380249
Managing Director: Tobias Ott

On 24.06.2016 at 09:50, Andreas Wagner wrote:
> Dear all,
>
> In order to make sure that we have caught all special characters in an 
> externally transcribed TEI/XML file, I would like to search for all 
> characters above Unicode code point 0x00FF. Can this be done in the 
> Regular Expression Find box? (I found the search for single Unicode 
> code points with \u, \x etc., but can't figure out if this can be used 
> to search for characters (not) in code point ranges.)
>
> Thanks for any suggestion,
>
> Andreas
>
>
>
