how to use Unicode character classes and to validate

Posted: Wed Sep 03, 2008 10:23 pm
by osprofi
Hello Oxygen XML users

I am trying to use Unicode character classes in an XML Schema as follows:

For English characters, this works and validates:

<xs:simpleType name="enChars_v0.3">
    <xs:restriction base="xs:normalizedString">
        <xs:pattern value='[\p{IsBasicLatin}]+'/>
    </xs:restriction>
</xs:simpleType>

For German characters, this does not work and fails validation:

<xs:simpleType name="deChars_v0.3">
    <xs:restriction base="xs:normalizedString">
        <xs:pattern value="[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+"/>
    </xs:restriction>
</xs:simpleType>

Description: InvalidRegex: Pattern value '[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+' is not a valid regular expression. The reported error was: 'Unknown property.'.

What am I doing wrong here? I am using the Academic version of Oxygen XML.

Thank you in advance for your help.

Best regards

Peter

Re: how to use Unicode character classes and to validate

Posted: Mon Sep 08, 2008 4:09 pm
by sorin_ristache
Hello Peter,
osprofi wrote: <xs:pattern value="[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+"/>
The block name in the second escape is misspelled in two ways. First, the correct name for this set of character codes is Latin-1Supplement, written without a space or underscore. Second, a block named CLASS is written as \p{IsCLASS}, not \p{InCLASS} as in your example. XML parsers are very strict about such mistakes. You have to use:

    <xs:pattern value="[\p{IsBasicLatin}\p{IsLatin-1Supplement}]+"/>
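
For context, here is a minimal sketch of a complete schema using the corrected pattern. The `greeting` element is a hypothetical name added only so there is something to validate; the type name and pattern are taken from the posts above.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <!-- Hypothetical test element, not part of the original schema -->
    <xs:element name="greeting" type="deChars_v0.3"/>

    <xs:simpleType name="deChars_v0.3">
        <xs:restriction base="xs:normalizedString">
            <!-- Block escapes take the form \p{IsBlockName},
                 with the block name written without spaces -->
            <xs:pattern value="[\p{IsBasicLatin}\p{IsLatin-1Supplement}]+"/>
        </xs:restriction>
    </xs:simpleType>

</xs:schema>
```

With this schema, an instance such as <greeting>Grüße aus München</greeting> should validate, since ü and ß fall in the Latin-1 Supplement block (U+0080 to U+00FF). Note that Basic Latin covers all of U+0000 to U+007F, so the pattern also admits digits, spaces and ASCII punctuation, not only letters.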
Regards,
Sorin

Re: how to use Unicode character classes and to validate

Posted: Tue Sep 09, 2008 2:09 pm
by osprofi
Hello Sorin

Thanks for your great help!
With your suggestion it all works fine now!

best regards

Peter