how to use Unicode character classes and to validate

osprofi
Posts: 18
Joined: Wed Sep 03, 2008 10:12 pm

Post by osprofi »

Hello Oxygen XML users

I am trying to use Unicode character classes in an XML Schema as follows:

For English characters, this works and validates:

<xs:simpleType name="enChars_v0.3">
<xs:restriction base="xs:normalizedString">
<xs:pattern value='[\p{IsBasicLatin}]+'/>
</xs:restriction>
</xs:simpleType>

For German characters, this does not work and does not validate:

<xs:simpleType name="deChars_v0.3">
<xs:restriction base="xs:normalizedString">
<xs:pattern value="[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+"/>
</xs:restriction>
</xs:simpleType>

Description: InvalidRegex: Pattern value '[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+' is not a valid regular expression. The reported error was: 'Unknown property.'.

What am I doing wrong here? I am using the Academic version of Oxygen XML.

Thank you in advance for your help.

Best regards

Peter
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Re: how to use Unicode character classes and to validate

Post by sorin_ristache »

Hello Peter,
osprofi wrote: <xs:pattern value="[\p{IsBasicLatin}\p{InLatin-1_Supplement}]+"/>
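The name of the character class is not spelled correctly. In an XML Schema regular expression, a Unicode block BLOCK is written as \p{IsBLOCK} (with a capital "Is"), not \p{InBLOCK} as in your example, and BLOCK is the Unicode block name with all whitespace removed. The block "Latin-1 Supplement" is therefore written Latin-1Supplement, with neither a space nor an underscore. XML parsers are very strict about such names, which is why the pattern was rejected with 'Unknown property.'. You have to use: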

    <xs:pattern value="[\p{IsBasicLatin}\p{IsLatin-1Supplement}]+"/>
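
For context, here is a minimal sketch of how the corrected pattern might sit in a complete schema. The type is the one from the question; the element name greeting is hypothetical and is there only so that the type can be tried out against an instance document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Basic Latin (U+0000 to U+007F) plus Latin-1 Supplement (U+0080 to U+00FF);
       the second block contains the German letters ä, ö, ü and ß. -->
  <xs:simpleType name="deChars_v0.3">
    <xs:restriction base="xs:normalizedString">
      <xs:pattern value="[\p{IsBasicLatin}\p{IsLatin-1Supplement}]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical element, used only to exercise the type above. -->
  <xs:element name="greeting" type="deChars_v0.3"/>

</xs:schema>
```

With this schema, an instance such as <greeting>Grüße aus Köln</greeting> should validate, while <greeting>Łódź</greeting> should be rejected, because Ł (U+0141) and ź (U+017A) belong to the Latin Extended-A block and lie outside both blocks named in the pattern.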
Regards,
Sorin
osprofi
Posts: 18
Joined: Wed Sep 03, 2008 10:12 pm

Re: how to use Unicode character classes and to validate

Post by osprofi »

Hello Sorin

Thanks for your great help!
With your proposal, it all works fine now!

best regards

Peter