
Xerces cos-nonambig schema validation error

Posted: Thu Jul 07, 2011 4:26 pm
by antialias
I'm struggling with an error message when validating an XSD, so I reproduced the same content model in a minimal example using a DTD, which validates fine.

Here's the DTD:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT ParaType1 (#PCDATA) >
<!ELEMENT ParaType2 (#PCDATA) >
<!ELEMENT List (ListItem+) >
<!ELEMENT ListItem ((ParaType1 | List)+ | (ParaType2 | List)+) >
Converting this DTD to XSD (using Oxygen's "Generate/Convert Schema" function) results in this XSD:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">

  <xs:element name="ParaType1" type="xs:string"/>

  <xs:element name="ParaType2" type="xs:string"/>

  <xs:element name="List">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="ListItem"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <xs:element name="ListItem">
    <xs:complexType>
      <xs:choice>
        <xs:choice maxOccurs="unbounded">
          <xs:element ref="ParaType1"/>
          <xs:element ref="List"/>
        </xs:choice>
        <xs:choice maxOccurs="unbounded">
          <xs:element ref="ParaType2"/>
          <xs:element ref="List"/>
        </xs:choice>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
This XSD does not validate. The error given is as follows:
E [Xerces] cos-nonambig: List and List (or elements from their substitution group) violate "Unique Particle Attribution". During validation against this schema, ambiguity would be created for those two particles.
I don't understand why something as simple as (A | C)+ | (B | C)+ should produce an error. I'm new to XSD, so I hope only a trivial change is needed. I would appreciate any help with this.

Re: Xerces cos-nonambig schema validation error

Posted: Thu Jul 07, 2011 4:59 pm
by george
Hello,

Please see http://www.w3.org/TR/xmlschema-1/#cos-nonambig

If you have a model like

Code:


(A | C)+ | (B | C)+
then when a parser encounters C as the first element, it cannot match it against exactly one particle: both C particles in your model match it, so the model is ambiguous.
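
To make this concrete with the element names from the schema above, here is a small instance document. It is only an illustration, and the text content is invented. When the validator reads the first List child of the outer ListItem, nothing seen so far tells it whether that List belongs to the (ParaType1 | List)+ branch or to the (ParaType2 | List)+ branch, and XML Schema requires that decision to be made without looking ahead:

```xml
<ListItem>
  <!-- This List matches the List particle in BOTH branches of the choice. -->
  <List>
    <ListItem>
      <ParaType1>nested item</ParaType1>
    </ListItem>
  </List>
  <!-- Only at this point is it clear that the ParaType2 branch was meant. -->
  <ParaType2>some text</ParaType2>
</ListItem>
```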

You can write it in an unambiguous way as below:

Code:


(A, (A|C)*) |
(B, (B|C)*) |
(C+, (
(A, (A|C)*) |
(B, (B|C)*)
)?
)
As you can see, these rewrites are not always easy, and in some cases it is impossible to obtain an unambiguous schema.

You can disable this check in the oXygen options [1], but that will not fix anything in the schema; it only means you will no longer see these errors.

[1]
Options -> Preferences -- XML / XML Parser, disable the ...schema-full-checking option.

Best Regards,
George

Re: Xerces cos-nonambig schema validation error

Posted: Thu Jul 07, 2011 6:01 pm
by antialias
Hi George. Thank you so much! I've converted your answer into XSD and now everything works as it's supposed to. Here's the code:

Code:

<xs:element name="ListItem">
  <xs:complexType>
    <xs:choice>
      <xs:sequence>
        <xs:element ref="ParaType1"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element ref="ParaType1"/>
          <xs:element ref="List"/>
        </xs:choice>
      </xs:sequence>
      <xs:sequence>
        <xs:element ref="ParaType2"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
          <xs:element ref="ParaType2"/>
          <xs:element ref="List"/>
        </xs:choice>
      </xs:sequence>
      <xs:sequence>
        <xs:element ref="List" minOccurs="1" maxOccurs="unbounded"/>
        <xs:choice minOccurs="0" maxOccurs="1">
          <xs:sequence>
            <xs:element ref="ParaType1"/>
            <xs:choice minOccurs="0" maxOccurs="unbounded">
              <xs:element ref="ParaType1"/>
              <xs:element ref="List"/>
            </xs:choice>
          </xs:sequence>
          <xs:sequence>
            <xs:element ref="ParaType2"/>
            <xs:choice minOccurs="0" maxOccurs="unbounded">
              <xs:element ref="ParaType2"/>
              <xs:element ref="List"/>
            </xs:choice>
          </xs:sequence>
        </xs:choice>
      </xs:sequence>
    </xs:choice>
  </xs:complexType>
</xs:element>
The complexity needed for such a simple task is something of a drawback of XSD, but then again XSD has many other benefits compared to DTD.

Thanks again for your help!

Re: Xerces cos-nonambig schema validation error

Posted: Thu Jul 07, 2011 11:11 pm
by george
You may also want to review
http://www.w3.org/TR/xml/#determinism
and Tim Bray's comments on it:
http://www.xml.com/axml/notes/Determinism.html

So basically the spec says that if ambiguous content models are created some parsers may report an error.

As an alternative, you may also want to explore
Relax NG
Schematron
or maybe XML Schema 1.1

Relax NG allows ambiguous content and you can express that content model directly.
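
A minimal sketch of what that could look like in the Relax NG XML syntax, using the element names from the DTD above. The choice of List as the start element is an assumption made for illustration, since the DTD does not name a root:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <!-- Assumption: List is the document element. -->
  <start>
    <ref name="List"/>
  </start>

  <define name="ParaType1">
    <element name="ParaType1"><text/></element>
  </define>

  <define name="ParaType2">
    <element name="ParaType2"><text/></element>
  </define>

  <define name="List">
    <element name="List">
      <oneOrMore><ref name="ListItem"/></oneOrMore>
    </element>
  </define>

  <!-- (ParaType1 | List)+ | (ParaType2 | List)+, written directly. -->
  <define name="ListItem">
    <element name="ListItem">
      <choice>
        <oneOrMore>
          <choice>
            <ref name="ParaType1"/>
            <ref name="List"/>
          </choice>
        </oneOrMore>
        <oneOrMore>
          <choice>
            <ref name="ParaType2"/>
            <ref name="List"/>
          </choice>
        </oneOrMore>
      </choice>
    </element>
  </define>
</grammar>
```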
Schematron can be used in addition to XML Schema: you keep a simple model in the XML Schema that says (A | B | C)+ and add a Schematron rule asserting that A and B are not both present; if they are, the assertion fails and an error is raised.
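
A minimal sketch of such a rule for the ListItem example, assuming ISO Schematron and a schema whose ListItem model has been relaxed to (ParaType1 | ParaType2 | List)+. The message text is invented for illustration:

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- Checked for every ListItem, including nested ones. -->
    <sch:rule context="ListItem">
      <sch:assert test="not(ParaType1 and ParaType2)">
        A ListItem may contain ParaType1 or ParaType2 elements, but not both.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```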
XML Schema 1.1 adds some features of Schematron and you can have assertions directly in XML Schema 1.1.
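
As a rough sketch of the XML Schema 1.1 variant, assuming an XSD 1.1-capable processor (an XSD 1.0 validator will reject the xs:assert element), the ListItem declaration could be reduced to a simple repeated choice plus one assertion:

```xml
<xs:element name="ListItem">
  <xs:complexType>
    <!-- Simple, unambiguous model: (ParaType1 | ParaType2 | List)+ -->
    <xs:choice maxOccurs="unbounded">
      <xs:element ref="ParaType1"/>
      <xs:element ref="ParaType2"/>
      <xs:element ref="List"/>
    </xs:choice>
    <!-- Rules out mixing ParaType1 and ParaType2 in the same ListItem. -->
    <xs:assert test="not(ParaType1 and ParaType2)"/>
  </xs:complexType>
</xs:element>
```

The unprefixed names in the test expression work here because the example schema has no target namespace.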

Best Regards,
George