Schema constraint validation

Stefan_E
Posts: 18
Joined: Sat Nov 07, 2009 12:03 am

Post by Stefan_E »

Hi all,

... currently learning about schema constraints (Keys, KeyRefs). It looks strange to me that these constraints are not validated when checking the schema. Consider the following:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
targetNamespace="www.bla.com" xmlns:ns="www.bla.com">

<xs:element name="Top">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_1">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Sub_2">
<xs:complexType>
<xs:attribute name="SubID" use="optional"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:key name="newKey">
<xs:selector xpath="Sub_2a"/>
<xs:field xpath="@SubID"/>
</xs:key>
</xs:element>
<xs:element name="Sub_1a">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_2a">
<xs:complexType>
<xs:attribute name="SubIDPtr"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="newKeyref" refer="ns:newKey">
<xs:selector xpath="Sub_2a"/>
<xs:field xpath="SubIDPtr"/>
</xs:keyref>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
If I now generate a sample XML document such as

<?xml version="1.0" encoding="UTF-8"?>
<Top xmlns="www.bla.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="www.bla.com file:./TestConstraint_1.xsd">
<Sub_1>
<Sub_2/>
<Sub_2/>
</Sub_1>
<Sub_1a>
<Sub_2a/>
</Sub_1a>
</Top>
I get a validation error:
Identity Constraint error: identity constraint "KeyRef@1cd35c5" has a keyref which refers to a key or unique that is out of scope.
Start location: 11:14


This raises two points for me:
  • "KeyRef@1cd35c5" is hardly readable for a human
  • Why can't this out-of-scope error be detected at schema level?
Next, I remove the Sub_1a branch of the schema and have this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified"
targetNamespace="www.bla.com" xmlns:ns="www.bla.com">

<xs:element name="Top">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_1">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Sub_2">
<xs:complexType>
<xs:attribute name="SubID" use="optional"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:key name="newKey">
<xs:selector xpath="Sub_2"/>
<xs:field xpath="@SubID"/>
</xs:key>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
with the following sample document

<?xml version="1.0" encoding="UTF-8"?>
<Top xmlns="www.bla.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="www.bla.com file:./TestConstraint_2.xsd">
<Sub_1 xmlns="">
<Sub_2/>
<Sub_2/>
</Sub_1>
</Top>
I get cvc-identity-constraint.4.2.1.a: Element "Sub_1" has no value for the key "newKey".
Start location: 6:10

This is not very surprising, since the attribute was not declared 'required' in the schema. But again - why isn't this discovered at the schema validation level?
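
For comparison, here is a minimal sketch of the same constraint written as xs:unique, assuming that Sub_2 elements without a SubID should simply be tolerated. In XML Schema, xs:key requires every selected element to carry the field, whereas xs:unique only checks the values that are actually present. The name uniqueSubID is made up for illustration; the selector stays unprefixed because this schema uses elementFormDefault="unqualified".

```xml
<!-- Hypothetical replacement for the xs:key declared on Sub_1 above -->
<xs:unique name="uniqueSubID">
  <!-- local elements are unqualified in this schema, so no prefix is needed -->
  <xs:selector xpath="Sub_2"/>
  <!-- a Sub_2 without SubID is skipped instead of reported -->
  <xs:field xpath="@SubID"/>
</xs:unique>
```

With this variant, the sample document with two Sub_2 elements and no SubID should validate, while two identical SubID values would still be rejected.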

Next, I make this a required attribute and change Element Form Default to 'qualified':

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
targetNamespace="www.bla.com" xmlns:ns="www.bla.com">

<xs:element name="Top">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_1">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Sub_2">
<xs:complexType>
<xs:attribute name="SubID" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:key name="newKey">
<xs:selector xpath="Sub"/>
<xs:field xpath="@SubID"/>
</xs:key>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The sample document

<?xml version="1.0" encoding="UTF-8"?>
<Top xmlns="www.bla.com"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="www.bla.com file:./TestConstraint_3.xsd">
<Sub_1>
<Sub_2 SubID="SubID1"/>
<Sub_2 SubID="SubID1"/>
</Sub_1>
</Top>

now validates despite the identical attribute values, since the key's selector is not namespace-qualified. I'm surprised again: the selector xpath="Sub" evidently has nothing to chew on, yet nothing is reported at schema validation level ...
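
A sketch of how the key could be made effective under elementFormDefault="qualified", assuming the prefix ns stays bound to the target namespace www.bla.com as in the schema above. The XPath expressions in xs:selector and xs:field do not pick up a default namespace, so each element step has to carry a prefix explicitly.

```xml
<xs:key name="newKey">
  <!-- Sub_2 lives in the target namespace, so the step needs the ns: prefix -->
  <xs:selector xpath="ns:Sub_2"/>
  <!-- attributes are unqualified by default, so no prefix here -->
  <xs:field xpath="@SubID"/>
</xs:key>
```

With this selector, the sample document above, which has two Sub_2 elements with SubID="SubID1", should be rejected because of the duplicate key value.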

I believe I understand all the individual problems at schema level, but what I really wonder about is this: if I introduce some constraints and verify them, but later modify the schema, how do I consistently guarantee that the constraints are still valid?

The other day, I used a schema from my daily life as a training vehicle. The E142 schema referenced has made it to an IEEE standard. Still, I find that its constraints are not namespace-qualified and hence ineffective. So it appears I'm talking about a real-life problem here?

Am I missing something? Is my request impossible to do or are there tricks to get it done?

Thanks for your help in understanding this!

Stefan

(George, the links to Costello really help! :D )
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Schema constraint validation

Post by adrian »

Hello,

First, the "KeyRef@1cd35c5" from the message is a mistake in either the Xerces code or one of our patches. I'll look that up and see if we can fix it from our side.

Then there are the schema validation issues.
Xerces, the parser Oxygen uses to validate schemas, treats a schema much like any other XML document: the edited schema is validated against the XMLSchema schema. I'm guessing they extended this validation to some degree to look for schema-specific issues, but not by much.
So in short, it's a limitation of the validation engine: Xerces's schema validation isn't smart enough to tell you that the schema is contradicting itself.

However, if you validate the schema with the Saxon-EE validation engine, you will find that it's a bit smarter: it picks up many of these problems and issues warnings.
E.g. for your first schema it responds with:
Warning: The complex type of element Sub_1a does not allow a child element named Sub_2a
and
Warning: The complex type of element Sub_1 does not allow a child element named Sub_2a

You can find Saxon-EE in the Custom Validation Engines combo box from the Oxygen toolbar (to the right of the Perspective and External Tools toolbars).

Or you can configure a custom validation scenario: Document -> Validate -> Configure Validation Scenario. In the Configure Validation Scenario dialog, select 'Use custom validation scenario', make sure the scenarios are for XML Schemas, and click New. Choose a name for the scenario and Add a validation unit.
In the validation unit dialog, leave the input URL unchanged (${currentFileURL}) so you can use the scenario for various schemas, then select Saxon-EE from the Processor combo and enable "Validate as you type". Close all dialogs with OK.
Now you can associate this scenario with any schema that you want to validate with Saxon-EE.

Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Stefan_E

Re: Schema constraint validation

Post by Stefan_E »

Hi Adrian,

thanks a lot - indeed Saxon does a better job here, but not a perfect one... will place a corresponding note over at the Saxon boards.

First, I fooled myself, since the first case had @elementFormDefault="qualified", which I didn't intend for that example :oops: So, for anybody else following here, it should have read

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="unqualified"
targetNamespace="www.bla.com" xmlns="www.bla.com">

<xs:element name="Top">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_1">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" name="Sub_2">
<xs:complexType>
<xs:attribute name="SubID" use="required"/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:key name="newKey">
<xs:selector xpath="Sub_2"/>
<xs:field xpath="@SubID"/>
</xs:key>
</xs:element>
<xs:element name="Sub_1a">
<xs:complexType>
<xs:sequence>
<xs:element name="Sub_2a" maxOccurs="unbounded">
<xs:complexType>
<xs:attribute name="SubIDPtr" form="unqualified" use="required"
/>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
<xs:keyref name="newKeyref" refer="newKey">
<xs:selector xpath="Sub_2a"/>
<xs:field xpath="SubIDPtr"/>
</xs:keyref>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Saxon then reports: The child axis will never select any nodes when starting at a node with type of element Sub_2a, as this type requires empty content
Start location: 33:0

Technically correct - semantically a bit at the limit; something containing the words "out of scope" would probably better describe the situation.
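
A fragment sketching one way to bring the key into scope, assuming the intent is that every SubIDPtr must match some SubID: both constraints are declared on Top, the common ancestor, instead of on the sibling elements Sub_1 and Sub_1a. The Saxon message itself seems to point at the field XPath SubIDPtr, which is read as a child element step because it lacks the leading @, so the sketch uses @SubIDPtr. The selectors are unprefixed because the corrected schema uses elementFormDefault="unqualified", and refer="newKey" relies on that schema's default namespace declaration xmlns="www.bla.com".

```xml
<xs:element name="Top">
  <xs:complexType>
    <!-- sequence with Sub_1 and Sub_1a as in the schema above,
         minus their own xs:key / xs:keyref -->
  </xs:complexType>
  <!-- key and keyref now share the same scope: Top -->
  <xs:key name="newKey">
    <xs:selector xpath="Sub_1/Sub_2"/>
    <xs:field xpath="@SubID"/>
  </xs:key>
  <xs:keyref name="newKeyref" refer="newKey">
    <xs:selector xpath="Sub_1a/Sub_2a"/>
    <!-- the @ makes this an attribute step rather than a child element step -->
    <xs:field xpath="@SubIDPtr"/>
  </xs:keyref>
</xs:element>
```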

The second one - key on optional attribute - is not caught by Saxon either.

The last is reported by Saxon as The complex type of element Sub_1 does not allow a child element named Sub
Start location: 18:0
which makes sense.

Stefan
Stefan_E

Re: Schema constraint validation

Post by Stefan_E »

adrian wrote: You can find Saxon-EE in the Custom Validation Engines combo box from the Oxygen toolbar (to the right of the Perspective and External Tools toolbars).
Note that if used this way (as opposed to a custom validation scenario), Oxygen/Saxon reports the same warnings, but in Design View, nothing gets highlighted.

Stefan
adrian

Re: Schema constraint validation

Post by adrian »

That's correct. All of the "Custom Validation Engines" are executed as external tools and don't have the same level of integration as the internal engines.

This use of Saxon-EE is more of a "point and shoot" (easily accessible) approach: a simple way of manually validating in Text mode and checking the results.

The Custom Validation Scenario is the way to go if you want a solution well integrated into Oxygen for validating with Saxon-EE.

Regards,
Adrian