Restricting length of CDATA section value.

This should cover W3C XML Schema, Relax NG and DTD related problems.
Douha
Posts: 17
Joined: Fri Aug 19, 2005 6:02 pm

Post by Douha »

Hi all,

I have an XML file where all the data elements hold their values in CDATA sections. I have a schema that works right now, although it is pretty dumbed down. I now need to add a restriction to the schema on the length of the values in the different elements.

The XML looks like this:

<DOC_CAT Attribute="Y"><![CDATA[AR]]></DOC_CAT>

The schema for this field currently looks like this:

<xs:element name="DOC_CAT">
  <xs:complexType mixed="true">
    <xs:attribute name="Attribute" use="required">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="Y"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>
</xs:element>

I have tried putting a maxLength facet just above the enumeration, but nothing happens when I validate the XML with a field value that is longer than the maxLength value. My suspicion is that the maxLength facet is being applied to the attribute instead of the CDATA section.

How can I apply a length restriction to the value in the CDATA section?

Thanks for your help,

Doug Harding
State of Utah
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Post by george »

Hi Doug,

If you want to restrict the text content of an element that has attributes, you need to define that element as having a complex type (to allow attributes) with simple content (to allow restrictions on its text content).
The sample schema below does what you want.

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="twoCharactersString">
    <xs:restriction base="xs:string">
      <xs:length value="2"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:element name="DOC_CAT">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="twoCharactersString">
          <xs:attribute name="Attribute" use="required">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:enumeration value="Y"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
Note that CDATA is only a convenient way to write character data in XML, since it allows characters such as < to appear unescaped. From the point of view of XML Schema validation there is no difference between

Code:

<DOC_CAT Attribute="Y"><![CDATA[AR]]></DOC_CAT> 
and

Code:

<DOC_CAT Attribute="Y">AR</DOC_CAT> 
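
As a quick illustration of that equivalence (a sketch only; the values are made up for the example), the fragments below are instance elements meant to be validated one at a time against the two-character schema above. The validator sees the same string whether it is written in a CDATA section or with an entity reference, so the length facet counts the characters after parsing, not the markup used to write them.

```xml
<!-- Valid: the content is the two characters "A<", written in a CDATA section -->
<DOC_CAT Attribute="Y"><![CDATA[A<]]></DOC_CAT>

<!-- Valid: the same two characters "A<", written with an entity reference -->
<DOC_CAT Attribute="Y">A&lt;</DOC_CAT>

<!-- Invalid: three characters, so xs:length value="2" is violated -->
<DOC_CAT Attribute="Y"><![CDATA[ARX]]></DOC_CAT>
```
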
Best Regards,
George
Douha
Posts: 17
Joined: Fri Aug 19, 2005 6:02 pm

Post by Douha »

Thanks for your help George. That did the trick.

I do have one question for you. I noticed that in the xs:complexType definition that follows the xs:element line, you removed the mixed="true" attribute. Is the mixed attribute needed? What does it do?

Thanks again for your reply.

Doug
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Post by george »

Hi Doug,

An element can contain
* only character data
* only elements
* both elements and character data - in this case the content is mixed.

If the element contains only character data, you can define it with a simple type or, if you also want the element to have attributes, with a complex type with simple content.
By default, when you define a complex type with complex content, an element of that type contains only the elements defined in the complex content. The mixed attribute lets you specify that character data (that is, text nodes) is also allowed between the child elements.
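
To make the mixed case concrete, here is a minimal sketch. The element names NOTE and EM are hypothetical and are not part of the schema discussed in this thread. With mixed="true" the text around the EM children is allowed; without it, any non-whitespace text directly inside NOTE would be a validation error. The text in mixed content cannot be constrained with facets such as maxLength, which is why simple content is the right choice for DOC_CAT.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: text may be interleaved with EM child elements.
       Example of a valid instance:
       <NOTE>Filed under <EM>AR</EM> today.</NOTE> -->
  <xs:element name="NOTE">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="EM" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```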

Hope that helps,
George
Douha
Posts: 17
Joined: Fri Aug 19, 2005 6:02 pm

Post by Douha »

I get it.

Thanks again for all your help.

Doug
Douha
Posts: 17
Joined: Fri Aug 19, 2005 6:02 pm

A follow-on question

Post by Douha »

Hi Sir George,

Woohoo, I finally got a regular licence to oXygen xml today.

From my original question I have discovered a puzzle that I can't seem to figure out.

Some of the fields whose length I am restricting will hold either one or two characters or the value null.

Is there a way to restrict the value of a field to either a 1 character value or a null?

I have been studying my XML reference books and can't find a way to do that.

Thanks much,

Doug Harding
george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Post by george »

Hi Doug,

You can use a union to combine the simple types. A sample schema follows.

Code:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="oneOrTwoCharacters">
    <xs:restriction base="xs:string">
      <xs:maxLength value="2"/>
      <xs:minLength value="1"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="null">
    <xs:restriction base="xs:string">
      <xs:enumeration value="null"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:simpleType name="oneOrTwoCharactersOrNull">
    <xs:union memberTypes="oneOrTwoCharacters null"/>
  </xs:simpleType>

  <xs:element name="DOC_CAT">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="oneOrTwoCharactersOrNull">
          <xs:attribute name="Attribute" use="required">
            <xs:simpleType>
              <xs:restriction base="xs:string">
                <xs:enumeration value="Y"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
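
For reference, a few instance fragments that show how the union behaves. This is a sketch that assumes "null" means the literal four-letter string, as in the schema above; each element is meant to be validated on its own. A value is accepted if it satisfies at least one member type of the union.

```xml
<!-- Valid: one character, accepted by oneOrTwoCharacters -->
<DOC_CAT Attribute="Y"><![CDATA[A]]></DOC_CAT>

<!-- Valid: two characters, accepted by oneOrTwoCharacters -->
<DOC_CAT Attribute="Y"><![CDATA[AR]]></DOC_CAT>

<!-- Valid: the literal string "null", accepted by the enumeration in the null type -->
<DOC_CAT Attribute="Y"><![CDATA[null]]></DOC_CAT>

<!-- Invalid: three characters and not "null", so no member type accepts it -->
<DOC_CAT Attribute="Y"><![CDATA[ABC]]></DOC_CAT>

<!-- Invalid: the empty string fails minLength value="1" and is not "null" -->
<DOC_CAT Attribute="Y"></DOC_CAT>
```

If "null" is instead meant to be an empty element, a single restriction with maxLength 2 and no minLength would be enough, and the union would not be needed.
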
Best regards,
George
Douha
Posts: 17
Joined: Fri Aug 19, 2005 6:02 pm

Eureka!!!

Post by Douha »

:D

Hey George. You are the man. Whatever they are paying you, it is not enough.

Thank you, thank you, thank you.

Doug