XML Schema Part 1: Structures Second Edition
W3C Recommendation 28 October 2004
- This version:
- http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/
- Latest version:
- http://www.w3.org/TR/xmlschema-1/
- Previous version:
- http://www.w3.org/TR/2004/PER-xmlschema-1-20040318/
- Editors:
- Henry S. Thompson, University of Edinburgh <ht@cogsci.ed.ac.uk>
- David Beech, Oracle Corporation <David.Beech@oracle.com>
- Murray Maloney, for Commerce One <murray@muzmo.com>
- Noah Mendelsohn, Lotus Development Corporation <Noah_Mendelsohn@lotus.com>
Please refer to the errata for this document, which may include some normative corrections.
Also available: XML; XHTML with visible change markup; independent copy of the schema for schema documents; independent copy of the DTD for schema documents. See also translations.
Copyright © 2004 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
Abstract
XML Schema: Structures specifies the XML Schema definition language,
which offers facilities for describing the structure and constraining the contents
of XML 1.0 documents, including those which exploit the XML
Namespace facility. The schema language, which is itself represented in XML
1.0 and uses namespaces, substantially reconstructs and considerably extends the capabilities found in XML 1.0 document type
definitions (DTDs). This specification depends on XML Schema Part 2:
Datatypes.
Status of this Document
This section describes the status of this document at the
time of its publication. Other documents may supersede this document.
A list of current W3C publications and the latest
revision of this technical report can be found in the W3C technical reports index at
http://www.w3.org/TR/. This is a W3C
Recommendation, which forms part of the Second Edition of XML
Schema. This document has been reviewed by W3C Members and
other interested parties and has been endorsed by the Director as a
W3C Recommendation. It is a stable document and may be used as
reference material or cited as a normative reference
from another document.
W3C's role in making the Recommendation is to draw attention
to the specification and to promote its widespread deployment. This
enhances the functionality and interoperability of the Web.
This document has been produced by the W3C XML Schema Working Group
as part of the W3C XML
Activity. The goals of the XML Schema language are discussed in
the XML Schema
Requirements document. The authors of this document are the
members of the XML Schema Working Group. Different parts of this
specification have different editors.
This document was produced under the 24
January 2002 Current Patent Practice (CPP) as amended by the W3C Patent Policy
Transition Procedure. The Working Group maintains a public
list of patent disclosures relevant to this document;
that page also includes instructions for disclosing a patent.
An individual who
has actual knowledge of a patent which the individual believes
contains Essential Claim(s) with respect to this specification should
disclose the information in accordance with section
6 of the W3C Patent Policy.
The English version of this specification is the only normative
version. Information about translations of this document is available
at http://www.w3.org/2001/05/xmlschema-translations. This second edition is not a new version;
it merely incorporates the changes dictated by the corrections to
errors found in the first
edition as agreed by the XML Schema Working Group, as a
convenience to readers. A separate list of all such corrections is
available at http://www.w3.org/2001/05/xmlschema-errata.
The errata list for this second edition is available at http://www.w3.org/2004/03/xmlschema-errata.
Please report errors in this document to www-xml-schema-comments@w3.org
(archive).
Note: David Beech has retired since the publication of the first edition,
and can be reached at davidbeech@earthlink.net. Murray Maloney is no longer affiliated with Commerce One; his contact
details are unchanged. Noah Mendelsohn's affiliation has changed since the publication of the
first edition. He is now at IBM, and can be contacted at noah_mendelsohn@us.ibm.com.
1 Introduction
This document sets out the structural part (XML Schema: Structures) of the XML Schema definition language.
Chapter 2 presents a Conceptual Framework for XML Schemas, including
an introduction to the nature of XML Schemas and an introduction
to the XML Schema abstract data model, along with
other terminology used throughout this document.
Chapter 3, Schema Component Details, specifies the precise
semantics of each component of the abstract model, the representation of each
component in XML, with reference to a DTD and XML Schema
for an XML Schema document type, along with a detailed mapping between the element and
attribute vocabulary of this representation and the components and properties
of the abstract model.
Chapter 4 presents Schemas and Namespaces: Access and Composition, including the
connection between documents and schemas, the import, inclusion and redefinition of declarations and definitions and
the foundations of schema-validity assessment.
Chapter 5 discusses Schemas and Schema-validity Assessment, including the
overall approach to schema-validity assessment of documents, and responsibilities of schema-aware
processors.
The normative appendices include A Schema for Schemas (normative) for the XML representation of schemas and
B References (normative). The non-normative appendices include G DTD for Schemas (non-normative) and F Glossary (non-normative).
This document is primarily intended as a language definition reference.
As such, although it contains a few examples, it is not primarily designed
to serve as a motivating introduction to the design and its features, or as a
tutorial for new users.
Rather it presents a careful and fully explicit definition of that design, suitable
for guiding implementations. For those in search of a step-by-step
introduction to the design, the non-normative [XML Schema: Primer] is a much better
starting point than this document.
1.1 Purpose
The purpose of XML Schema: Structures is to define the nature of XML schemas
and their component parts,
provide an inventory of XML markup
constructs with which to represent schemas, and define the
application of schemas to XML documents. The purpose of an XML Schema: Structures schema is to define and describe a class of
XML documents by using schema components to constrain and document the meaning,
usage and relationships of their constituent parts: datatypes, elements and
their content and attributes and their values. Schemas may also provide for the specification of additional
document information, such as normalization and defaulting of attribute
and element values. Schemas have
facilities for self-documentation. Thus, XML Schema: Structures can be used to define, describe and catalogue XML
vocabularies for classes of XML documents. Any application that consumes well-formed XML can use the XML Schema: Structures
formalism to express syntactic, structural and value constraints applicable to
its document instances. The XML Schema: Structures formalism allows a useful level of
constraint checking to be described and implemented for a wide spectrum of XML
applications. However, the language defined by this specification does not attempt to provide
all the facilities that might be needed by any
application. Some applications may require constraint capabilities not
expressible in this language, and so may need to perform their own additional
validations.
1.3 Documentation Conventions and Terminology
This section introduces the highlighting and typography used in
this document to present technical material. Special terms are defined at their point of
introduction in the text. For example [Definition:] a term is
something used with a special meaning. The definition is
labeled as such and the term it defines is displayed in boldface. The end of the definition is not specially marked
in the displayed or printed text. Uses of defined terms are links to
their definitions, set off with middle dots, for instance ·term·. Non-normative examples are set off in boxes and accompanied by a brief
explanation:
<schema targetNamespace="http://www.example.com/XMLSchema/1.0/mySchema">
And an explanation of the example.
The definition of each kind of schema component consists of a list of
its properties and their contents, followed by descriptions of the
semantics of the properties: References to properties of schema components are links to
the relevant definition as exemplified above, set off with curly braces, for instance {example property}. The correspondence between an element information item which is part
of the XML representation of a schema and one or more schema components is presented in a tableau
which illustrates the element information item(s) involved.
This is followed by a tabulation of the correspondence between properties of the component
and properties of the information item. Where context may determine which of
several different components may arise, several tabulations, one per context,
are given. The property correspondences are normative,
as are the illustrations of the XML representation element information items.
In the XML representation, bold-face
attribute names (e.g. count below) indicate a required
attribute information item, and the rest are
optional. Where an attribute information item has an enumerated type
definition, the values are shown separated by vertical bars, as for
size below; if there is a default value, it is shown
following a colon. Where an attribute information item has a built-in simple
type definition defined in [XML Schemas: Datatypes], a hyperlink to its
definition therein is given. The allowed content of the information item is
shown as a grammar fragment, using the Kleene operators ?,
* and +. Each element name therein is a hyperlink to
its own illustration. References to elements in the text are links to
the relevant illustration as exemplified above, set off with angle brackets, for instance <example>. References to properties of information items as defined in [XML-Infoset] are notated as links to the relevant section thereof, set off with square brackets, for example [children]. Properties which this specification defines for information items are
introduced as follows: References to properties of information items defined in this specification
are notated as links to their introduction as exemplified above, set off with square brackets, for example [new property]. The following highlighting is used for non-normative commentary in
this document:
Note: General comments directed to all readers.
Following [XML 1.0 (Second Edition)], within normative prose in this specification, the words
may and must are defined as follows:
- may
- Conforming documents and XML Schema-aware processors are permitted to but need not behave as described.
- must
- Conforming documents and XML Schema-aware processors are required to behave as described; otherwise they are in error.
Note however that this specification provides a definition of error and of conformant processors'
responsibilities with respect to errors (see 5 Schemas and Schema-validity Assessment) which is considerably
more complex than that of [XML 1.0 (Second Edition)].
2 Conceptual Framework
This chapter gives an overview of XML Schema: Structures at the level of its abstract data model. 3 Schema Component Details provides details on this model, including
a normative representation in XML for the components of the model.
Readers interested primarily in learning to write schema documents may wish to
first read [XML Schema: Primer] for a tutorial introduction, and only then consult the sub-sections of
3 Schema Component Details named XML Representation of ... for
the details.
2.1 Overview of XML Schema
An XML Schema
consists of components such as type definitions
and element declarations. These can be used to assess the validity of
well-formed element and attribute information items (as defined
in [XML-Infoset]), and furthermore
may specify augmentations to those items and their descendants. This augmentation makes explicit information which may have
been implicit in the original document, such as normalized and/or default values for
attributes and elements and
the types of element and attribute information items. [Definition:] We refer to the augmented infoset which results from conformant processing as defined in this specification as the post-schema-validation infoset, or PSVI.
Schema-validity assessment has two aspects:
1 Determining local schema-validity, that is
whether an element or attribute information item satisfies the
constraints embodied in the relevant
components of an XML Schema;
2 Synthesizing an overall validation outcome for the item,
combining local schema-validity with the results of schema-validity
assessments of its descendants, if any, and
adding appropriate augmentations to the infoset to record this outcome.
Throughout this specification, [Definition:] the
word valid and its derivatives are used to refer to
clause 1 above, the determination of local
schema-validity. Throughout this specification, [Definition:] the word assessment is used to refer
to the overall process of
local validation, schema-validity assessment and infoset augmentation.
2.2 XML Schema Abstract Data Model
This specification builds on [XML 1.0 (Second Edition)] and
[XML-Namespaces]. The concepts and definitions used
herein regarding XML are framed at the abstract level of information
items as defined in [XML-Infoset]. By
definition, this use of the infoset provides a priori guarantees of well-formedness
(as defined in [XML 1.0 (Second Edition)]) and namespace
conformance (as defined in [XML-Namespaces]) for
all candidates for ·assessment· and for all ·schema documents·. Just as [XML 1.0 (Second Edition)] and
[XML-Namespaces] can be described in terms of
information items, XML Schemas can be described in terms of an
abstract data model. In defining XML Schemas in terms of an abstract
data model, this specification rigorously specifies the information which
must be available to a conforming XML Schema processor. The abstract
model for schemas is conceptual only, and does not mandate any
particular implementation or representation of this information. To
facilitate interoperation and sharing of schema information, a
normative XML interchange format for schemas is provided. [Definition:] Schema component is the generic term for the building blocks that comprise the abstract data model of the schema.
[Definition:] An XML Schema is a
set of ·schema components·. There are 13 kinds of
component in all, falling into three groups. The primary components, which may
(type definitions) or must (element and attribute declarations) have names
are as follows:
- Simple type definitions
- Complex type definitions
- Attribute declarations
- Element declarations
The secondary components, which must have names, are as follows:
- Attribute group definitions
- Identity-constraint definitions
- Model group definitions
- Notation declarations
Finally, the "helper" components provide small parts of
other components; they are not independent of their context:
- Annotations
- Model groups
- Particles
- Wildcards
- Attribute Uses
During ·validation·, [Definition:] declaration components are associated by
(qualified) name to information items being ·validated·. On the other hand, [Definition:] definition components define
internal schema components that can be used in other schema components.
[Definition:] Declarations and
definitions may have and be identified by names, which are NCNames as defined by [XML-Namespaces]. [Definition:] Several kinds
of component have a target namespace, which is either
·absent· or a namespace name, also as
defined by [XML-Namespaces]. The ·target
namespace· serves to identify the namespace within which the
association between the component and its name exists. In the case of
declarations, this in turn determines the namespace name of, for example, the element
information items it may ·validate·.
Note: At the abstract level, there is
no requirement that the components of a schema share a
·target namespace·. Any schema for use in
·assessment· of documents containing names from more than one namespace
will of necessity include components with different ·target namespaces·. This contrasts with
the situation at the level of the XML representation of components, in which each schema document contributes
definitions and declarations to a single target namespace.
·Validation·, defined in detail in 3 Schema Component Details, is a
relation between information items and schema components. For example, an
attribute information item may ·validate· with respect to an attribute
declaration, a list of element information items may ·validate· with
respect to a content model, and so on. The following sections briefly
introduce the kinds of components in the
schema abstract data model, other major features of the abstract
model, and how they contribute to ·validation·.
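Non-normative example: a minimal sketch of one schema document contributing a declaration and a definition to a single target namespace. The namespace name http://www.example.com/orders and the component names are assumptions made for illustration only.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema document: its top-level components all belong
     to the target namespace http://www.example.com/orders -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ord="http://www.example.com/orders"
           targetNamespace="http://www.example.com/orders">

  <!-- A declaration component: associated by qualified name with
       element information items named note in the target namespace -->
  <xs:element name="note" type="ord:NoteText"/>

  <!-- A definition component: an internal component that other
       components (such as the declaration above) refer to by name -->
  <xs:simpleType name="NoteText">
    <xs:restriction base="xs:string">
      <xs:maxLength value="200"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

A schema used to assess documents that mix several namespaces would typically be assembled from several such documents, one per target namespace.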
2.2.1 Type Definition Components
The abstract model provides two kinds of type definition component: simple
and complex. [Definition:] This specification uses
the phrase type definition in cases where no distinction
need be made between simple and complex types. Type definitions form a hierarchy with a single root. The subsections below first describe characteristics of that
hierarchy, then provide an introduction to simple and complex type definitions themselves.
2.2.1.1 Type Definition Hierarchy
[Definition:] Except for a distinguished ·ur-type definition·, every ·type definition· is, by construction, either a
·restriction· or an ·extension· of some other type definition. The graph of these relationships forms a tree known as the Type Definition Hierarchy.
[Definition:] A type
definition whose
declarations or facets are in a one-to-one relation with those of another
specified type
definition, with each in turn restricting the possibilities of the one it
corresponds to, is said to be a restriction.
The specific restrictions might include narrowed ranges or reduced
alternatives.
Members of a type, A, whose definition is a ·restriction· of the definition of another type, B, are always members of type B as well. [Definition:] A complex type definition
which allows element or attribute content in addition to that allowed by
another specified type
definition is said to be an extension. [Definition:] A distinguished complex
type definition, the ur-type
definition, whose
name is anyType in the XML Schema namespace, is present in each ·XML Schema·, serving as the root of the type
definition hierarchy for that schema.
[Definition:] A type definition used as the
basis for an ·extension· or
·restriction· is known as
the base type definition of that definition.
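Non-normative example: a small sketch of a type definition hierarchy with one base type definition, one extension and one restriction. The type and element names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- No derivation is stated, so this type is a restriction of the
       ur-type definition (xs:anyType) -->
  <xs:complexType name="Address">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="city" type="xs:string"/>
      <xs:element name="country" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Extension: allows element content in addition to that of Address -->
  <xs:complexType name="PostalAddress">
    <xs:complexContent>
      <xs:extension base="Address">
        <xs:sequence>
          <xs:element name="postcode" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Restriction: the optional country element is no longer allowed,
       so every member of DomesticAddress is also a member of Address -->
  <xs:complexType name="DomesticAddress">
    <xs:complexContent>
      <xs:restriction base="Address">
        <xs:sequence>
          <xs:element name="street" type="xs:string"/>
          <xs:element name="city" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
```

In both derived types, Address is the base type definition.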
2.2.1.2 Simple Type Definition
A simple type definition is a set of constraints on strings and information about the values they encode, applicable to the ·normalized value· of an attribute
information item or of an element information item with no element children.
Informally, it applies to the values of attributes and the text-only content of elements.
Each simple type definition, whether built-in (that is, defined in [XML Schemas: Datatypes]) or
user-defined, is a ·restriction· of some particular
simple ·base type
definition·. For the built-in primitive type definitions, this is [Definition:] the simple
ur-type definition, a special restriction of the
·ur-type
definition·, whose name is anySimpleType in the XML Schema namespace. The ·simple ur-type definition· is considered to have an unconstrained lexical space, and a value space consisting of the union of the value spaces of all the built-in primitive datatypes and the set of all lists of all members of the value spaces of all the built-in primitive datatypes. The mapping from lexical space to value space is
unspecified for items whose type definition is the
·simple ur-type definition·.
Accordingly this specification does not constrain processors' behaviour in
areas where this mapping is implicated, for example checking such items against
enumerations, constructing default attributes or elements whose declared type
definition is the
·simple ur-type definition·, checking
identity constraints involving such items.
Note: The Working Group expects to return to this area in a future
version of this specification.
Simple types may
also be defined whose members are lists of items
themselves constrained by some other simple type definition, or whose
membership is the union of the memberships of some other simple type
definitions. Such list and union simple type definitions are also restrictions of the ·simple ur-type
definition·. For detailed information on simple type definitions, see 3.14 Simple Type Definitions and [XML Schemas: Datatypes]. The latter also defines an extensive inventory of
pre-defined simple types.
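Non-normative example: a sketch of the three ways of defining a simple type described above (restricting a built-in type, a list, and a union). The type names are hypothetical; the built-in types come from [XML Schemas: Datatypes].

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Restriction of a built-in type by narrowing its range -->
  <xs:simpleType name="Percentage">
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- List: whitespace-separated items, each constrained by Percentage -->
  <xs:simpleType name="PercentageList">
    <xs:list itemType="Percentage"/>
  </xs:simpleType>

  <!-- Union: membership is that of Percentage plus the token "unknown" -->
  <xs:simpleType name="PercentageOrUnknown">
    <xs:union memberTypes="Percentage">
      <xs:simpleType>
        <xs:restriction base="xs:token">
          <xs:enumeration value="unknown"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:union>
  </xs:simpleType>
</xs:schema>
```

Under these definitions, "10 20 30" would be acceptable for PercentageList, and both "42" and "unknown" for PercentageOrUnknown.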
2.2.1.3 Complex Type Definition
A complex type definition is a set of attribute declarations and a content type, applicable to the [attributes] and
[children] of an element information item respectively. The content type may
require the [children] to contain neither element nor character information
items (that is, to be empty), to be a string which belongs to a particular simple
type or to contain a sequence of element information items which conforms to a particular model group, with or without character information items as well.
Each complex type definition other than the
·ur-type definition· is either
- a restriction of a complex ·base type definition·
or
- an ·extension· of a simple or complex ·base type definition·.
A
complex type which extends another does so by having additional content model
particles at the end of the other definition's content model,
or by having additional attribute declarations, or both.
Note: This specification allows only appending, and not other kinds of
extensions. This decision
simplifies application processing required to cast instances from derived to
base type. Future versions may allow more kinds of extension, requiring more
complex transformations to effect casting.
For detailed information on complex type definitions, see 3.4 Complex Type Definitions.
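Non-normative example: a sketch of the four kinds of content type mentioned above (empty, simple, element-only, and elements mixed with character data). All names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Empty content: attributes only, no element or character children -->
  <xs:complexType name="Marker">
    <xs:attribute name="id" type="xs:ID"/>
  </xs:complexType>

  <!-- Simple content: a decimal string plus an attribute; this is an
       extension of a simple base type definition -->
  <xs:complexType name="Price">
    <xs:simpleContent>
      <xs:extension base="xs:decimal">
        <xs:attribute name="currency" type="xs:token" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Element-only content: children must match the model group -->
  <xs:complexType name="Item">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="price" type="Price"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Mixed content: character data may appear between the elements -->
  <xs:complexType name="Paragraph" mixed="true">
    <xs:sequence>
      <xs:element name="em" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```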
2.2.2 Declaration Components
There are three kinds of declaration component: element, attribute, and
notation. Each is described in a section below. Also included is a discussion
of element substitution groups, which is a feature provided in conjunction with
element declarations.
2.2.2.1 Element Declaration
An element declaration is an association of a name with a type definition, either simple or
complex, an (optional) default value and a (possibly empty) set of identity-constraint
definitions. The association is either global or scoped to a containing complex type definition. A
top-level element declaration with name 'A' is broadly comparable to a pair of
DTD declarations as follows, where the associated type definition
fills in the ellipses:
<!ELEMENT A . . .>
<!ATTLIST A . . .>
Element declarations contribute to
·validation· as part of model group ·validation·, when their defaults and type components are checked against an element
information item with a matching name and namespace, and by triggering
identity-constraint definition ·validation·.
For detailed information on element declarations, see 3.3 Element Declarations.
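Non-normative example: a sketch of global and local element declarations, one of them with a default value. The element names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declaration: a name, a simple type and a default value -->
  <xs:element name="status" type="xs:token" default="draft"/>

  <!-- Global declaration whose type is an anonymous complex type -->
  <xs:element name="report">
    <xs:complexType>
      <xs:sequence>
        <!-- Reference to the global declaration above -->
        <xs:element ref="status"/>
        <!-- Local declaration: scoped to the containing complex type -->
        <xs:element name="summary" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With these declarations, an empty <status/> inside <report> would be treated as if it contained the default value draft.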
2.2.2.2 Element Substitution Group
In XML 1.0, the name and content of an element must correspond exactly to the element type referenced in the corresponding content model. [Definition:] Through
the new mechanism of element substitution groups, XML Schemas provides a more powerful model supporting substitution of one named element for another.
Any top-level element declaration can serve as the defining member, or
head, for an element substitution group. Other top-level element declarations,
regardless of target namespace, can be designated as members of the
substitution group headed by this element. In a suitably enabled content
model, a reference to the head ·validates· not just the head itself, but elements
corresponding to any other member of the substitution group as well.
All such members must have type definitions which are either the same as the
head's type definition or
restrictions or extensions of it.
Therefore, although the names of elements can vary widely as new
namespaces and members of the substitution group are defined, the
content of member elements is strictly limited according to the type
definition of the substitution group head. Note that element substitution groups are not represented as separate components. They are
specified in the property values for element declarations (see 3.3 Element Declarations).
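Non-normative example: a sketch of a substitution group with head shape and one member, circle, whose type is an extension of the head's type. All names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="ShapeType">
    <xs:attribute name="label" type="xs:string"/>
  </xs:complexType>

  <!-- Member types must be the head's type or derived from it -->
  <xs:complexType name="CircleType">
    <xs:complexContent>
      <xs:extension base="ShapeType">
        <xs:attribute name="radius" type="xs:decimal" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <!-- Head of the substitution group -->
  <xs:element name="shape" type="ShapeType"/>

  <!-- Member: names its head in the substitutionGroup attribute -->
  <xs:element name="circle" type="CircleType" substitutionGroup="shape"/>

  <!-- The content model refers only to the head -->
  <xs:element name="drawing">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="shape" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, a drawing element may contain circle elements wherever shape is referenced, for instance <drawing><shape label="a"/><circle radius="2.5"/></drawing>.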
2.2.2.3 Attribute Declaration
An attribute declaration is an association between a name and a simple type definition, together
with occurrence information and (optionally) a default value. The
association is either global, or local to its containing complex type definition. Attribute declarations contribute to
·validation· as part of complex type definition ·validation·, when their
occurrence, defaults and type components are checked against an attribute
information item with a matching name and namespace.
For detailed information on attribute declarations, see 3.2 Attribute Declarations.
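Non-normative example: a sketch of a global attribute declaration and a local one with a default value. The names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global attribute declaration: a name and a simple type -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Reference to the global declaration -->
          <xs:attribute ref="lang"/>
          <!-- Local declaration with a default value -->
          <xs:attribute name="level" type="xs:positiveInteger" default="1"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```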
2.2.2.4 Notation Declaration
A notation declaration is an association between a name and an identifier for a
notation. For an attribute information item to be ·valid· with respect to a
NOTATION simple type definition, its value must have been declared
with a notation declaration.
For detailed information on notation declarations, see 3.12 Notation Declarations.
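Non-normative example: a sketch of two notation declarations and a NOTATION-based simple type whose values must name a declared notation. The notation names, identifiers and the picture element are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Each declaration associates a name with a public and/or system identifier -->
  <xs:notation name="jpeg" public="image/jpeg" system="viewer.exe"/>
  <xs:notation name="png" public="image/png"/>

  <!-- A NOTATION type is used through an enumeration of declared notation names -->
  <xs:simpleType name="ImageFormat">
    <xs:restriction base="xs:NOTATION">
      <xs:enumeration value="jpeg"/>
      <xs:enumeration value="png"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="picture">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:hexBinary">
          <xs:attribute name="format" type="ImageFormat"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```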
2.2.3 Model Group Components
The model group, particle, and wildcard components contribute to
the portion of a complex type definition that controls an element
information item's content.
2.2.3.1 Model Group
A model group is a constraint in the form of a grammar fragment that applies to
lists of element information items. It consists of a list of particles, i.e.
element declarations, wildcards and model groups. There are three varieties of
model group:
- Sequence (the element information items
match the particles in sequential order);
- Conjunction (the element information items match the
particles, in any order);
- Disjunction (the element information items match
one of the particles).
For detailed information on model groups, see 3.8 Model Groups.
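Non-normative example: a sketch of the three varieties of model group listed above, written as sequence, all (conjunction) and choice (disjunction) in the XML representation. The type and element names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Sequence: given must come before family -->
  <xs:complexType name="Name">
    <xs:sequence>
      <xs:element name="given" type="xs:string"/>
      <xs:element name="family" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Conjunction: width and height must both appear, in either order -->
  <xs:complexType name="Settings">
    <xs:all>
      <xs:element name="width" type="xs:int"/>
      <xs:element name="height" type="xs:int"/>
    </xs:all>
  </xs:complexType>

  <!-- Disjunction: exactly one of email or phone -->
  <xs:complexType name="Contact">
    <xs:choice>
      <xs:element name="email" type="xs:string"/>
      <xs:element name="phone" type="xs:string"/>
    </xs:choice>
  </xs:complexType>
</xs:schema>
```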
2.2.3.2 Particle
A particle is a term in the grammar for element content, consisting of either an element
declaration, a wildcard or a model group, together with
occurrence constraints. Particles contribute to
·validation· as part of complex type definition ·validation·, when they allow anywhere
from zero to many element information items or sequences thereof, depending on
their contents and occurrence
constraints. [Definition:] A particle can
be used in a complex type definition to constrain the ·validation·
of the [children] of an element information item; such a particle is called
a content model.
For detailed information on particles, see 3.9 Particles.
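Non-normative example: a sketch of particles built from an element declaration, a nested model group and a wildcard, each with its own occurrence constraints (the minOccurs and maxOccurs attributes in the XML representation). The names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Playlist">
    <!-- The outer sequence is the content model of Playlist -->
    <xs:sequence>
      <!-- Element particle: occurs exactly once when no bounds are given -->
      <xs:element name="title" type="xs:string"/>
      <!-- Element particle: optional -->
      <xs:element name="note" type="xs:string" minOccurs="0"/>
      <!-- Model group particle: one or more tracks or pauses, in any mix -->
      <xs:choice minOccurs="1" maxOccurs="unbounded">
        <xs:element name="track" type="xs:string"/>
        <xs:element name="pause" type="xs:duration"/>
      </xs:choice>
      <!-- Wildcard particle: up to three elements from other namespaces -->
      <xs:any namespace="##other" processContents="lax"
              minOccurs="0" maxOccurs="3"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```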
2.2.3.3 Attribute Use
An attribute use plays a role similar to that of a particle, but for
attribute declarations: an attribute declaration within a complex type definition
is embedded within an attribute use, which specifies whether the declaration
requires or merely allows its attribute, and whether it has a default or fixed value.
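Non-normative example: a sketch of four attribute uses within one complex type definition, showing required and optional attributes and default and fixed values. The names and values are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Link">
    <!-- Required: the attribute must be present -->
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
    <!-- Optional: the attribute is merely allowed -->
    <xs:attribute name="title" type="xs:string" use="optional"/>
    <!-- Optional with a default value supplied when absent -->
    <xs:attribute name="target" type="xs:token" default="_self"/>
    <!-- Optional with a fixed value: if present, it must equal 1.0 -->
    <xs:attribute name="version" type="xs:decimal" fixed="1.0"/>
  </xs:complexType>
</xs:schema>
```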
2.2.3.4 Wildcard
A wildcard is a special kind of particle which matches element and attribute information items dependent on their namespace name, independently
of their local names.
For detailed information on wildcards, see 3.10 Wildcards.
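Non-normative example: a sketch of an element wildcard and an attribute wildcard, each selecting items by namespace name only. The target namespace and the envelope element are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.example.com/envelope">
  <xs:element name="envelope">
    <xs:complexType>
      <xs:sequence>
        <!-- Any element whose namespace differs from the target namespace -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Any attribute from the XML namespace, such as xml:lang -->
      <xs:anyAttribute namespace="http://www.w3.org/XML/1998/namespace"
                       processContents="skip"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```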
2.2.4 Identity-constraint Definition Components
An identity-constraint definition is an association between a name and one of
several varieties of
identity-constraint related to uniqueness and reference. All the
varieties use [XPath] expressions to pick out sets of
information items relative to particular target element
information items which are unique, or a key, or a ·valid· reference, within
a specified scope. An element information item is only ·valid· with
respect to an element declaration
with identity-constraint definitions if those definitions are all satisfied for all the descendants
of that element information item which they pick out.
For detailed information on identity-constraint definitions, see 3.11 Identity-constraint Definitions.
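Non-normative example: a sketch of a key and a key reference scoped to a library element. The selector picks out a set of descendant elements and the field picks out the value that must be unique or must match. All names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="isbn" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="loan" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:attribute name="book" type="xs:string" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>

    <!-- Key: every book must have an isbn, unique within this library -->
    <xs:key name="bookKey">
      <xs:selector xpath="book"/>
      <xs:field xpath="@isbn"/>
    </xs:key>

    <!-- Reference: each loan must point at an isbn selected by bookKey -->
    <xs:keyref name="loanRef" refer="bookKey">
      <xs:selector xpath="loan"/>
      <xs:field xpath="@book"/>
    </xs:keyref>
  </xs:element>
</xs:schema>
```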
2.2.5 Group Definition Components
There are two kinds of convenience definitions provided to enable
the re-use of pieces of complex type definitions: model group definitions
and attribute group definitions.
2.2.5.1 Model Group Definition
A model group definition is an association between a name and a model group,
enabling re-use of the same model group in several complex type
definitions.
For detailed information on model group definitions, see 3.7 Model Group Definitions.
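Non-normative example: a sketch of one named model group re-used in two complex type definitions. The names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Model group definition: a name associated with a model group -->
  <xs:group name="AuditFields">
    <xs:sequence>
      <xs:element name="createdBy" type="xs:string"/>
      <xs:element name="createdOn" type="xs:dateTime"/>
    </xs:sequence>
  </xs:group>

  <xs:complexType name="Invoice">
    <xs:sequence>
      <xs:element name="total" type="xs:decimal"/>
      <xs:group ref="AuditFields"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="Receipt">
    <xs:sequence>
      <xs:element name="paid" type="xs:decimal"/>
      <xs:group ref="AuditFields"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```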
2.2.5.2 Attribute Group Definition
An attribute group definition is an association between a name and a set of attribute declarations,
enabling re-use of the same set in several complex type
definitions.
For detailed information on attribute group definitions, see 3.6 Attribute Group Definitions.
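Non-normative example: a sketch of one named set of attribute declarations re-used in two complex type definitions. The names are hypothetical.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Attribute group definition: a name associated with a set of attributes -->
  <xs:attributeGroup name="CommonAttrs">
    <xs:attribute name="id" type="xs:ID"/>
    <xs:attribute name="class" type="xs:NMTOKENS"/>
  </xs:attributeGroup>

  <xs:complexType name="Section">
    <xs:sequence>
      <xs:element name="heading" type="xs:string"/>
    </xs:sequence>
    <xs:attributeGroup ref="CommonAttrs"/>
  </xs:complexType>

  <xs:complexType name="Figure">
    <xs:attribute name="src" type="xs:anyURI" use="required"/>
    <xs:attributeGroup ref="CommonAttrs"/>
  </xs:complexType>
</xs:schema>
```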
2.2.6 Annotation Components
An annotation is information for human and/or mechanical
consumers. The interpretation of such information is
not defined in this specification.
For detailed information on annotations, see 3.13 Annotations.
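Non-normative example: a sketch of an annotation carrying information for human readers (documentation) and for mechanical consumers (appinfo). The hint element, its namespace and its widget attribute are invented for illustration; their interpretation is not defined by this specification.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="temperature" type="xs:decimal">
    <xs:annotation>
      <!-- For human consumers -->
      <xs:documentation xml:lang="en">Temperature in degrees Celsius.</xs:documentation>
      <!-- For mechanical consumers: application-specific content -->
      <xs:appinfo source="http://www.example.com/hints">
        <hint xmlns="http://www.example.com/hints" widget="slider"/>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>
```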
2.3 Constraints and Validation Rules
The [XML 1.0 (Second Edition)] specification describes two kinds of
constraints on XML documents: well-formedness and
validity constraints. Informally, the well-formedness constraints
are those imposed by the definition of XML itself (such as the rules for the
use of the < and > characters and the rules for proper nesting of
elements), while validity constraints are the further constraints on document
structure provided by a particular DTD. The preceding section focused on ·validation·, that is
the constraints on information items which schema components supply. In fact
however this specification provides four different kinds of normative statements about schema
components, their representations in XML and their contribution to the
·validation· of information items:
- Schema Component Constraint
- [Definition:] Constraints on the schema components themselves, i.e.
conditions components must satisfy to be components at all. Located in the
sixth sub-section of the per-component sections of 3 Schema Component Details
and tabulated in C.4 Schema Component Constraints.
- Schema Representation Constraint
- [Definition:] Constraints on the
representation of schema components in XML beyond those which are expressed
in A Schema for Schemas (normative). Located in the
third sub-section of the per-component sections of 3 Schema Component Details
and tabulated in C.3 Schema Representation Constraints.
- Validation Rules
- [Definition:] Contributions to ·validation· associated
with schema components. Located in the
fourth sub-section of the per-component sections of 3 Schema Component Details
and tabulated in C.1 Validation Rules.
- Schema Information Set Contribution
- [Definition:] Augmentations to ·post-schema-validation infoset·s
expressed by schema components, which follow
as a consequence of ·validation· and/or ·assessment·.
Located in the
fifth sub-section of the per-component sections of 3 Schema Component Details
and tabulated in C.2 Contributions to the post-schema-validation infoset.
The last of these, schema information set
contributions, are not as new as they might at first seem. XML 1.0
validation augments the XML 1.0 information set in similar ways,
for example by
providing values for attributes not present in instances, and by implicitly
exploiting type information for normalization or access.
(As an example of the latter case, consider the
effect of NMTOKENS on attribute white space, and the semantics of
ID and IDREF.) By including schema
information set contributions, this specification makes explicit some features
that XML 1.0 left implicit.
2.4 Conformance
This specification describes three levels of conformance for schema-aware processors. The first is
required of all processors. Support for the other two will depend on the application environments
for which the processor is intended.
[Definition:] Minimally conforming processors must completely and
correctly implement the ·Schema Component Constraints·, ·Validation Rules·,
and ·Schema Information Set Contributions· contained in this specification.
[Definition:] ·Minimally conforming· processors which accept
schemas represented in the form of XML documents as described in 4.2 Layer 2: Schema Documents, Namespaces and Composition are
additionally said to provide conformance to the XML Representation of Schemas.
Such processors must, when processing schema documents, completely and
correctly implement all ·Schema Representation
Constraints· in this specification, and must adhere exactly to the
specifications in 3 Schema Component Details for mapping the contents of
such documents to ·schema
components· for use in ·validation· and ·assessment·.
Note: By separating the conformance requirements relating to the concrete syntax of XML schema
documents, this specification admits processors
which use schemas stored in optimized binary
representations, dynamically created schemas represented as programming language data structures, or implementations in which particular schemas are compiled into executable code
such as C or Java. Such processors can be said to be ·minimally conforming· but not necessarily in ·conformance to the XML Representation of Schemas·.
[Definition:] Fully conforming
processors are network-enabled processors which are not only both ·minimally conforming· and ·in conformance to the XML Representation of Schemas·, but which additionally must be capable of accessing
schema documents from the World Wide Web according to 2.7 Representation of Schemas on the World Wide Web and 4.3.2 How schema definitions are located on the Web.
Note: Although this specification provides just these three standard levels of conformance, it is
anticipated that other conventions can be established in the future. For example, the World Wide
Web Consortium is considering conventions for packaging on the Web a variety of
resources relating to individual documents and namespaces. Should such
developments lead to new conventions for representing schemas, or for accessing them on the Web,
new levels of conformance can be established and named at that time. There is no need to modify
or republish this specification to define such additional levels of conformance.
See 4 Schemas and Namespaces: Access and Composition for a more detailed explanation of the
mechanisms supporting these levels of conformance.
2.5 Names and Symbol Spaces
As discussed in 2.2 XML Schema Abstract Data Model, most schema
components (may) have ·names·.
If all such names were assigned from the same "pool", then
it would be impossible to have, for example, a simple type definition and an element
declaration both with the name
"title" in a given ·target namespace·.
Therefore [Definition:] this specification introduces the term
symbol space to denote a
collection of names, each of which is unique with respect to the others. A symbol space is similar to the non-normative concept of namespace partition introduced in [XML-Namespaces].
There is a single distinct symbol space within a given ·target
namespace· for each kind of definition and declaration component
identified in 2.2 XML Schema Abstract Data Model, except that within a target namespace, simple
type definitions and complex type definitions share a symbol space.
Within a given symbol space, names are unique, but the same name may appear in more than one symbol space without conflict. For example, the same name can appear in both a type definition and an element declaration, without conflict or necessary relation between the two.
Locally scoped attribute and element
declarations are special with regard to symbol spaces.
Every complex type definition defines its own local attribute and element declaration symbol
spaces, where these symbol spaces are distinct from each other and from any of the other
symbol spaces. So, for example, two complex type definitions having
the same target namespace can contain
a local attribute declaration for the unqualified name "priority", or contain a local element declaration
for the name "address", without conflict or necessary relation between
the two.
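As a non-normative illustration of the symbol spaces described above, the following sketch reuses the name "title" for both a simple type definition and an element declaration, and declares a local attribute named "priority" in two different complex types. The target namespace http://example.com/ns and the type names TaskType and TicketType are hypothetical, chosen only for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/ns">

  <!-- "title" in the type definition symbol space -->
  <xs:simpleType name="title">
    <xs:restriction base="xs:string">
      <xs:maxLength value="80"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- "title" in the element declaration symbol space: no conflict,
       and no necessary relation to the type of the same name -->
  <xs:element name="title" type="xs:string"/>

  <!-- Each complex type definition has its own local attribute symbol space,
       so both may declare "priority" with unrelated types -->
  <xs:complexType name="TaskType">
    <xs:attribute name="priority" type="xs:positiveInteger"/>
  </xs:complexType>

  <xs:complexType name="TicketType">
    <xs:attribute name="priority" type="xs:string"/>
  </xs:complexType>
</xs:schema>
```

Declaring a second top-level simple or complex type named "title" would be an error, since simple and complex type definitions share one symbol space.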
2.6 Schema-Related Markup in
Documents Being Validated
The XML representation of schema components uses a vocabulary
identified by the namespace name http://www.w3.org/2001/XMLSchema. For brevity, the text and examples in this specification use the prefix
xs: to stand for this namespace; in practice,
any prefix can be used. XML Schema: Structures also defines several attributes for direct use in any XML documents. These attributes are in a different namespace,
which has the namespace name http://www.w3.org/2001/XMLSchema-instance.
For brevity, the text and examples in this specification use the prefix
xsi: to stand for this latter namespace; in practice,
any prefix can be used. All schema processors have appropriate attribute
declarations for these attributes built in, see [],
[], [] and [].
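The point that any prefix can be used may be easier to see in a small sketch. Here the schema namespace is bound to the prefix xsd rather than xs, and the instance namespace to i rather than xsi. The element name note and the file name note.xsd are hypothetical.

```xml
<!-- Schema document: "xsd" is bound to the XML Schema namespace -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="note" type="xsd:string"/>
</xsd:schema>
```

```xml
<!-- Instance document: "i" is bound to the XML Schema instance namespace -->
<note xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
      i:noNamespaceSchemaLocation="note.xsd">Remember the milk</note>
```

Only the namespace names are significant; a processor treats these documents exactly as it would their xs: and xsi: equivalents.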
2.6.2 xsi:nil
XML Schema: Structures introduces a mechanism for signaling that an element should
be accepted as ·valid· when it has no
content despite a content type which does not require or even necessarily allow empty content. An
element may be
·valid· without content if it has the attribute xsi:nil with
the value true. An element so labeled must be empty, but can
carry attributes if permitted by the corresponding complex type.
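A minimal sketch of xsi:nil in use follows. It assumes the element declaration is marked nillable="true", which the element validation rules elsewhere in this specification require before xsi:nil="true" is accepted. The names shipDate and carrier are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- The content type is xs:date, which does not allow empty content -->
  <xs:element name="shipDate" nillable="true">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:date">
          <xs:attribute name="carrier" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

```xml
<!-- Empty, yet valid: xsi:nil is true, and the carrier attribute
     is still permitted by the complex type -->
<shipDate xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:nil="true" carrier="post"/>
```

Adding any character or element content to the nilled shipDate element would make it invalid.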
2.6.3 xsi:schemaLocation, xsi:noNamespaceSchemaLocation
The xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes can be used in a document to provide
hints as to the physical location of schema documents which may be used for ·assessment·.
See 4.3.2 How schema definitions are located on the Web for details on the use of these attributes.
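For orientation, an instance document carrying a location hint might look like the sketch below. The value of xsi:schemaLocation is a list of pairs, each a namespace name followed by a hint for a schema document describing that namespace. The namespace http://example.com/po and the schema URL are hypothetical.

```xml
<po:purchaseOrder
    xmlns:po="http://example.com/po"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.com/po http://example.com/schemas/po.xsd">
  <po:item>widget</po:item>
</po:purchaseOrder>
```

A document whose elements are in no namespace would use xsi:noNamespaceSchemaLocation with a single URI reference instead. In both cases the values are hints: a processor is not obliged to dereference them.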
3 Schema Component Details
3.1 Introduction
The following sections provide full details on the composition of all schema components, together
with their XML representations and their contributions to ·assessment·. Each section is devoted to a single component, with separate subsections for
- properties: their values and significance
- XML representation and the mapping to properties
- constraints on representation
- validation rules
- ·post-schema-validation infoset· contributions
- constraints on the components themselves
The sub-sections immediately below introduce conventions and terminology used throughout the component sections.
3.1.1 Components and Properties
Components are defined in terms of their
properties, and each property in turn is defined by giving its range,
that is the values it may have. This can be understood as
defining a schema as a labeled directed graph, where the root is a schema,
every other vertex is a schema
component or a literal (string, boolean, number) and every labeled edge is a
property. The graph is not acyclic: multiple copies of
components with the same name in the same ·symbol space· may not exist, so in some cases re-entrant chains
of properties must exist. Equality of components for the purposes of this
specification is always defined as equality of names (including target
namespaces) within symbol spaces.
Note: A schema and its components as defined in this chapter are an idealization of the information a schema-aware
processor requires: implementations are not constrained in how they provide
it. In particular, no implications about literal embedding versus indirection
follow from the use below of language such as "properties . . . having . . .
components as values".
[Definition:] Throughout this specification, the
term absent is used as a distinguished property value denoting absence. Any property not
identified as optional is required to be present; optional properties which are
not present are taken to have ·absent· as their value. Any
property identified as having a set, subset or list value may have an empty value unless this is explicitly
ruled out: this is not the same as ·absent·. Any property value identified as a superset or subset of some set may be equal to that set, unless a proper superset or subset is explicitly called for.
By 'string' in Part 1 of this specification is meant a
sequence of ISO 10646 characters identified as legal XML characters
in [XML 1.0 (Second Edition)].
3.1.2 XML Representations of Components
The principal purpose of XML Schema: Structures is to define a set of
schema components that constrain the contents of instances and augment the
information sets thereof. Although no external representation
of schemas is required for this purpose, such representations will
obviously be widely used. To provide for this in an appropriate and
interoperable way, this specification provides a normative XML representation for schemas which
makes provision for every kind of schema
component.
[Definition:] A document in
this form (i.e. a <schema> element information item) is a schema document. For the schema document as a whole, and
its constituents, the sections below define correspondences between element
information items (with declarations in
A Schema for Schemas (normative) and G DTD for Schemas (non-normative)) and
schema components. All the element information items in the XML representation
of a schema must be in the XML Schema namespace, that is their [namespace name] must be http://www.w3.org/2001/XMLSchema. Although a common way of creating the XML Infosets which are or contain ·schema documents· will be using an XML parser, this is not required: any mechanism which constructs conformant infosets as defined in [XML-Infoset] is a possible starting point.
Two aspects of the XML representations of components presented in the
following sections are constant across them all:
- All of them allow attributes qualified with namespace names other than
the XML Schema namespace itself: these appear as annotations in the
corresponding schema component;
- All of them allow an <annotation> as their first child, for human-readable documentation and/or machine-targeted information.
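Both constant aspects can be seen in one small, non-normative sketch: an element declaration that carries an attribute from a foreign namespace and has an <annotation> as its first child. The namespace http://example.com/doc, the attribute doc:owner and the element name invoice are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:doc="http://example.com/doc">

  <!-- doc:owner is not in the XML Schema namespace; it is carried
       through as an annotation on the corresponding component -->
  <xs:element name="invoice" type="xs:string" doc:owner="billing-team">
    <xs:annotation>
      <xs:documentation>Human-readable description of the element.</xs:documentation>
      <xs:appinfo>machine-targeted information</xs:appinfo>
    </xs:annotation>
  </xs:element>
</xs:schema>
```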
3.1.3 The Mapping between XML Representations and Components
For each kind of schema component there is a corresponding normative XML representation.
The sections below describe the correspondences between the properties of each kind of
schema component on the one hand and the properties of information items in
that XML representation on the other, together
with constraints on that representation above and beyond those implicit in
A Schema for Schemas (normative). The language used is as if the correspondences were mappings from XML representation to
schema component, but the mapping in the other direction, and therefore the
correspondence in the abstract, can always be
constructed therefrom.
In discussing the mapping from XML representations to schema
components below, the value of a component property is often determined by the
value of an attribute information item, one of the [attributes] of an element
information item. Since schema documents are constrained by
A Schema for Schemas (normative), there is always a simple type
definition associated with any such attribute information item. [Definition:] The
phrase actual value is used to refer to the member of the value space of the
simple type definition associated with an attribute information item which corresponds to
its ·normalized value·. This will often be a string, but may also be an
integer, a boolean, a URI reference, etc. This term is also occasionally used with respect to element or attribute information items in a document being ·validated·.
Many properties are identified below as having
other schema components or sets of components as values. For the purposes of exposition, the definitions in
this section assume that (unless the property is explicitly identified as
optional) all such values are in fact present. When schema
components are constructed from XML representations involving reference by name
to other components, this assumption may be violated if one or more references
cannot be resolved. This specification addresses the matter of missing
components in a uniform manner, described in 5.3 Missing Sub-components: no mention of
handling missing components will be found in the individual component
descriptions below.
Forward reference to named definitions and declarations is
allowed, both within and between ·schema documents·.
By the time the component corresponding to an XML representation which
contains a forward reference is actually needed for ·validation· an appropriately-named component may have become available to discharge the reference: see 4 Schemas and Namespaces: Access and Composition for details.
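A forward reference within a single schema document can be sketched as follows: the element declaration names a type whose definition appears later in the same document. The names sku and SkuType and the pattern are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Refers to SkuType before its definition has been seen -->
  <xs:element name="sku" type="SkuType"/>

  <!-- The definition that discharges the reference above -->
  <xs:simpleType name="SkuType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[A-Z]{3}-\d{4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

The same holds between schema documents: the definition of SkuType could equally arrive through a separate document assembled into the same schema.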
3.1.4 White Space Normalization during Validation
Throughout this specification, [Definition:] the
initial value of some
attribute information item is the value of the
[normalized
value] property of that item. Similarly, the initial value of an element information item is the string composed of, in order, the
[character code] of each character information item in the [children] of that
element information item. The above definition means that comments and processing instructions,
even in the midst of text, are ignored for all ·validation· purposes.
[Definition:] The
normalized value of an element or
attribute information item is an ·initial value· whose white space, if any, has been
normalized according to the value of the whiteSpace facet of the
simple type definition used in its ·validation·:
- preserve
- No normalization is done; the ·initial value· is the ·normalized value·.
- replace
- All occurrences of
#x9 (tab), #xA (line feed) and
#xD (carriage return) are replaced with #x20 (space).
- collapse
- Subsequent to the replacements specified above under replace,
contiguous sequences of
#x20s are collapsed to a single
#x20, and initial and/or final #x20s are deleted.
If the simple type definition used in an item's ·validation· is the ·simple ur-type definition·, the ·normalized value· must be determined as in the preserve case above.
There are three alternative validation rules which may supply the
necessary background for the above: [] (clause 3), [] (clause 3.1.3) or [] (clause 2.2).
These three levels of normalization correspond to the processing mandated
in XML 1.0 for element content, CDATA attribute content and tokenized
attribute content, respectively. See Attribute Value Normalization in [XML 1.0 (Second Edition)] for the precedent for replace and collapse for attributes. Extending this processing to element content is necessary to ensure a consistent ·validation· semantics for simple types, regardless of whether they are applied to attributes or elements. Performing it twice in the case of attributes whose [normalized
value] has already been subject to replacement or collapse on the basis of
information in a DTD is necessary to ensure consistent treatment of attributes
regardless of the extent to which DTD-based information has been made use of
during infoset construction.
Note: Even when DTD-based information has been appealed to, and
Attribute Value
Normalization has taken place, the above definition of ·normalized value· may
mean further normalization takes place, as for instance when
character entity references in attribute values result in white space characters
other than spaces in their ·initial value·s.
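The three whiteSpace behaviours can be compared with a small, non-normative sketch. It relies on the built-in types of XML Schema Part 2: Datatypes, where xs:string has whiteSpace preserve, xs:normalizedString has replace and xs:token has collapse. The element names are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="sample">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="kept" type="xs:string"/>
        <xs:element name="replaced" type="xs:normalizedString"/>
        <xs:element name="collapsed" type="xs:token"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

```xml
<!-- Each child has the same initial value:
     tab, "red", two spaces, "green", line feed -->
<sample>
  <!-- preserve: the normalized value is the initial value, unchanged -->
  <kept>&#x9;red  green&#xA;</kept>
  <!-- replace: space, "red", two spaces, "green", space -->
  <replaced>&#x9;red  green&#xA;</replaced>
  <!-- collapse: "red", one space, "green" -->
  <collapsed>&#x9;red  green&#xA;</collapsed>
</sample>
```

Because the tab and line feed are written as character references, they reach the initial value intact, which is the situation the note above describes for attribute values.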
3.2 Attribute Declarations
Attribute declarations provide for:
- Local ·validation· of attribute information item values using a simple type definition;
- Specifying default or fixed values for attribute information items.
<xs:attribute name="age" type="xs:positiveInteger" use="required"/>
The XML representation of an attribute declaration.
3.2.1 The Attribute Declaration Schema Component
The attribute declaration schema component has the following
properties: {name}, {target namespace}, {type definition}, {scope}, {value constraint} and {annotation}.
The {name} property must match the local part of the names of attributes being ·validated·.
The value of the attribute must conform to the supplied {type definition}.
A non-·absent· value of the {target namespace} property provides for ·validation· of
namespace-qualified attribute information items (which must be explicitly
prefixed in the character-level form of XML documents). ·Absent· values of
{target namespace} ·validate· unqualified (unprefixed) items.
A {scope} of global identifies attribute declarations
available for use in complex type definitions throughout the schema. Locally scoped declarations are available for use only within the
complex type definition identified by the {scope} property. This property is ·absent· in the case of declarations within attribute group definitions: their scope will be determined when they are used in the construction of complex type definitions.
{value constraint} reproduces the functions of XML 1.0 default and #FIXED
attribute values. default specifies that the attribute is to appear unconditionally in
the ·post-schema-validation infoset·, with the supplied value used
whenever the attribute is not actually present; fixed indicates that the attribute value if present must equal the supplied
constraint value, and if absent receives the supplied value as for
default. Note that it is values that are supplied and/or
checked, not strings.
See 3.13 Annotations for information on the role of the
{annotation} property.
[XML-Infoset] distinguishes attributes with names such as xmlns or xmlns:xsl from
ordinary attributes, identifying them as [namespace attributes]. Accordingly, it is unnecessary and in fact not possible for
schemas to contain attribute declarations corresponding to such
namespace declarations, see []. No means is provided in
this specification to supply a
default value for a namespace declaration.
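The difference between default and fixed, and the point that values rather than strings are compared, can be sketched as follows. The element name price and the attribute names currency and scale are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="price">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:decimal">
          <!-- Supplied as "USD" whenever the attribute is not present -->
          <xs:attribute name="currency" type="xs:string" default="USD"/>
          <!-- If present, the value must equal the decimal value 1.0 -->
          <xs:attribute name="scale" type="xs:decimal" fixed="1.0"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under these declarations, <price>9.99</price> is valid and acquires both attributes in the ·post-schema-validation infoset·. <price scale="1.00">9.99</price> is also valid, because 1.00 and 1.0 denote the same decimal value even though the strings differ. <price scale="2">9.99</price> is not valid.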
3.2.2 XML Representation of Attribute Declaration Schema Components
The XML representation for an attribute declaration schema component is an
<attribute> element information item. It specifies a simple type
definition for an attribute either by reference or explicitly, and may provide default information. The correspondences between the
properties of the information item and
properties of the component are as follows:
<attribute
  default = string
  fixed = string
  form = formChoice
  id = ID
  name = NCName
  ref = QName
  type = QName
  use = (optional | prohibited | required) : optional
  {any attributes with non-schema namespace . . .}>
  Content: (annotation?, simpleType?)
</attribute>
If the <attribute> element information item has <schema> as its parent, the corresponding schema component is as follows:
Attribute Declaration Schema Component
| Property | Representation |
|---|---|
| {name} | The ·actual value· of the name [attribute] |
| {target namespace} | The ·actual value· of the targetNamespace [attribute] of the parent <schema> element information item, or ·absent· if there is none. |
| {type definition} | The simple type definition corresponding to the <simpleType> element information item in the [children], if present, otherwise the simple type definition ·resolved· to by the ·actual value· of the type [attribute], if present, otherwise the ·simple ur-type definition·. |
| {scope} | global. |
| {value constraint} | If there is a default or a fixed [attribute], then a pair consisting of the ·actual value· (with respect to the {type definition}) of that [attribute] and either default or fixed, as appropriate, otherwise ·absent·. |
| {annotation} | The annotation corresponding to the <annotation> element information item in the [children], if present, otherwise ·absent·. |
otherwise if the <attribute> element information item has
<complexType> or <attributeGroup> as an ancestor
and the ref [attribute] is absent, it corresponds to an
attribute use with properties as follows (unless use='prohibited', in which case the item
corresponds to nothing at all):
Attribute Declaration Schema Component
| Property | Representation |
|---|---|
| {name} | The ·actual value· of the name [attribute] |
| {target namespace} | If form is present and its ·actual value· is qualified, or if form is absent and the ·actual value· of attributeFormDefault on the <schema> ancestor is qualified, then the ·actual value· of the targetNamespace [attribute] of the ancestor <schema> element information item, or ·absent· if there is none, otherwise ·absent·. |
| {type definition} | The simple type definition corresponding to the <simpleType> element information item in the [children], if present, otherwise the simple type definition ·resolved· to by the ·actual value· of the type [attribute], if present, otherwise the ·simple ur-type definition·. |
| {scope} | If the <attribute> element information item has <complexType> as an ancestor, the complex type definition corresponding to that item, otherwise (the <attribute> element information item is within an <attributeGroup> definition), ·absent·. |
| {value constraint} | ·absent·. |
| {annotation} | The annotation corresponding to the <annotation> element information item in the [children], if present, otherwise ·absent·. |
otherwise (the <attribute> element information item has
<complexType> or <attributeGroup> as an ancestor and the
ref [attribute] is present), it corresponds to an
attribute use with properties as follows (unless use='prohibited', in which case the item
corresponds to nothing at all):
Attribute declarations can appear at the top level of a schema document, or within complex
type definitions, either as complete (local) declarations, or by reference to top-level
declarations, or within attribute group definitions.
For complete declarations, top-level or local, the type attribute is used when the declaration can use a
built-in or pre-declared simple type definition. Otherwise an
anonymous <simpleType> is provided inline. The default when no simple type definition is referenced or
provided is the ·simple ur-type definition·, which imposes no constraints at all.
Attribute information items ·validated· by a top-level declaration must be qualified with the
{target namespace} of that declaration (if this is ·absent·, the item must be unqualified). Control over whether attribute information items
·validated· by a local declaration must be similarly qualified or not
is provided by the form [attribute], whose default is provided
by the attributeFormDefault [attribute] on the enclosing <schema>, via its determination of {target namespace}.
The names for top-level attribute declarations are in their own
·symbol space·. The names of locally-scoped
attribute declarations reside in symbol spaces local to the type definition which contains
them.
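The placements described above can be brought together in one non-normative sketch: a top-level declaration, a reference to it, a local declaration with an inline anonymous <simpleType>, a local declaration with form="qualified", and a declaration inside an attribute group. The target namespace http://example.com/ns and all component names are hypothetical.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ex="http://example.com/ns"
           targetNamespace="http://example.com/ns"
           attributeFormDefault="unqualified">

  <!-- Top-level declaration: instances must use the qualified name -->
  <xs:attribute name="lang" type="xs:language"/>

  <!-- Declaration within an attribute group definition -->
  <xs:attributeGroup name="audit">
    <xs:attribute name="createdBy" type="xs:string"/>
  </xs:attributeGroup>

  <xs:complexType name="ParagraphType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <!-- By reference to the top-level declaration -->
        <xs:attribute ref="ex:lang"/>
        <!-- Local declaration with an anonymous inline simple type -->
        <xs:attribute name="align">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="left"/>
              <xs:enumeration value="right"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
        <!-- Local declaration overriding attributeFormDefault -->
        <xs:attribute name="revision" type="xs:positiveInteger" form="qualified"/>
        <xs:attributeGroup ref="ex:audit"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <xs:element name="para" type="ex:ParagraphType"/>
</xs:schema>
```

```xml
<!-- lang and revision are qualified; align and createdBy are not -->
<ex:para xmlns:ex="http://example.com/ns"
         ex:lang="en" align="left" ex:revision="2" createdBy="editor">Hello</ex:para>
```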
3.2.3 Constraints on XML Representations of Attribute Declarations
Schema Representation Constraint: Attribute Declaration Representation OK
In addition to the conditions imposed on <attribute> element
information items by the schema for schemas,
all of the following must be true:
1 default and fixed must not both be present.
2 If default and use are both present, use must have the ·actual value· optional.
3 If the item's parent is not <schema>, then all of the following must be true:
3.1 One of ref or name must be present, but not both.
3.2 If ref is present, then all of <simpleType>, form and type must be absent.
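To make the clauses above concrete, the following sketch shows one declaration violating each of them. It is well-formed XML but deliberately not a valid schema document; the names BrokenType, unit, a and b are hypothetical.

```xml
<!-- Deliberately INVALID schema document, for illustration only -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:attribute name="unit" type="xs:string"/>

  <xs:complexType name="BrokenType">
    <!-- Violates clause 1: default and fixed are both present -->
    <xs:attribute name="a" type="xs:string" default="x" fixed="x"/>

    <!-- Violates clause 2: default is present but use is not optional -->
    <xs:attribute name="b" type="xs:string" default="x" use="required"/>

    <!-- Violates clause 3.1: neither ref nor name is present -->
    <xs:attribute type="xs:string"/>

    <!-- Violates clause 3.2: ref is combined with type -->
    <xs:attribute ref="unit" type="xs:string"/>
  </xs:complexType>
</xs:schema>
```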
3.2.4 Attribute Declaration Validation Rules
Validation Rule: Attribute Locally Valid
For an attribute information item to be locally ·valid· with respect to an
attribute declaration
all of the following must be true:
Validation Rule: Schema-Validity Assessment (Attribute)
The schema-validity assessment of an attribute information item depends
on its ·validation· alone.
[Definition:] During ·validation·, associations
between element and attribute information items among the [children]
and [attributes] on the one hand, and element and attribute
declarations on the other, are established as a side-effect. Such
declarations are called the context-determined declarations.
See clause 3.1 (in []) for
attribute declarations, clause 2 (in []) for element
declarations.
For an attribute information item's schema-validity to have been assessed
all of the following must be true:
1 A non-·absent· attribute declaration must be known for it, namely one of the following:
2 Its ·validity· with respect to that declaration must have been evaluated as per [].
3 Both clause 1 and clause 2 of [] must be satisfied.
[Definition:] For attributes, there is no
difference between assessment and strict assessment, so if the above holds, the attribute information item has been strictly assessed.
3.2.5 Attribute Declaration Information Set Contributions
Schema Information Set Contribution: Assessment Outcome (Attribute)
Schema Information Set Contribution: Validation Failure (Attribute)
Schema Information Set Contribution: Attribute Declaration
If an attribute information item is ·valid· with respect to an attribute
declaration as per [] then in the ·post-schema-validation infoset· the attribute
information item may, at processor option, have a property: