Why do I get errors for multiple @xmlns on /html element?

george
Site Admin
Posts: 2095
Joined: Thu Jan 09, 2003 2:58 pm

Re: Why do I get errors for multiple @xmlns on /html element?

Post by george »

Hi,

The W3C Validator is an application that tries to detect the document type automatically, and in some cases it fails. However, you can test a document against a document type of your choice: click More Options and then select XHTML+RDFa in the Use Doctype combo box. If you do that with the document from this topic that has the DOCTYPE declaration commented out, it will pass validation.

oXygen is also an application, and it allows you to validate documents against XML Schema, DTD or other types of schemas. By default, the engine used for validating against XML Schema, for example, is Xerces, but you can change that to Saxon SA or another engine. It is not a bug that the DTD information is still used when you choose to prefer a schema for validation even if a DTD is specified. It was designed to work this way to allow, for example, using entities with other schema languages. The validation results depend on the schema or DTD that is used. In this topic, the DTD and the schema are not synchronized.

Hope that helps,
George
George Cristian Bina
sampablokuper
Posts: 22
Joined: Tue Apr 08, 2008 1:41 pm

Re: Why do I get errors for multiple @xmlns on /html element?

Post by sampablokuper »

george wrote: The W3C Validator is an application that tries to detect the document type automatically, and in some cases it fails. However, you can test a document against a document type of your choice: click More Options and then select XHTML+RDFa in the Use Doctype combo box. If you do that with the document from this topic that has the DOCTYPE declaration commented out, it will pass validation.
Right, but if I select XHTML+RDFa in the Doctype combo box, this just adds a doctype line to the file and validates the result (which is something Oxygen won't validate). For instance, if I paste the last example you gave into the W3C validator and override the doctype, the result is this:

Code:

<?xml version="1.0" encoding="UTF-8"?>
<!--<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/1999/xhtml http://www.w3.org/MarkUp/SCHEMA/xhtml-rdfa-1.xsd"
xmlns:con="http://www.w3.org/2000/10/swap/pim/contact#"
xmlns="http://www.w3.org/1999/xhtml"
xmlns:rec="http://www.w3.org/2001/02pd/rec54#"
xmlns:dct="http://purl.org/dc/terms/"
xmlns:mat="http://www.w3.org/2002/05/matrix/vocab#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:doc="http://www.w3.org/2000/10/swap/pim/doc#"
xmlns:org="http://www.w3.org/2001/04/roadmap/org#">
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<title>SKOS Simple Knowledge Organisation System Reference, Editors' Draft
1 October 2008 $Revision: 1.38 $</title>
<meta name="generator" content="Amaya 9.54, see http://www.w3.org/Amaya/" />
<link href="extras.css" rel="stylesheet" type="text/css" />
<link href="http://www.w3.org/StyleSheets/TR/base" rel="stylesheet"
type="text/css" />
<script type="text/javascript" src="extras.js"></script>
</head>
<body>
</body>
</html>
which Oxygen does not consider valid due to the xml:space issue mentioned earlier in this thread. Even the W3C validator isn't totally happy: it generates the warning: 'The DOCTYPE Declaration for "XHTML + RDFa" has been inserted at the start of the document, but even if no errors are shown below the document will not be Valid until you add the new DOCTYPE Declaration.'

As far as I can tell, this is due to the W3C validator implementing an out-of-date draft of the RDFa in XHTML syntax spec, which required the presence of a doctype. The latest version does not require a doctype to be present, and the sample document presented (shown below) is not validated by the W3C validator unless the XHTML+RDFa doctype override is selected, much as was the case for your sample document.

Code:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
version="XHTML+RDFa 1.0"
xml:lang="en">
<head>
<title>Virtual Library</title>
</head>
<body>
<p>Moved to <a href="http://example.org/">example.org</a>.</p>
</body>
</html>
So w.r.t. the latest draft, you were quite right about commenting out the doctype; w.r.t. the older one, you weren't, and it wasn't until I noticed the change in the draft that I realised what was going on.
Thank you for giving me the most up-to-date solution :)
george wrote: oXygen is also an application, and it allows you to validate documents against XML Schema, DTD or other types of schemas. By default, the engine used for validating against XML Schema, for example, is Xerces, but you can change that to Saxon SA or another engine.
Just to check my understanding: all of these should conclude with the same result as each other, though, shouldn't they? I realise that Saxon SA might be faster, but the outcome should be the same, shouldn't it, unless one of the engines has a bug? (I don't have a license for Saxon SA though, so I can't test this.)
george wrote: It is not a bug that the DTD information is still used when you choose to prefer a schema for validation even if a DTD is specified. It was designed to work this way to allow, for example, using entities with other schema languages. The validation results depend on the schema or DTD that is used. In this topic, the DTD and the schema are not synchronized.
I'm afraid I don't understand this. If selecting the "Ignore the DTD for validation if a schema is specified" option is supposed to make Oxygen ignore validation information in the DTD and use it for entity specifications only, then why is Oxygen still objecting to the presence of xml:space attributes in script elements? Surely the relevant part of http://www.w3.org/MarkUp/DTD/xhtml-script-1.mod (which is linked from http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd) counts as validation information that should be ignored?

That is to say, I realise that http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd and http://www.w3.org/MarkUp/SCHEMA/xhtml-rdfa-1.xsd are not synchronized exactly, but surely ticking the "Ignore the DTD for validation if a schema is specified" checkbox should overcome this problem.

If it did, then I could create a document that would validate in both the W3C validator (as it stands) and in Oxygen, which would mean I'd benefit from Oxygen's validation while I'm developing my code, and my site's users could check my code's validity (against the older standard) by way of "validation" buttons on my site. Until the W3C updates its validator, that would be the best of both worlds :)

Again, many thanks indeed for the help!

Sam

Re: Why do I get errors for multiple @xmlns on /html element?

Post by george »

Please note that oXygen version 10 includes a licensed Saxon SA so you can use that from oXygen without the need for an additional license.

Different engines may give different results, in some cases due to bugs and in other cases due to different interpretations of the specification. Different applications may also give different results, either because they use different engines or because an application processes a file differently or applies different validations to it. That is the case in this topic: with the W3C Validator you are probably validating against a DTD, while with oXygen you validate against a schema, and that schema matches a different set of documents than the DTD.

No, "Ignore the DTD for validation if a schema is specified" just removes the DTD validator; all other DTD processing still takes place, and that includes entity handling and default attribute values. For example, DTDs are not namespace aware: to a DTD, namespace declarations are just attributes, and such declarations can be defined as defaults in the DTD. If the DTD information were ignored completely, the document might not be namespace well-formed. Take this document:

<test>
<x:sample/>
</test>

with a DTD that defines xmlns:x as an attribute on x:sample with a default value of http://www.example.com/sample. If you parse that without the DTD, the x prefix is never declared, so the document is not namespace well-formed and a namespace-aware parser will reject it.
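
To make this concrete, here is a minimal sketch of the same document with the declaration placed in an internal DTD subset, so that it is self-contained. The two element declarations are assumptions added only for illustration; the xmlns:x default and its value are the ones described above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test [
  <!-- Hypothetical content models, present only to make the DTD complete -->
  <!ELEMENT test (x:sample)>
  <!ELEMENT x:sample EMPTY>
  <!-- To the DTD this is an ordinary attribute with a default value;
       to a namespace-aware parser it is the declaration of the x prefix -->
  <!ATTLIST x:sample xmlns:x CDATA "http://www.example.com/sample">
]>
<test>
  <x:sample/>
</test>
```

A parser that reads the DTD adds the defaulted xmlns:x attribute to x:sample, so the prefix resolves. Remove the DOCTYPE declaration and nothing declares the prefix any more.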

What you can do, however, is use an XML catalog to map the DTD to a local copy, which can be just a dummy DTD (even an empty file will work, I believe). oXygen will then validate against the schema and will also read the DTD, but because the dummy DTD contains no defaults, the validation against the schema will work.
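
A catalog along these lines could do the mapping. This is only a sketch in the OASIS XML Catalogs format: the file name empty.dtd and its location next to the catalog are assumptions, while the public and system identifiers are the ones from the DOCTYPE declaration quoted earlier in this thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Resolve the XHTML+RDFa public identifier to a local dummy DTD
       (empty.dtd is a hypothetical file name) -->
  <public publicId="-//W3C//DTD XHTML+RDFa 1.0//EN"
          uri="empty.dtd"/>
  <!-- Also cover resolvers that look the DTD up by its system identifier -->
  <system systemId="http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd"
          uri="empty.dtd"/>
</catalog>
```

The catalog file still has to be added to the list of catalogs that oXygen consults before the mapping takes effect.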

Best Regards,
George
George Cristian Bina

Re: Why do I get errors for multiple @xmlns on /html element?

Post by sampablokuper »

george wrote: Please note that oXygen version 10 includes a licensed Saxon SA so you can use that from oXygen without the need for an additional license.
I'm still using 9.3, but I'll look into getting a license for 10. Thanks for the tip.
george wrote: What you can do, however, is use an XML catalog to map the DTD to a local copy, which can be just a dummy DTD (even an empty file will work, I believe). oXygen will then validate against the schema and will also read the DTD, but because the dummy DTD contains no defaults, the validation against the schema will work.
Ah, that's a crafty workaround. I'll give that a try when I have time.

Thanks again,

Sam
halindrome
Posts: 1
Joined: Thu Jan 08, 2009 5:11 pm

Re: Why do I get errors for multiple @xmlns on /html element?

Post by halindrome »

I am the principal maintainer of the XHTML Family DTDs and Schema at the W3C. Let me see if I can shed a little light here.

The W3C Validator uses the DTD implementations to validate documents. That validator has been enhanced to ignore warnings about xmlns: attributes. While those warnings are strictly correct (the DTD does not declare the attributes with those names) the warnings are useless and INCORRECT when namespace support is required. Since all XHTML family markup languages require namespace support, it seemed logical to update the validator to also have this support - albeit in a hackish manner. My recommendation would be that Oxygen implement a similar strategy.

W.r.t. xml:space, all XHTML family markup languages are required to have xml:space set to "preserve" on all elements. See XHTML Modularization section 3.5 clause 8, and Appendix I. The xml:space attribute defines how an XML processor treats data on input. XHTML requires that the data is preserved intact and passed to the underlying application for further processing. I believe that any discrepancy with regard to xml:space treatment is a red herring - there is no conflict here. If there are errors in my implementation of the DTD or Schema, please let me know via mail to www-html-editor@w3.org so we can address them formally.

Re: Why do I get errors for multiple @xmlns on /html element?

Post by sampablokuper »

Readers of this forum may also wish to view the mailing list thread on the issue, which begins here.