unvalid NVDL - dtd schema type

Posted: Thu Jul 21, 2011 1:31 pm
by matthieu.ricaud
Hello,

I'm just starting with NVDL.

I want to validate XHTML with a DTD.

Code: Select all

<validate schema="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd" schemaType="application/xml-dtd"/>
This raises a timeout error because the W3C web site prevents tools (other than web browsers) from accessing the DTD.

So I tried the local copy of the DTD in the oXygen frameworks folder:

Code: Select all

<validate schema="file:///C:/Program%20Files/Oxygen%20XML%20Editor%2012/frameworks/xhtml11/dtd/xhtml11-flat.dtd" schemaType="application/xml-dtd"/>
I get these errors:
SystemID: C:\Program Files\Oxygen XML Editor 12\frameworks\xhtml11\dtd\xhtml11-flat.dtd
Engine name: Jing
Severity: fatal
Description: The markup in the document preceding the root element must be well-formed.
Start location: 34:3

SystemID: C:\Program Files\Oxygen XML Editor 12\frameworks\xhtml11\dtd\xhtml11-flat.dtd
Engine name: Jing
Severity: error
Description: SAXParseException - The markup in the document preceding the root element must be well-formed.
Start location: 34:3

It seems Jing tries to validate the DTD as if it were an XML file?

Any ideas?

Matthieu.

Re: unvalid NVDL - dtd schema type

Posted: Thu Jul 21, 2011 5:39 pm
by george
The NVDL implementation that we use (initially developed as part of the oNVDL project and now contributed to Jing) does not support DTDs as a schema language inside an NVDL script. It works only with XML Schema 1.0, Relax NG (compact and XML syntax) and pre-ISO Schematron.

Please use the Relax NG schema for XHTML instead of the DTD, for example you can refer to that as http://www.w3.org/1999/xhtml/xhtml.rng (oXygen will automatically resolve this to a local copy [oxygen]/frameworks/xhtml/relaxng/xhtml.rng).
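
A minimal sketch of an NVDL script that does this, assuming the root element of the document is in the XHTML namespace (the schema URL is the one given above; the rest is illustrative):

```xml
<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0">
  <!-- Sections in the XHTML namespace are validated with the Relax NG schema. -->
  <namespace ns="http://www.w3.org/1999/xhtml">
    <validate schema="http://www.w3.org/1999/xhtml/xhtml.rng"/>
  </namespace>
</rules>
```

A schemaType attribute should not be needed here, since a Relax NG schema in XML syntax can be recognised from its namespace.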

Note also that DTDs are not namespace aware. Also, as a general workaround in case there is no other type of schema available, you can use Trang (from oXygen you can trigger it with the Tools -> Generate/Convert Schema) to convert from DTD to Relax NG and then use the Relax NG schema in your NVDL script.

Best Regards,
George

Re: unvalid NVDL - dtd schema type

Posted: Thu Jul 21, 2011 6:43 pm
by matthieu.ricaud
Thanks for the explanation.

I found an example of NVDL used with a DTD in this tutorial: http://www.dpawson.co.uk/nvdl/syntax.html#syntax.ex1

But as you said, DTDs are not namespace aware, so I will go on with XSD and/or Relax NG (by the way, does oXygen support the Relax NG compact syntax, which like DTD is not XML?).

In my case, I have to validate an XML document in which the textual parts are written in XHTML. This is how it looks:

Code: Select all

<foo:forms xmlns:foo="http://www.foo.com" xmlns="http://www.w3.org/1999/xhtml">
<foo:form id="form1" type="TrueFalseList">
<p>Veuillez répondre au question suivantes :</p>
<foo:assertion id="form1q1" is="true">
<p>
<em>Jean</em> est plus grand que <em>Paul</em> :
<foo:input type="radio" value="true"><strong>Vrai</strong></foo:input>
&#160;&#160;
<foo:input type="radio" value="false"><strong>Faux</strong></foo:input>
</p>
</foo:assertion>
<foo:assertion id="form1q2" is="false">
<p>
<em>Jean-Pierre</em> est plus petit que Paul :
<foo:input type="radio" value="true"><strong>Vrai</strong></foo:input>
&#160;&#160;
<foo:input type="radio" value="false"><strong>Faux</strong></foo:input>
</p>
</foo:assertion>
<p><foo:input type="submit" value="Vérifier vos réponses"/></p>
</foo:form>
</foo:forms>
That means there is no <html> root element anywhere. Let's say XHTML elements may appear inside or in between any of the <foo:*> elements. Those XHTML elements must be valid in the <body> context of the XHTML grammar.

Is there a way to define such a context (XPath context = "/html/body") with NVDL?

I tried to point to [oxygen]/frameworks/xhtml/relaxng/modules/text.rng, but of course Jing complains that there is no <start> element (and about some broken module bindings).

A DTD doesn't define any root element; maybe that's its only "advantage"?

Regards,

Matthieu Ricaud.

Re: unvalid NVDL - dtd schema type

Posted: Fri Jul 22, 2011 12:50 pm
by george
Hi,

I do not know whether NVDL specifically excludes DTDs; one NVDL implementation, JNVDL, supports them. The oNVDL/Jing implementation does not.

You can define a Relax NG schema that includes the XHTML schema and redefines the start pattern to be a choice of the patterns that allow the elements that you want to appear in your document, for example:

Code: Select all


<?xml version="1.0" encoding="UTF-8"?>
<grammar
xmlns="http://relaxng.org/ns/structure/1.0"
xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

<include href="http://www.w3.org/1999/xhtml/xhtml.rng">
<start>
<choice>
<ref name="p"/>
<ref name="div"/>
<!-- etc. -->
</choice>
</start>
</include>
</grammar>
Then you can refer to this schema in your NVDL script. For example, the following script gathers all the content in the foo namespace and allows it (you can replace the allow action with a validate action against your schema), then gathers each XHTML subtree and validates it against the test.rng schema above:

Code: Select all


<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" startMode="foo">
<mode name="foo">
<namespace ns="http://www.foo.com">
<allow useMode="attachFoo"/>
<unwrap useMode="xhtml"/>
</namespace>
</mode>
<mode name="attachFoo">
<namespace ns="http://www.foo.com"><attach/></namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
<mode name="xhtml">
<namespace ns="http://www.w3.org/1999/xhtml">
<validate schema="test.rng" useMode="attachXHTML"/>
</namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
<mode name="attachXHTML">
<namespace ns="http://www.w3.org/1999/xhtml"><attach/></namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
</rules>
Best Regards,
George

Re: unvalid NVDL - dtd schema type

Posted: Fri Jul 22, 2011 3:01 pm
by matthieu.ricaud
My own search was heading in the same direction.

I have to stop for now, but I will report back soon on the test.

Thanks for your help.

Regards,
Matthieu

Re: unvalid NVDL - dtd schema type

Posted: Mon Jul 25, 2011 4:53 pm
by matthieu.ricaud
Hi,

I have tested your solution and it works fine. I still have a few questions.

1) XHTML within <div> validation
For my XHTML part, I wanted to allow the same content as a div element, that is, any content that might appear within a div (text, span, div, table, ...).
According to the XHTML RNG schema, a div is defined by:

Code: Select all

<define name="div">
<element name="div">
<ref name="div.attlist"/>
<ref name="Flow.model"/>
</element>
</define>
and Flow.model is defined by :

Code: Select all

<define name="Flow.model">
<zeroOrMore>
<choice>
<text/>
<ref name="Inline.class"/>
<ref name="Block.class"/>
</choice>
</zeroOrMore>
</define>
In test.rng, I can't use "Flow.model" as the "start" pattern because:
- zeroOrMore is not a valid model for a start pattern (oneOrMore would be correct, I guess)
- neither is text

I then tried this for test.rng:

Code: Select all

<?xml version="1.0" encoding="utf-8"?>
<rng:grammar xmlns:rng="http://relaxng.org/ns/structure/1.0">
<rng:include href="http://www.w3.org/1999/xhtml/xhtml.rng">
<rng:start>
<rng:choice>
<rng:text/>
<rng:ref name="Block.class"/>
<rng:ref name="Inline.class"/>
</rng:choice>
</rng:start>
</rng:include>
</rng:grammar>
But this schema can't be validated; Jing complains:
found element matching the prohibited path start//text in the simplified XML form of the schema (see section 7.1 of the RELAX NG specification)


Text at this point is really a problem: the only solution would be to treat it as part of the "foo" grammar, which should allow text at the points where XHTML occurs.

In conclusion, I would say: the appearance of text can't make NVDL switch validation model, because text is not "namespaced" and should always be considered part of the parent element's model in the validation process.

Dave Pawson explains a similar case, validating the "Atom" XML grammar, in his tutorial: http://www.dpawson.co.uk/nvdl/validate.html.
"Atom chose to allow anything in the XHTML namespace within a div element".

2) XHTML schema
This is a more general question, but I find it hard to find an XHTML schema for a given version of the language. I currently work with XHTML 1.1.

- The Relax NG schema provided with oXygen ([oxygen]/frameworks/xhtml/relaxng) seems to be for XHTML 1.0, as explained at http://www.thaiopensource.com/relaxng/xhtml. This schema is modularized, and I've read that XHTML 1.1 is a modularized version of XHTML 1.0 (with a few differences?).

- Looking at [oxygen]/frameworks/xhtml11, there is no Relax NG version of the schema.

- Do you know where I can find an XHTML 1.1 RNG schema?

- I thought about converting it from XSD with Trang, but:
  1) Trang doesn't take XSD as input (http://www.thaiopensource.com/relaxng/trang.html). Do you know any other reliable tool for such a conversion?
  2) Which XSD should I take as input:
     - the one in http://www.w3.org/TR/2004/WD-xhtml-modu ... zation.zip (xhtml11.xsd states v1.7)
     - the one in http://www.w3.org/TR/xhtml11/xhtml11.zip (xhtml11.xsd says v1.3)

It's sometimes hard to find one's way through the jungle of schemas and versions.

Any help, advice, or related experience would be appreciated.

Best Regards,
Matthieu.

Re: unvalid NVDL - dtd schema type

Posted: Tue Jul 26, 2011 9:23 am
by george
The NVDL script will never separate a text node from its parent element in order to validate the text with a different schema. In the end, the result of the NVDL validation is equivalent to splitting the document into a few documents and applying the corresponding validation actions to them. As these are XML documents, they will always have a root element; thus the start pattern should refer only to elements. Please remove the text pattern from there and everything should work.

Regarding an XHTML 1.1 Relax NG schema, the closest I was able to find was
http://lists.dsdl.org/dsdl-discuss/2010-03/0000.html
which refers to
http://www.thaiopensource.com/relaxng/xhtml/
and to some updates on that.
But please note that you can also use XML Schema from NVDL, so you may want to refer to the XSD instead.

Best Regards,
George

Re: unvalid NVDL - dtd schema type

Posted: Tue Jul 26, 2011 5:18 pm
by matthieu.ricaud
Hi George,

Thanks for confirming the text validation issue I pointed out. Your explanation is quite clear!
My NVDL works fine now.

I have written an RNG schema "foo.rng" for foo elements only and a Schematron schema foo-xhtml.sch for both foo and XHTML.

This is the new NVDL I made, with a few differences in the way modes are called.

test.nvdl

Code: Select all

<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" startMode="main">
<mode name="main">
<namespace ns="http://www.foo.com">
<validate schema="foo-xhtml.sch" useMode="attachALL"/>
<validate schema="foo.rng" useMode="attachFoo"/>
</namespace>
<namespace ns="http://www.w3.org/1999/xhtml">
<validate schema="xhtml-within-foo.rng" useMode="attachXHTML"/>
</namespace>
</mode>
<mode name="attachFoo">
<namespace ns="http://www.foo.com"><attach/></namespace>
<namespace ns="http://www.w3.org/1999/xhtml"><unwrap/></namespace>
</mode>
<mode name="attachXHTML">
<namespace ns="http://www.foo.com"><unwrap/></namespace>
<namespace ns="http://www.w3.org/1999/xhtml"><attach/></namespace>
</mode>
<mode name="attachALL">
<namespace ns="http://www.foo.com"><attach/></namespace>
<namespace ns="http://www.w3.org/1999/xhtml"><attach/></namespace>
</mode>
</rules>
Let me know if this way of calling modes seems OK.
Note that the Schematron validation doesn't work in oXygen; I wrote another topic about this on the oNVDL forum: http://www.oxygenxml.com/forum/topic6185.html

About the XHTML schema, thanks for the link, interesting discussion. I currently work with the EPUB format, which requires XHTML 1.1 validation.

I'm trying XSD validation instead of RNG, but I can't find a way to translate my XHTML driver into an XSD one:

xhtml-within-foo.rng

Code: Select all

<grammar  xmlns="http://relaxng.org/ns/structure/1.0">
<include href="file:///G:/3B2/DTD/W3C/xhtml1/rng/xhtml.rng">
<start>
<choice>
<!--<text/>-->
<ref name="Block.class"/>
<ref name="Inline.class"/>
</choice>
</start>
</include>
</grammar>
Because it does not seem possible to define any kind of "start" element (with an undetermined name) in XSD. I tried with a complexType:

Code: Select all

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://www.w3.org/1999/xhtml" blockDefault="#all">
<xs:import namespace="http://www.w3.org/1999/xhtml" schemaLocation="file:///G:/3B2/DTD/W3C/xhtml11/xsd/oxygen/xhtml11.xsd"/>
<!--<xs:include schemaLocation="file:///G:/3B2/DTD/W3C/xhtml11/xsd/oxygen/xhtml11.xsd"/>-->

<xs:complexType name="start">
<xs:choice>
<xs:element ref="InlStruct.class"/>
<xs:element ref="BlkStruct.class"/>
</xs:choice>
</xs:complexType>

<element ref="start"/>
</xs:schema>
But I can't validate this schema...

I tried to convert the RNG to XSD with Trang, but it only imports the XHTML 1.1 definition into the "driver" schema (all the xhtml-...rng modules are converted to XSD too).

I think I will finally accept only some <div> elements in the foo grammar; that will be much easier!
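
A sketch of what the XHTML driver could look like in that case, assuming the same local xhtml.rng as in xhtml-within-foo.rng and the "div" pattern quoted earlier in this thread:

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <include href="file:///G:/3B2/DTD/W3C/xhtml1/rng/xhtml.rng">
    <!-- Override the included start pattern: only a div may be the root of an XHTML section. -->
    <start>
      <ref name="div"/>
    </start>
  </include>
</grammar>
```

With such a driver, any XHTML section whose root is not a div would be reported as invalid, so inline XHTML placed directly inside foo elements would have to be wrapped in a div.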

Best Regards,

Matthieu.

Re: unvalid NVDL - dtd schema type

Posted: Thu Jul 28, 2011 4:19 pm
by george
Hi Matthieu,

An NVDL script works much like XSLT, except that it is applied to element and attribute sections instead of elements and attributes. When you specify useMode="someMode" on an action, the subsections are processed with the rules from that mode, exactly as happens when you write <xsl:apply-templates mode="someMode"/>.
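
For readers who know XSLT, the analogy can be sketched with a small stylesheet. The mode name "inner" is purely illustrative, and the comparison is only approximate, since NVDL rules match on namespaces rather than on patterns:

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Comparable to a rule in the start mode of an NVDL script. -->
  <xsl:template match="/*">
    <!-- Comparable to useMode="inner": the children are handled by the rules of another mode. -->
    <xsl:apply-templates mode="inner"/>
  </xsl:template>

  <!-- Comparable to the rules declared inside <mode name="inner">. -->
  <xsl:template match="*" mode="inner">
    <xsl:apply-templates mode="inner"/>
  </xsl:template>
</xsl:stylesheet>
```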

Back to your script, the XHTML validation that you put inside the main mode

Code: Select all


<namespace ns="http://www.w3.org/1999/xhtml">
<validate schema="xhtml-within-foo.rng" useMode="attachXHTML"/>
</namespace>
will never be triggered, unless the document starts with an XHTML section.

For the Schematron issue, please see my reply in the topic you refer to.

XML Schema does not specify start elements as Relax NG does. Any global element can be a start element. Thus, all you need to do is refer to the XHTML 1.1 XML Schema and, if needed, add an additional Schematron schema to check specifically that the root element is one that you allow.

Code: Select all


<rules xmlns="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0" startMode="foo">
<mode name="foo">
<namespace ns="http://www.foo.com">
<allow useMode="attachFoo"/>
<unwrap useMode="xhtml"/>
</namespace>
</mode>
<mode name="attachFoo">
<namespace ns="http://www.foo.com"><attach/></namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
<mode name="xhtml">
<namespace ns="http://www.w3.org/1999/xhtml">
<validate schema="http://www.w3.org/MarkUp/SCHEMA/xhtml11.xsd" useMode="attachXHTML"/>
</namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
<mode name="attachXHTML">
<namespace ns="http://www.w3.org/1999/xhtml"><attach/></namespace>
<anyNamespace><unwrap/></anyNamespace>
</mode>
</rules>
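
The additional Schematron check on the section root could be sketched as follows. This assumes the pre-ISO Schematron namespace supported by the oNVDL/Jing implementation; the list of allowed elements is only an example to adapt:

```xml
<schema xmlns="http://www.ascc.net/xml/schematron">
  <ns prefix="h" uri="http://www.w3.org/1999/xhtml"/>
  <pattern name="Allowed roots for XHTML sections">
    <!-- /* is the root element of each section handed to the validator. -->
    <rule context="/*">
      <!-- Hypothetical list of allowed roots; adjust to your needs. -->
      <assert test="self::h:p or self::h:div or self::h:em or self::h:strong">
        An XHTML section must start with p, div, em or strong.
      </assert>
    </rule>
  </pattern>
</schema>
```

It could then be referenced with a second validate action next to the XSD one in the "xhtml" mode, for example <validate schema="xhtml-roots.sch" useMode="attachXHTML"/> (the file name is hypothetical).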
Best Regards,
George

Re: unvalid NVDL - dtd schema type

Posted: Fri Jul 29, 2011 1:45 pm
by matthieu.ricaud
Hi George,

I finally found a Relax NG implementation of XHTML 1.1 inside the epubcheck jar file (http://code.google.com/p/epubcheck). But thanks for the tip about using a Schematron rule to restrict the authorized XHTML "root".

I enjoy XSLT programming, but I'm still confused by some aspects of the NVDL model; I have to get used to it. Thanks for your advice and sample.

Regards,

Matthieu