Working with a large DocBook document split into parts

honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

Hello,

Our common scenario is to split documentation into smaller parts, which are pulled into the main document using xi:include.

Although it is not necessary for processing the main document, we use olink instead of link for linking, to avoid validation errors when the target is outside the current document. Sometimes repeated blocks of text are defined as entities in the main document and used across the whole documentation. Again, validating the whole document is not a problem, but when such global entities, which are invisible from a partial document, are used in that partial document, they cause validation errors whenever it is saved.

If the documentation is created as a single file, the nice interlinking feature can be used. Because a split document has no information about its main (parent) document, such linking across the individual split files is impossible.

I cannot foresee all the consequences, but for more convenient work I would be grateful for some way to tell Oxygen that the main document for the current partial document is XY.xml, so that validating this part would mean validating the whole main document, with only the errors that belong to the currently edited part being shown. This connection to the main document could be used in many tasks. For example, when selecting the target ID of a link, it would be easier for Oxygen to jump from a link in one partial document to another partial document, and so on.

This connection could be established, e.g., by inserting a special processing instruction with the relative path to the main document into each partial file. The insertion could be automated in Oxygen as follows:
1. Open the main file.
2. Use a new action 'Create project from file'. It would automatically add all referenced files to the project and set a special flag on the main document.
3. Run a new 'Project' action 'Insert connection info'. It would insert the PI into all referenced files that do not have the flag.
4. This 'single document project' need not be saved; it is for temporary use only.
5. In a similar way, all PIs could be discarded. If a file is linked to other main documents, only the PI that belongs to the current 'single document project' would be removed, by another new 'Project' action 'Remove connection info'.
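
To illustrate the idea, here is a minimal Python sketch of how a tool could read such a processing instruction from a partial file. The PI name `master-document` and its `href` pseudo-attribute are my own assumptions for this example; neither Oxygen nor DocBook defines them.

```python
import re
from typing import Optional
from xml.dom import minidom

# Hypothetical PI target, invented for this sketch only.
PI_TARGET = "master-document"

# A partial file carrying the proposed PI before its root element.
PART = """<?master-document href="../XY.xml"?>
<chapter id="second">
  <title>Second chapter</title>
</chapter>
"""


def find_master_href(xml_text: str) -> Optional[str]:
    """Return the href of the first matching top-level PI, or None."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == PI_TARGET):
            # PI data is free text such as 'href="../XY.xml"',
            # so the pseudo-attribute has to be parsed by hand.
            match = re.search(r'href="([^"]*)"', node.data)
            return match.group(1) if match else None
    return None


print(find_master_href(PART))  # ../XY.xml
```

A file without the PI simply yields `None`, so the editor could fall back to validating the file on its own.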

Jan
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

Hello,
honyk wrote:Although it is not necessary for processing the main document, we use olink instead of link for linking, to avoid validation errors when the target is outside the current document.

...

Because a split document has no information about its main (parent) document, such linking across the individual split files is impossible.

...

for more convenient work I would be grateful for some way to tell Oxygen that the main document for the current partial document is XY.xml, so that validating this part would mean validating the whole main document
Thank you for your request, but this is already implemented :) Just create a validation scenario that specifies XY.xml as the validation unit (the starting point of validation) and associate this scenario with all the XIncluded parts. The Validate Document action applies the validation scenario associated with the current file, if there is one.


Regards,
Sorin
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

honyk wrote:This connection to the main document could be used in many tasks. For example, when selecting the target ID of a link, it would be easier for Oxygen to jump from a link in one partial document to another partial document, and so on.
This can be done in Author mode in DocBook documents with one of the actions of the DocBook document type: Insert Link, Insert OLink, Insert ULink, Insert XRef. Clicking the symbol of the inserted link in Author mode jumps to the referenced section/element.


Regards,
Sorin
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

A custom validation scenario seems promising, thanks for the tip. I've tested it, but I am quite confused by the validation result:
1. I am distracted by errors that do not belong to my document. Sometimes they are useful, but an option to show only the related errors would be nice.
2. It doesn't tell me the truth. If my partial document contains a link whose target is in another document, I would expect validation to pass, but it complains that 'An element with the identifier "mylink" must appear in the document.'
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

sorin wrote:Clicking the symbol of the inserted link in Author mode jumps to the referenced section/element.
This works for links within a single document, but not for links between two separate documents that are both XIncluded into the main document. The linkend attribute contains only an ID, so I understand that the real path cannot be resolved properly without access to the content of the main (parent) document.

=============

When the linkend attribute is being filled in, a combo box with a list of available IDs is displayed. This list contains only the IDs of the current document, not the IDs of all child documents of the main document. It would be nice if this list were extended when a validation scenario is set. Sometimes that could be a time-consuming operation, but it could prevent mistakes.
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

If a validation scenario is set that starts the validation at a master XML file that includes other XML files, then all the IDs defined in the included files are displayed in the combo box as possible values of the linkend attribute. If no validation scenario is set, only the IDs of the current file are displayed. This is true for both DocBook 4 and DocBook 5 documents. If you get a different result, please post some sample XML files and the master file that includes them.
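
To make the difference concrete, here is a rough Python sketch that collects ID values first from a single fragment and then from the master file after its XIncludes are resolved. It only imitates the effect described above with the standard library; it is not how Oxygen builds the list, and the file names and in-memory `FILES` table are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# In-memory stand-ins for files on disk (illustrative names and content).
FILES = {
    "master.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<title>Test</title>"
        '<xi:include href="first.xml"/>'
        '<xi:include href="second.xml"/>'
        "</book>"
    ),
    "first.xml": '<chapter id="first"><title>First chapter</title></chapter>',
    "second.xml": '<chapter id="second"><title>Second chapter</title></chapter>',
}


def load(href, parse, encoding=None):
    """XInclude loader that reads from FILES instead of the file system."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]


def collect_ids(start, resolve_includes):
    """Return the sorted id values reachable from the given start file."""
    root = ET.fromstring(FILES[start])
    if resolve_includes:
        # Replace every xi:include child with the content it points to.
        ElementInclude.include(root, loader=load)
    return sorted(el.get("id") for el in root.iter() if el.get("id"))


# No scenario: only the IDs of the current file.
print(collect_ids("second.xml", resolve_includes=False))  # ['second']
# Scenario starting at the master file: IDs of all included files.
print(collect_ids("master.xml", resolve_includes=True))   # ['first', 'second']
```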


Regards,
Sorin
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

Great, I got it. It requires a valid master file. From my point of view it was valid, but Oxygen has a limitation, which I described in my previous post:
If my partial document contains a link whose target is in another document, I would expect validation to pass, but it complains that 'An element with the identifier "mylink" must appear in the document.'
It is also not possible to jump to the target location by clicking the link icon of such an 'invalid' link.
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

To clarify my post:

master file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<book>
<title>Test</title>
<xi:include href="first.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
<xi:include href="second.xml" xmlns:xi="http://www.w3.org/2001/XInclude"/>
</book>

first.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<chapter id="first">
<title>First chapter</title>
<para>Dummy text.</para>
</chapter>

second.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd">
<chapter id="second">
<title>Second chapter</title>
<para>Dummy text with <link linkend="first">link</link>.</para>
</chapter>

Although a validation scenario is set up with the master document, which includes both child documents, validation fails.
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

As I said, in a DocBook 4 document the default engine cannot validate a reference to another file, because DocBook 4 documents are validated against a DTD and each file is validated before the XInclude references are resolved. You have to set LIBXML instead of the default engine in the validation scenario.
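
The effect of the ordering can be sketched in a few lines of Python. This is not DTD validation (the standard library has no DTD validator); it only imitates the ID/IDREF check on the `linkend` attribute, once per file before inclusion and once on the assembled document. The in-memory `FILES` table is a simplified copy of the sample files, without the DOCTYPE declarations.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Simplified in-memory copies of master.xml, first.xml and second.xml.
FILES = {
    "master.xml": (
        '<book xmlns:xi="http://www.w3.org/2001/XInclude">'
        "<title>Test</title>"
        '<xi:include href="first.xml"/>'
        '<xi:include href="second.xml"/>'
        "</book>"
    ),
    "first.xml": (
        '<chapter id="first"><title>First chapter</title>'
        "<para>Dummy text.</para></chapter>"
    ),
    "second.xml": (
        '<chapter id="second"><title>Second chapter</title>'
        '<para>Dummy text with <link linkend="first">link</link>.</para>'
        "</chapter>"
    ),
}


def load(href, parse, encoding=None):
    """XInclude loader that reads from FILES instead of the file system."""
    if parse == "xml":
        return ET.fromstring(FILES[href])
    return FILES[href]


def dangling_linkends(root):
    """Return linkend values that match no id within the given tree."""
    ids = {el.get("id") for el in root.iter() if el.get("id")}
    return sorted(
        el.get("linkend")
        for el in root.iter()
        if el.get("linkend") and el.get("linkend") not in ids
    )


# Order 1: check every file on its own, before XInclude resolution.
before = {name: dangling_linkends(ET.fromstring(text))
          for name, text in FILES.items()}

# Order 2: resolve the XIncludes first, then check the assembled document.
master = ET.fromstring(FILES["master.xml"])
ElementInclude.include(master, loader=load)
after = dangling_linkends(master)

print(before["second.xml"])  # ['first']  (the target lives in first.xml)
print(after)                 # []
```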

When you edit the master file, the link rendered in Author mode for <link linkend="first">link</link> (in second.xml) jumps to the element <chapter id="first"> in first.xml.


Regards,
Sorin
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

If the LIBXML validation engine is used, no list of available IDs is offered for the linkend attribute of the link element in my DocBook document. Is this feature available only with the default engine?

By the way, jumping directly between referenced files, not only via the master document, would be a nice feature. If a validation scenario that points to a master document is applied to a document containing a link, then from my point of view there is complete information about what to open and where to scroll...
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

Yes, the list of possible values for linkend is available in auto-completion only if the default engine is used. LIBXML is an external engine and cannot be used for finding these values.

I added an enhancement request for jumping to the link target when there is a validation scenario.


Regards,
Sorin
honyk
Posts: 176
Joined: Wed Apr 29, 2009 4:55 pm

Post by honyk »

sorin wrote:Yes, the list of possible values for linkend is available in auto-completion only if the default engine is used. LIBXML is an external engine and cannot be used for finding these values.
In that case it would be nice to improve the default engine so that it can process XIncludes and offer functionality similar to LIBXML. If the default engine is Xerces, some XInclude support is already available, I think (http://xerces.apache.org/xerces2-j/faq-xinclude.html).

From my point of view, auto-completion and proper validation are two crucial features for regular work with large, split documentation.
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

honyk wrote:In that case it would be nice to improve the default engine so that it can process XIncludes and offer functionality similar to LIBXML. If the default engine is Xerces, some XInclude support is already available, I think (http://xerces.apache.org/xerces2-j/faq-xinclude.html).
Yes, the default engine is Xerces and it has XInclude support, but as I said, the ID references are resolved before the XInclude references. The problem was reported to the Xerces project, but changing Xerces to switch this order is quite challenging.

Add to this the fact that we are not committers of the Xerces project, so any Xerces modification must go through them. We tried with a different modification (not related to XInclude) and it was rejected, but we think it was a good modification and we included it in Oxygen.

So the implementation of this enhancement may take some time. We may come back to it in one of the future versions.


Regards,
Sorin
sorin_ristache
Posts: 4141
Joined: Fri Mar 28, 2003 2:12 pm

Post by sorin_ristache »

You can have both features in the same validation operation (validation with Xerces, so that IDs defined in other fragment files are listed in the content completion window, and validation with LIBXML for links across different fragment files): create two validation units in the same validation scenario, one using the default engine and the other using LIBXML, both starting the validation from master.xml. Associate the validation scenario with all files (master.xml, first.xml, second.xml) and ignore the Xerces errors about unresolved IDREF attributes that point to ID values defined in other fragment files (the case of <link linkend="first"/>).


Regards,
Sorin
Radu
Posts: 9059
Joined: Fri Jul 09, 2004 5:18 pm

Post by Radu »

Hi,

We released Oxygen 15.0 a couple of days ago.
In Oxygen 15.0 we added a feature called:

Post XInclude Processing DTD Validation
When XML documents using DTDs are assembled using XInclude, the default validation behavior is now to first assemble all documents in a master document and then validate the master document using the referenced DTD.
Basically this would mean that the Xerces DTD validation of the master XML document is done after all xi:include documents are resolved and all IDs are collected, thus eliminating the need to use LIBXML in this equation.

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com