[oXygen-user] How to include smaller files into a "master document" in Oxygen?

Jean-Luc Chevillard
Tue Dec 8 11:14:16 CST 2015


Hello Wendell,

thanks for your message.

I managed yesterday to use the method recommended by Adrian (to 
whom I also address my thanks).
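
A minimal sketch of what such a master file might look like, assuming the XInclude route. The file names and the <thesaurus> root element are hypothetical placeholders, not the actual ones:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- master.xml: hypothetical master document for the thesaurus -->
<thesaurus xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Each xi:include is replaced by the root element of the referenced file -->
  <xi:include href="chapter01.xml"/>
  <xi:include href="chapter02.xml"/>
  <xi:include href="chapter03.xml">
    <!-- Optional: content used instead if chapter03.xml cannot be loaded -->
    <xi:fallback><!-- chapter 3 not available --></xi:fallback>
  </xi:include>
</thesaurus>
```

In this arrangement the linkage between master and chapters lives in the master file itself, so backing it up together with the chapters should be sufficient.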

This involved, among other things, transforming the internal DTDs of 
the chapters
into an external DTD file,
and verifying that they were all identical
(which was in fact originally not the case, because I had been 
progressively enriching the structure
while editing the chapters one by one :-).
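
As an illustration only, a chapter that points at a shared external DTD could start like this. The names "thesaurus.dtd", <chapter>, <entry> and <headword> are assumptions made for the sketch, not the real vocabulary:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- chapter01.xml: no internal subset any more, every chapter references the same DTD file -->
<!DOCTYPE chapter SYSTEM "thesaurus.dtd">
<chapter>
  <entry>
    <headword>...</headword>
  </entry>
</chapter>
```

With a single DTD file, an enrichment of the structure is made once and applies to all chapters, so they cannot drift apart again.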

I also had to adapt my XSLT strategy,
because the original XSLT script, which worked fine with the chapters
(which are subtrees),
did not work with the global thesaurus tree:
the XPath expressions were too short ...
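
A hedged sketch of the kind of adjustment involved, using hypothetical element names (<entry>, <headword>) rather than the real ones: a path such as /chapter/entry selects nothing once the root element is that of the master document, whereas a depth-independent path works for both a single chapter and the assembled tree.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the document node, so the root element may be a chapter or the master -->
  <xsl:template match="/">
    <!-- //entry finds entries at any depth, unlike an absolute path such as /chapter/entry -->
    <xsl:for-each select="//entry">
      <xsl:sort select="headword"/>
      <xsl:value-of select="headword"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

This prints one headword per line in sorted order; the master document has to be processed with XInclude resolution enabled for the chapter entries to be visible to the stylesheet.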

But in the end it worked fine,
and remained lightning fast,
even when going through 6000 entries
(instead of three times 2000 entries)
[I have 3 chapters so far, and will have 10 chapters in the end]

I have also been wondering whether I should write a DTD for the master 
document
(but have so far not tried),
in the same way as I had to write a specific CSS for 
the master document yesterday evening.

My question concerning the place where the "information" is "stored"
was motivated by the fact that the series of videos
(for the "older", SGML-like, method)
showed buttons on which the virtual user was supposed to press,
in order to ENABLE the master file support
(and also because some menus indicated that the user
had to choose between several "possible master files"),
and I imagined that the result of that customizing would be stored
somewhere in a file hidden in the cryptic place called 
"%APPDATA%\Roaming\com.oxygenxml"
(or something of the sort .... I have not yet tried to SEE it ...).

I am relieved to realize that many things are now much easier than they 
were
for people who work with complex scripts such as Tamil
(my first experience with this matter goes back to the eighties,
to writing escape sequences for entering a user-defined "font"
into the memory of a dot-matrix printer, at the time of DOS 2.0).

We have come a long way :-)

Cheers

-- Jean-Luc (Paris-Pondicherry-Hamburg)


"https://univ-paris-diderot.academia.edu/JeanLucChevillard"

"https://plus.google.com/u/0/113653379205101980081/posts/p/pub"

"https://twitter.com/JLC1956"




On 08/12/2015 16:52, Wendell Piez wrote:
> Hi Jean-Luc,
>
> I have a couple of comments regarding the XML part of your question
> (as opposed to the oXygen part). Some of this merely repeats what
> Adrian has said in more detail.
>
> Both external parsed entities and XInclude are mechanisms designed, at
> least putatively (in the case of parsed entities) for your use case.
> However, they are very different. In understanding why we have two
> mechanisms and what accounts for their differences, it helps to know
> that external parsed entities predate the "Dawn of XML", being a part
> of the SGML standard (ISO 8879:1986) of which XML is a refinement.
> This means they (at least for your use case) are somewhat like cooking
> your dinner in the fireplace. It works, and some systems (perhaps even
> fine restaurants) still use and rely on the mechanism; but some
> professional chefs have never seen it done and wonder why you would do
> it this way when you have a stove.
>
> The main difference between the mechanisms is that entity resolution
> takes place at "parse time", i.e. when a processor (a parser) reads
> XML markup (tags and text) and then does something with it.
>
> XInclude resolution postpones the assembly of the composite document
> until a processing step after parsing. That is, the various components
> are parsed separately, yielding several "XML documents" (considered as
> tree structures in memory, no longer tags-and-text) which can then be
> assembled typically as one step in a transformation or processing
> pipeline that does other stuff as well (such as generate formatted
> output).
>
> This is an important and useful distinction -- XInclude, in other
> words, takes advantage of modern architectures in which parsing is
> generic -- we don't configure separate parsing logic for every
> document, but instead use a commodity parser that produces a
> standardized result, which we can then process using XPath, XSLT etc.
>
> In particular, since you are validating using a DTD, the XInclude
> mechanism may work better for you, as (among other things) it means
> you can continue to validate the fragments, as fragments, against the
> DTD even before or without XInclude resolution that assembles the
> composite document. (Using oXygen's "master document" feature you
> could perhaps work around this limitation by always validating the
> composite document even when working with a fragment; but XInclude is
> nevertheless more flexible.)
>
> You asked about where the information is stored regarding how the
> master document and the included documents are related.
>
> This will typically be in the master file itself ... using entities,
> in declarations (indeed these are formally part of the DTD but they
> are commonly managed in an internal DTD subset in the document prolog
> as Adrian showed -- or search for "XML external parsed entities" for
> more examples). Using XInclude, however, you simply embed XInclude
> elements in your master document that present references to other
> files in your system. (I.e.: these are a form of hypertext link, to be
> resolved in processing.)
>
> <xi:include href="file.xml"/>
>
> This means, pretty simply, "Include the XML here from 'file.xml'." It
> can get more complex - there can be a fallback if file.xml is ever
> missing, plus you can actually include fragments from within file.xml,
> etc. etc.
>
> Using either mechanism, however, you will probably find things are
> pretty straightforward once you know about the knobs and switches --
> until you hit the problem of internal cross-references within your
> assembly, which becomes somewhat more complex to manage as soon as
> your document is in several pieces. The bottom line here is that using
> XInclude, you can no longer rely on ID/IDREF attributes (declared in
> your DTD) to help ensure the integrity of your cross-references, since
> these attributes must now be able to point to elements across file
> boundaries. Of course, there are ways of dealing with this too....
>
> Cheers, Wendell
>
>
> On Mon, Dec 7, 2015 at 10:57 AM, Jean-Luc Chevillard wrote:
>> Hello Adrian,
>> (resent, with copy to the mailing list)
>>
>> thanks for your explanations.
>>
>> My XML files are "custom"
>> and I have defined a DTD and a CSS myself,
>> which I am progressively enriching,
>> as my understanding of the complexity of that particular thesaurus grows.
>>
>> In my creation of the Thesaurus
>> I alternate between "Text Mode" for certain (complex) tasks
>> and "Author Mode" for (easy) tasks.
>>
>> I had not realized until now that the RTL file size constraints are not that
>> stringent in Author Mode.
>> (BTW, Tamil is not a "RTL language" but rather a script with complex
>> rendering ...)
>>
>> Since I asked the question,
>> I have discovered the following video on the Oxygen Web site,
>> "http://oxygenxml.com/demo/Working_With_XML_Modules.html"
>>
>> Using your answer and the video,
>> I should (hopefully) have no difficulty trying to use that method.
>>
>> I have however a remaining question:
>> -- I expect the linkage between the "master document" and the "included
>> documents" to be described in some auxiliary file, which will tell Oxygen
>> where to look for the DTD, etc. when it does "content completion"
>> -- WHERE will those auxiliary files be located? (will they be in a hidden
>> location) How can I do backup for them?
>> -- WHAT HAPPENS when I upgrade to the next version of Oxygen? Will the
>> auxiliary files be destroyed?
>>
>> Thanks for clarifying those points
>>
>> Best wishes
>>
>> -- Jean-Luc Chevillard (Paris)
>>
>>
>> "https://univ-paris-diderot.academia.edu/JeanLucChevillard"
>>
>> "https://plus.google.com/u/0/113653379205101980081/posts/p/pub"
>>
>> "https://twitter.com/JLC1956"
>>
>>
>>
>> On 07/12/2015 15:32, Oxygen XML Editor Support (Adrian Buza) wrote:
>>> Hello,
>>>
>>> What type of XML files are these (DITA, DocBook, TEI or custom)?
>>>
>>> The "support for RTL languages" is problematic for large files in Text
>>> mode. But Author mode (Document > Edit Mode > Author) can handle RTL
>>> content a lot better, so that's a possible solution if you want to
>>> work with larger files.
>>> However, if you have a custom type of document that Oxygen doesn't
>>> support out-of-the-box, you'll have to create a custom CSS, so that
>>> the document is represented properly in Author mode:
>>>
>>> http://www.oxygenxml.com/doc/versions/17.1/ug-editor/index.html#concepts/dg-css-stylesheet.html
>>>
>>>
>>>> Can I create a MASTER document in Oxygen
>>>> in which the chapters are INCLUDED
>>>> (by mentioning their names)
>>>> for the sake of processing the sum of the chapters?
>>> Yes, you can use XInclude to bind all documents together in a single
>>> master file. This way you can transform the master that includes all
>>> other documents as if it's a single document.
>>>
>>> http://www.oxygenxml.com/doc/versions/17.1/ug-editor/index.html#topics/including-document-parts-with-XInclude.html
>>>
>>> The examples from this section are for DocBook, but XInclude is
>>> supported by Oxygen independent from the XML format.
>>> Check your XInclude options from Oxygen (Options > Preferences, XML >
>>> "XML Parser", "XInclude Options"), they should be enabled by default.
>>>
>>>> (I currently run XSLT transformations on separate chapters for making
>>>> indices. How can I do that for the sum of the chapters?)
>>> It's simpler to use the master file and run the transformation just
>>> once on it.
>>>
>>> Regards,
>>> Adrian
>>>
>>> Adrian Buza
>>> oXygen XML Editor and Author Support
>>>
>>> Tel: +1-650-352-1250 ext.2020
>>> Fax: +40-251-461482
>>> 
>>>
>>>
>>> On 04.12.2015 15:40, Jean-Luc Chevillard wrote:
>>>> Greetings!
>>>>
>>>> I am currently editing several XML files
>>>> which are the chapters of a Tamil thesaurus,
>>>> and each file is dangerously close to the size limit
>>>> connected with the script specificity.
>>>> ("support for RTL languages" has to be activated,
>>>> for proper display,
>>>> and I have already increased the default size
>>>> beyond which RTL language support is automatically deactivated)
>>>>
>>>> I would appreciate pointers
>>>> on the best methods
>>>> for dealing with the larger entity,
>>>> i.e. the sum of the chapters.
>>>>
>>>> Can I create a MASTER document in Oxygen
>>>> in which the chapters are INCLUDED
>>>> (by mentioning their names)
>>>> for the sake of processing the sum of the chapters?
>>>>
>>>> (I currently run XSLT transformations on separate chapters for making
>>>> indices. How can I do that for the sum of the chapters?)
>>>>
>>>> Thanks for any pointers to the appropriate section inside the Oxygen
>>>> documentation (provided that it is possible to do that in Oxygen ...).
>>>>
>>>> If the "RTL language" limit was not there, I would have used a single
>>>> file but that does not seem to be possible.
>>>>
>>>> -- Jean-Luc Chevillard (Paris-Pondicherry-Hamburg)
>>>>
>>>>
>>>> "https://univ-paris-diderot.academia.edu/JeanLucChevillard"
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> oXygen-user mailing list
>>>> 
>>>> https://www.oxygenxml.com/mailman/listinfo/oxygen-user
>>>>
>>>
>>>
>
>
