[xsl] overlap nomenclature
Subject: [xsl] overlap nomenclature
From: Syd Bauman <Syd_Bauman@xxxxxxxxx>
Date: Tue, 21 Feb 2012 10:53:31 -0500
[XML, but not XSL, related response to "Processing milestoned XML leads to many preceding:: calls and horrible performance" thread.]

<soapBox>
The first communication between Charles Goldfarb and Yuri Rubinsky was the written message "tag != element!" (or perhaps "element != tag!" or some such). In that same vein ...

   milestone != empty element!

Matěj Cepl provides a concise description of the overlap problem in XML, but the solution he describes is NOT the use of milestone elements. He describes HORSE (hierarchical overlap representation using same element, but empty). Both methods make use of empty elements, but they are different.

The use of milestones is a simple overlap solution that only works when one of the involved "hierarchies" is already flat, and preferably tessellates the document (or at least the chapter, or whatever). The most common example of this sort of situation is the overlap between the hierarchy of chapters and paragraphs of a book, and the "hierarchy" (really a flat tessellation) of pages in said book.

Specifically, milestones are a case of using the single empty segment-boundary element method for (e.g.) referencing systems. Using milestones, one only explicitly marks *the change* from one position in a referencing system to another.

   Milestone elements differ, however, in assuming a simple
   single-level segmentation of the text: the values specified in any
   milestone element apply to all following text until the next
   element of the same type marked as belonging to the same edition.
   Hence no explicit end marker is needed and the ID / IDREF mechanism
   can be dispensed with.
   -- paraphrased from http://www.tei-c.org/Vault/ML/mlw18.txt footnote (3)

HORSE markup, on the other hand, is a case of typed segment-boundary delimiters. In HORSE, an empty element is used to explicitly mark each end (the beginning and the end) of a content object. Each of the empty elements indicates the other by co-indexing of special attributes (sID= and eID=).
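
For what it's worth, the difference can be sketched in a few lines of Python (not XSLT, and not anything from the thread). Everything here is an assumption made for illustration: the two sample documents, the helper names events(), pages_by_milestone(), and horse_segments(), and the simplification that there is only one edition, so the ed= attribute is ignored. The HORSE sample is adapted from the quoted example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: pages (flat, tessellating) overlap paragraphs.
MILESTONE_DOC = """<book>
<p><pb n="1"/>First paragraph starts on page one <pb n="2"/>and ends on page two.</p>
<p>Second paragraph, still on page two.</p>
</book>"""

# Hypothetical sample: each verse has an explicit start and end marker.
HORSE_DOC = """<book>
<chapter n="1"/>
<verse sID="ID1.1"/>text of verse 1.1<verse eID="ID1.1"/>
<verse sID="ID1.2"/>text of verse 1.2<verse eID="ID1.2"/>
</book>"""


def events(elem):
    """Yield ('elem', element) and ('text', string) pairs in document order."""
    yield "elem", elem
    if elem.text:
        yield "text", elem.text
    for child in elem:
        yield from events(child)
        if child.tail:
            yield "text", child.tail


def pages_by_milestone(root):
    """Milestone reading: a <pb/> value holds until the next <pb/>.

    Only the change is marked, so there is no end marker to look for.
    """
    pages = {}
    current = None
    for kind, item in events(root):
        if kind == "elem" and item.tag == "pb":
            current = item.get("n")
        elif kind == "text" and current is not None and item.strip():
            pages.setdefault(current, []).append(item.strip())
    return {n: " ".join(parts) for n, parts in pages.items()}


def horse_segments(root, tag):
    """HORSE reading: pair each sID= marker with the eID= of the same value."""
    open_segments = {}  # sID value -> text collected so far
    closed = {}
    for kind, item in events(root):
        if kind == "elem" and item.tag == tag:
            if item.get("sID") is not None:
                open_segments[item.get("sID")] = []
            elif item.get("eID") is not None:
                # Raises KeyError if an end marker has no matching start.
                parts = open_segments.pop(item.get("eID"))
                closed[item.get("eID")] = "".join(parts).strip()
        elif kind == "text":
            # Text belongs to every segment that is currently open.
            for parts in open_segments.values():
                parts.append(item)
    return closed
```

Calling pages_by_milestone(ET.fromstring(MILESTONE_DOC)) groups the text by page number, and horse_segments(ET.fromstring(HORSE_DOC), "verse") groups it by verse ID. Note that the milestone reader needs only one piece of state (the current page), whereas the HORSE reader has to track a set of open segments, which is what lets HORSE segments nest or overlap one another.
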
<aside>
The XML elements used as empty segment-boundary elements, whether single, paired, or typed, don't really have to be empty, they just have to be empty with respect to the content objects that overlap. E.g.,

   <pb ed="Caxton2" n="7"/>

could just as well be expressed as

   <pb>
     <ed>Caxton2</ed>
     <n>7</n>
   </pb>

so long as "Caxton2" and "7" are guaranteed not to be part of the overlapping data.

At one point I had planned to write up a brief treatise on this issue and submit it as a squib for _Markup Technologies: Theory and Practice_. Such is life.
</aside>
</soapBox>

> Maybe I should at least briefly explain it. In many areas
> (especially in documents processing) there is a problem with
> multiple possible hierarchies overlapping each other (e.g., in
> Bibles there are divisions of text which are going across verse and
> chapters boundaries and sometimes terminating in the middle of
> verse, many especially English Bibles marks Jesus' sayings with a
> special element, etc.). One of the ways how to overcome obvious
> problem that XML doesn't allow overlapping elements is to use
> milestones. So that the book of Bible is not divided like
>
> <book>
>   <chapter>
>     <verse>text</verse>
>     ...
>   </chapter>
>   ...
> </book>
>
> but just putting milestones in the text, i.e.:
>
> <book>
>   <chapter n="1" />
>   <verse sID="ID1.1" />
>   text of verse 1.1
>   <verse eID="ID1.1" />
>   ...
> </book>
>
> Is this clear?