
Re: Standard API to XSL processors


Subject: Re: Standard API to XSL processors
From: "Oren Ben-Kiki" <oren@xxxxxxxxxxxxx>
Date: Tue, 12 Jan 1999 12:05:45 +0200

Tyler Baker <tyler@xxxxxxxxxxx> has been pushing for a DOM-based API to the
XSL processor:
>... you need some sort of random-access structure like the DOM to
>do anything useful with the XSL output.  Whenever a browser renders a
>document, it needs to be stored in memory one way or another.  From my
>understanding, that is one of the goals of the DOM.  The way I look at it
>from this discussion you could have:
>
>Source + Stylesheet -> Stream -> XML Parser -> DOM
>Source + Stylesheet -> DocumentHandler -> DOM
>Source + Stylesheet -> DOM


These transformations are only necessary for applications which need random
access to the result tree. DocumentHandler is sufficient for serial
applications such as writing the tree to a file, conversion to non-XML
languages (such as TeX, say) and so on. That said, there obviously are
applications which require random access, and I agree that the DOM interface
should be "the" random access interface to XML trees, just like
DocumentHandler should be "the" serial interface.
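
To make the "serial" case concrete, here is a minimal sketch of a consumer
that turns DocumentHandler events into TeX-like text without ever holding a
tree in memory. The TexWriter and SerialDemo classes and the element names
"em" and "p" are hypothetical, and a SAX 1 parser stands in for the XSL
processor as the source of events; nothing here is taken from an actual
processor.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;

// Hypothetical serial consumer: maps events to TeX-like text as they arrive.
// HandlerBase is the SAX 1 convenience implementation of DocumentHandler.
class TexWriter extends HandlerBase {
    final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String name, AttributeList atts) {
        if (name.equals("em")) {
            out.append("\\emph{");
        }
    }

    @Override
    public void endElement(String name) {
        if (name.equals("em")) {
            out.append("}");
        } else if (name.equals("p")) {
            out.append("\n\n");   // a blank line ends a TeX paragraph
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }
}

class SerialDemo {
    static String toTex(String xml) throws Exception {
        TexWriter handler = new TexWriter();
        // Assumption: a SAX 1 parser plays the role of the XSL processor here.
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toTex("<doc><p>Share and <em>enjoy</em>.</p></doc>"));
    }
}
```

Nothing in the handler ever looks backwards, which is why no DOM is needed
for this kind of application.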

So the last transformation - directly to DOM - is the problematic one. What
are its advantages? One issue was pointed out by Tyler:

>... unless either the
>namespaces spec changes, or else SAX changes (I am hoping for the former),
>you have this namespaces quagmire to deal with which makes DocumentHandler
>not exactly the best choice IMHO.


This will have to be fixed, one way or the other, since the namespaces
problem exists for parsed documents as well as for XSL generated ones. We
should be working on fixing this issue, not on workarounds such as providing
alternative APIs. IMVHO this isn't a valid reason for adding a DOM interface
to an XSL processor.

Another reason Tyler mentioned is performance. There are two separate issues
here. First, if I already have the XSL or XML tree as a DOM, and the XSL
processor does use DOM internally for these trees, then using a SAX interface
for the inputs unnecessarily demands that a second copy of these DOM trees be
built. The second issue is that DocumentHandler requires the XSL processor to
emit the output XML tree in document order. If a DOM tree were being built,
the processor could create it in arbitrary order, or even in parallel. I'm
not sure how serious an issue this is.

If this reason is deemed important enough, then we need two more standard
interfaces - DomToDocumentHandler and DocumentHandlerToDom. We could then
with a clear conscience require each XSL processor to provide all
combinations of input and output formats. This way, you provide the XSL
processor whatever is easiest for you to build, and you ask the processor
for whatever you need for further processing.

The XSL processor might be forced to use a conversion class in order to
satisfy this, but this is a price that would have to be paid anyway. On the
other hand, if what you have happens to fit what the XSL processor wants or
is able to give, you avoid the conversions.
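
As a rough illustration of what the two conversion classes might look like,
here is a minimal sketch against the SAX 1 DocumentHandler and the W3C DOM
bindings. Only the names DomToDocumentHandler and DocumentHandlerToDom come
from the proposal above; the method names, the ConversionDemo class and the
sample document are assumptions. It handles elements, attributes and text
only, and ignores namespaces entirely.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributeListImpl;

// Hypothetical converter: replays a DOM tree as serial events, in document order.
class DomToDocumentHandler {
    static void replay(Document doc, DocumentHandler out) throws SAXException {
        out.startDocument();
        walk(doc.getDocumentElement(), out);
        out.endDocument();
    }

    private static void walk(Node node, DocumentHandler out) throws SAXException {
        if (node.getNodeType() == Node.TEXT_NODE) {
            char[] text = node.getNodeValue().toCharArray();
            out.characters(text, 0, text.length);
        } else if (node.getNodeType() == Node.ELEMENT_NODE) {
            AttributeListImpl atts = new AttributeListImpl();
            NamedNodeMap map = node.getAttributes();
            for (int i = 0; i < map.getLength(); i++) {
                Node a = map.item(i);
                atts.addAttribute(a.getNodeName(), "CDATA", a.getNodeValue());
            }
            out.startElement(node.getNodeName(), atts);
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                walk(c, out);
            }
            out.endElement(node.getNodeName());
        }
        // Comments, processing instructions and other node types are skipped.
    }
}

// Hypothetical converter: builds a DOM tree from serial events.
class DocumentHandlerToDom extends HandlerBase {
    Document doc;
    private Node current;   // node that receives the next child

    @Override
    public void startDocument() throws SAXException {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new SAXException(e);
        }
        current = doc;
    }

    @Override
    public void startElement(String name, AttributeList atts) {
        Element e = doc.createElement(name);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getName(i), atts.getValue(i));
        }
        current.appendChild(e);
        current = e;
    }

    @Override
    public void endElement(String name) {
        current = current.getParentNode();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        current.appendChild(doc.createTextNode(new String(ch, start, length)));
    }
}

class ConversionDemo {
    static Document sample() throws Exception {
        Document d = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = d.createElement("doc");
        root.setAttribute("lang", "en");
        root.appendChild(d.createTextNode("Share and enjoy"));
        d.appendChild(root);
        return d;
    }

    // DOM -> events -> DOM: the "second copy" discussed above, made explicit.
    static Document roundTrip(Document in) throws SAXException {
        DocumentHandlerToDom builder = new DocumentHandlerToDom();
        DomToDocumentHandler.replay(in, builder);
        return builder.doc;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(sample()).getDocumentElement().getTagName());
    }
}
```

The round trip in ConversionDemo is exactly the extra copy one would like to
avoid when both sides already speak DOM; the converters only earn their keep
when producer and consumer disagree.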

So, _if_ XSL processors can make use of DOM trees internally, _and_ the
"standard" interfaces are expanded to add the two conversion processors,
_and_ there are freely available implementations of these interfaces, then I
agree with Tyler - the XSL processor interface should allow both forms of
input and output.

I think that the weakest premise here is the first one. First, XSL
processors written in one programming language (say, those built into
browsers) can't make efficient use of a DOM implemented in another. Second,
James has claimed that the DOM is not suitable to be the internal
representation for one random access application - the XSL processor itself.

My guess is that an XSL processor does not need "random access" to the
tree - it needs highly structured search capabilities (by mode, element
name, and so on) - which are beyond the scope of the DOM interface. Maybe
these can be added by external "indexing" data structures which would hold
references into DOM sub-trees?
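
One way to picture such an external index is the sketch below: a side table,
built in a single walk, that maps element names to references into the DOM
tree. It indexes by element name only (not by mode or anything richer), and
the ElementNameIndex and IndexDemo classes are hypothetical, not part of the
DOM or of any existing processor.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical side index: element name -> references to matching DOM elements.
// The DOM tree itself is left untouched; the index only points into it.
class ElementNameIndex {
    private final Map<String, List<Element>> byName = new HashMap<>();

    ElementNameIndex(Document doc) {
        add(doc.getDocumentElement());
    }

    // Pre-order walk, so each list ends up in document order.
    private void add(Element e) {
        byName.computeIfAbsent(e.getTagName(), k -> new ArrayList<>()).add(e);
        for (Node c = e.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                add((Element) c);
            }
        }
    }

    List<Element> lookup(String name) {
        return byName.getOrDefault(name, Collections.emptyList());
    }
}

class IndexDemo {
    static ElementNameIndex indexOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return new ElementNameIndex(doc);
    }

    public static void main(String[] args) throws Exception {
        ElementNameIndex index = indexOf("<doc><p>one</p><note><p>two</p></note></doc>");
        for (Element p : index.lookup("p")) {
            System.out.println(p.getTextContent());
        }
    }
}
```

The obvious cost is that the index goes stale as soon as the tree is
modified, so it suits a source tree that the processor treats as read-only.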

Could James - and anyone else who has been writing XSL processors or other
random-access XML applications - share their experience with using the DOM
as an internal representation?

Share & Enjoy,

    Oren Ben-Kiki



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


