RE: [xsl] Schema Optimsations Was XSL: For-Each Efficient or Not?
Subject: RE: [xsl] Schema Optimsations Was XSL: For-Each Efficient or Not?
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Tue, 2 Jul 2002 09:34:24 +0100
You are right that there is a law of diminishing returns with
optimization: there's a rule of thumb with relational optimizers that
says if you've found an execution strategy whose expected run time is
less than the time you have already spent optimizing, the time has come
to execute the query. Applying that strategy of course requires a fairly
good cost model.
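
To make the rule of thumb concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular relational optimizer works: `choose_plan`, the candidate plan names, and the `estimate_cost` callback are all hypothetical, and the cost model is assumed to return an estimated run time in the same units as the clock.

```python
import time
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Plan = TypeVar("Plan")


def choose_plan(
    candidates: Iterable[Plan],
    estimate_cost: Callable[[Plan], float],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Plan, float]:
    """Return (plan, estimated_cost), stopping the search early.

    Hypothetical stopping rule: once the best plan found so far is
    expected to run in less time than has already been spent
    optimizing, stop looking and execute it.
    """
    start = clock()
    best: Optional[Plan] = None
    best_cost = float("inf")

    for plan in candidates:
        cost = estimate_cost(plan)  # assumed cost model: seconds
        if cost < best_cost:
            best, best_cost = plan, cost

        spent = clock() - start  # time spent optimizing so far
        if best_cost < spent:
            break  # further search is unlikely to pay for itself

    if best is None:
        raise ValueError("no candidate plans supplied")
    return best, best_cost
```

Note that the loop can stop before it has seen the cheapest candidate. That is the point of the rule, and it is also why the result is only as good as the cost estimates fed into it.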
Where stylesheets are going to be used thousands of times there
certainly are potentially worthwhile gains from compile-time
optimization, and it's true that schema knowledge can help this in
theory. In practice there are quite a few obstacles: the very dynamic
template rule mechanism used in XSLT means that there is rather limited
knowledge available about the flow of control in the stylesheet or the
types of the nodes being processed at any point. It turns out that even
binding the stylesheet to a schema (so that it can only process
documents that are valid against that schema) isn't enough, because of
complications like temporary trees and secondary input documents.
But actually, I think people sometimes underestimate what can be
achieved with run-time optimization (that is, decisions made about the
execution strategy at run-time rather than at compile time). An XSLT 2.0
processor is likely to have much better type information available at
run-time than at compile time, and this might turn out to be the key.
The classic exemplar of a schema-based optimization is to reduce the
search space for a path expression such as //item. If you can compile
the schema (not the stylesheet) to create an index that shows which
element types can contain which other element types, the XSLT processor
can quite reasonably use this index at run-time, without adding to the
stylesheet compilation cost.
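
As a rough sketch of that idea, the Python below evaluates //item using a containment index derived from the schema. Everything in it is an assumption made for illustration: the `CONTAINMENT` table stands in for a compiled schema, the element names are invented, and `can_reach` and `find_all` are hypothetical helpers, not part of any XSLT processor. It also assumes the document really is valid against that schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional, Set

# Hypothetical compiled schema: element name -> names allowed as children.
CONTAINMENT: Dict[str, Set[str]] = {
    "catalog": {"section", "metadata"},
    "section": {"section", "item"},
    "metadata": {"author", "date"},
    "item": {"name", "price"},
}


def can_reach(containment: Dict[str, Set[str]], target: str) -> Set[str]:
    """Element names that may have `target` somewhere below them."""
    reach: Set[str] = set()
    changed = True
    while changed:  # iterate to a fixed point over the containment graph
        changed = False
        for parent, children in containment.items():
            if parent not in reach and (target in children or children & reach):
                reach.add(parent)
                changed = True
    return reach


def find_all(
    root: ET.Element,
    target: str,
    containment: Dict[str, Set[str]],
    trace: Optional[List[str]] = None,
) -> List[ET.Element]:
    """Collect `target` elements in document order, skipping subtrees
    whose element type cannot contain `target` according to the index."""
    reach = can_reach(containment, target)
    hits: List[ET.Element] = []

    def walk(node: ET.Element) -> None:
        if trace is not None:
            trace.append(node.tag)  # record which nodes were visited
        if node.tag == target:
            hits.append(node)
        if node.tag in reach:  # otherwise prune the whole subtree
            for child in node:
                walk(child)

    walk(root)
    return hits
```

With this table, a search for `item` never descends into `metadata` or into `item` itself, because neither can contain an `item`. The index depends only on the schema, so it can be built once and reused by every stylesheet run against documents of that type.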
The bottom line is: wait and see!
> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On Behalf Of
> Kevin Jones
> Sent: 02 July 2002 00:20
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Schema Optimsations Was XSL: For-Each Efficient or Not?
> On a related subject I have been thinking about what might be possible
> with schema information available at stylesheet compile time as opposed
> to runtime. It's long been speculated that there are many optimisations
> available in that scenario, but I don't know of any processor that takes
> advantage of them today, which is probably a big hint about the
> practicality of such
>
> Given runtime schema information of the type proposed in XPath 2.0, it
> would appear to me that the performance cost of generating/using it may
> outweigh the benefit, potentially causing schema-aware XSLT 2.0
> processors to be slower than 1.0 ones.
>
> The best alternative I can think of is to compile stylesheets against a
> specific schema, so there may be many compiled forms for a single
> stylesheet. But even this case has problems: just because a document
> says it uses a schema is no help if it's not being validated, which I
> can't see being cheap with any schema language.
>
> I guess the question is, how do you write a schema-aware processor that
> is quicker than a schema-ignorant processor?
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list