Re: [xsl] A proposal:xsl:result-document asynchronous attribute


Subject: Re: [xsl] A proposal:xsl:result-document asynchronous attribute
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 11 Mar 2003 10:33:19 +0000

Hi Francis and Kurt,

Kurt wrote:
> I'd second Francis's note on the idempotency issue. Consider the
> canonical HTTP GET based web service - the stock market ticker. Such
> a service will be returning a different result set at any given
> moment in time, to an extent that you could effectively argue that
> time itself becomes a parameter in any such service, regardless of
> the specific implementation. My experience with web services is that
> most meaningful web service results vary with time.

As designed, the document() function will only GET such a document
once, so the result of calling the document() function with a
particular URL is always the same. Of course if you run the
transformation several times you will get different answers each time,
but within a transformation, there aren't side-effects.
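
To make that concrete, here is a minimal sketch in XSLT 1.0. The ticker
URL is hypothetical and only there for illustration. The point is that
XSLT 1.0 requires two document() calls with the same URI to return the
same node, so the two copies below cannot differ within a single run,
however quickly the service's data changes.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical GET-based service; not a real endpoint -->
  <xsl:variable name="ticker"
                select="'http://example.com/ticker?symbol=XYZ'" />

  <xsl:template match="/">
    <quotes>
      <!-- Both calls resolve to the same document node -->
      <first><xsl:copy-of select="document($ticker)/*" /></first>
      <second><xsl:copy-of select="document($ticker)/*" /></second>
      <!-- XSLT 1.0 (section 12.1) requires this comparison to be true -->
      <same-node>
        <xsl:value-of select="generate-id(document($ticker)) =
                              generate-id(document($ticker))" />
      </same-node>
    </quotes>
  </xsl:template>

</xsl:stylesheet>
```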

Francis wrote:
> How could xsl:result-document *not* cause changes on the server?

and Kurt wrote:
> Likewise, <xsl:result-document> WILL cause changes on the server.
> It's difficult to determine from the specification what happens when
> an XML document is posted to an external location, but at the very
> least the assumption is that you are creating an XML document at
> that point, probably via HTTP PUT via WebDAV.

To be facetious, <xsl:result-document> doesn't necessarily create a
physical document -- it just creates a result tree, identifiable via
the URL specified in the href attribute, that the processor must make
accessible to whatever processes are managing the transformation. So
<xsl:result-document> wouldn't cause any changes on the server if you
ran the transformation and then immediately dropped all the result
trees from that transformation.

Conceptually, <xsl:result-document> only causes changes to the server
after the transformation is complete, when all the result trees are
serialised. But, as with the main result tree, you might have a
streaming transformation, in which case <xsl:result-document> may cause
changes on the server during the transformation process. However, it's
an error for you to try to access that document, so as far as the
transformation is concerned, such changes are invisible, and thus the
transformation is side-effect free. (Or as much as it can be: if the
document were accessible via an alternative URL, I don't think that
there'd be anything the processor could do to stop you accessing it.
However, I think it's fairly clear that it's unwise to rely on this
behaviour.)

By contrast, POST is explicitly about making changes immediately --
that's its whole purpose. It would be impossible to wait to do all the
POSTing until the end of the transformation, since that way you
wouldn't get the results of the POSTs during the transformation. The
situations are very different.

Kurt wrote:
> The WSDL architecture does not in fact make any distinctions with
> regard to whether an HTTP GET is being used to query as opposed to
> being used to update. It is quite permissible (albeit again not
> necessarily "good practice") to have a web service invocation of the
> form:
> 
> http://www.myservice.com/updateValue?newValue=foo
> 
> The argument that the document() function should be the primary
> interface for such web services then contradicts the fact that you
> are changing state on the server; if this holds true for document(),
> then it should be just as permissible for <xsl:result-document>.

I think that it's reasonable for us to allow bad practice (using
GETs for unsafe and non-idempotent requests) in order to support good
practice (using GETs for safe and idempotent requests). I don't think
that it's reasonable for us to support bad practice (using POSTs for
safe and idempotent requests).

Francis wrote:
> I sometimes think that language designs can become fetishistic about
> things like idempotency (in the technical sense - a fetish being
> something originally associated with an aim - good language design
> in this case - which ends up becoming a non-functional [no pun
> intended] substitute for the original aim). Useful languages end up
> having to deal with things that change state. Even a "purely"
> functional language like Haskell has monads. I think at the very
> least it would be useful to report a success or error on executing
> the GET, and if you concede even that then idempotency has gone for
> this function.

and Kurt wrote:
> One of the major flaws in the XSLT 1.0 spec was that there were a
> great number of features that became incorporated into the spec
> that were intended to prevent people from doing "dangerous" things -
> the creation of XML fragments, for instance, rather than allowing
> the creation of intermediate node-sets. The fact that most
> implementations built workarounds for these limitations indicates to
> me that far from being dangerous functionality, the attempt to
> protect programmers from their own stupidity was itself pretty
> misguided.

I disagree, but I suspect that's because I'm not facing the real
challenges of using XSLT with SOAP messaging day-in day-out whereas I
do deal with questions from newcomers to XSLT (often confused having
come from a procedural programming background) day-in day-out.

Basically, I don't think that it's XSLT's job to perform any actions
aside from transformations. POSTing is an action that should be taken
by the surrounding application, with the results passed into the
transformation.
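
As a rough sketch of that division of labour: the application does the
POST itself, then hands the response to the stylesheet as a parameter.
The parameter name "response" is made up for illustration, and how you
pass a node-set in as a parameter depends on the API of the processor
you're using.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: set by the calling application after it
       has made the POST. Defaults to an empty node-set. -->
  <xsl:param name="response" select="/.." />

  <xsl:template match="/">
    <!-- $response is fixed for the whole transformation, so the value
         tested and the value copied are the same -->
    <xsl:if test="$response">
      <result>
        <xsl:copy-of select="$response" />
      </result>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet stays a pure function of its inputs; the ordering and
retrying of requests is left to the application.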

Francis wrote:
>>We *could* manage the multiple-evaluation problem in XSLT in the
>>same way we do for GET by saying that the result of two POST
>>requests with the same URL and deep-equal message bodies must be
>>identical. This would force implementations to cache and reuse the
>>results of each POST. I think that the ordering problem would be
>>harder to manage, and that it's likely to lead to subtle bugs due to
>>different processors following different evaluation orders.
>>
> Or just say that this function is not idempotent. Which, in reality, it 
> isn't - you might get a time-out one time and success the next.

So you would be happy with the situation where, given the variable
declaration:

  <xsl:variable name="foo" select="post($myURI, $myMessage)" />

You could have something like:

  <xsl:if test="$foo">
    <result>
      <xsl:copy-of select="$foo" />
    </result>
  </xsl:if>

and have the $foo that was tested be different from the $foo that was
copied?

Not to mention the fact that having such a post() function return
different results each time would mean that processors couldn't
perform the optimisations they could otherwise.

Francis wrote:
>>There are applications that use POST in safe, idempotent ways, as a
>>GET-with-complex-arguments. However, SOAP 1.2 explicitly discourages
>>that practice, and I think that it would be a bad idea to base XSLT
>>functionality on bad practice. SOAP 1.2 encourages, instead, the use
>>of a GET request resulting in a SOAP message, and this is already
>>supported with the document() function in XSLT.
>>
> Except that assumes that all useful idempotent web services are
> restricted to - and will in fact be implemented - using "flat"
> non-XML queries that do not require any kind of nested structure. I
> have never seen anyone make a convincing attempt to justify this
> assumption.

I have been pretty much convinced by Paul Prescod's arguments that
this is how idempotent web services *should* be designed. It seems
that the designers of SOAP 1.2 agree. Reality may be very different,
but I'd be very concerned about adding functionality to XSLT whose
only purpose was to support bad practice, no matter how common this
bad practice might be.

> Here is another example, albeit one that I'm sure will raise more
> than a few hackles:
> 
> <xsl:apply-templates select="$myContext/{$anXPathExpression}"/>
> 
> This structure is illegal in both XSLT 1.0 and 2.0, no doubt because
> the cost of multiple evaluations of XPath context add overhead to
> the work involved in the parser. However, such an expression has a
> lot of potential usage, for instance, designing an XSLT processing
> template in which the details of a given XML "record" are not known
> until the time of evaluation, utilizing an external configuration
> file to determine what makes up the requisite records, identity
> attributes, and so forth. The fact that such a feature does exist in
> the Saxon parser (and I believe in EXSLT, though obviously I may be
> wrong here) indicates that it has utility that may outweigh its
> "potentially" disruptive effects.

I am in absolute agreement that dynamic evaluation of strings as XPath
expressions is incredibly useful. It is something that I have argued
for in the past. The dyn:evaluate() function defined in EXSLT is
currently supported in Xalan-J, 4XSLT and libxslt.
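
For processors that support it, Kurt's example could be approximated
along these lines. This is only a sketch: the default values given to
$myContext and $anXPathExpression are invented for illustration, and
the stylesheet gives up with a message if dyn:evaluate() isn't
available.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:dyn="http://exslt.org/dynamic"
                exclude-result-prefixes="dyn">

  <!-- Hypothetical defaults; in practice these might come from an
       external configuration file -->
  <xsl:param name="anXPathExpression" select="'record[@id]'" />
  <xsl:variable name="myContext" select="/*" />

  <xsl:template match="/">
    <xsl:choose>
      <xsl:when test="function-available('dyn:evaluate')">
        <!-- for-each sets the context node, so the string is
             evaluated relative to each node in $myContext -->
        <xsl:for-each select="$myContext">
          <xsl:apply-templates
              select="dyn:evaluate($anXPathExpression)" />
        </xsl:for-each>
      </xsl:when>
      <xsl:otherwise>
        <xsl:message terminate="yes">
          <xsl:text>dyn:evaluate() is not supported here</xsl:text>
        </xsl:message>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With no other templates defined, the selected nodes are processed by
the built-in template rules.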

> In a related vein, there has been an implicit assumption that the
> href value in the <xsl:result-document> contains either an http: or
> file: protocol, but I'm not necessarily sure that this is a valid
> assumption. Suppose that you had the following construct:
> 
> <xsl:result-document href="mailto:{url}?subject={subject}">
>         <html>
>                 <body>
>                         <xsl:copy-of select="body"/>
>                 </body>
>         </html>
> </xsl:result-document>
> 
> Is this construct invalid? It's possible it may be unsupported, of
> course, but this is just as true of an http: protocol message.

There is nothing in the XSLT WD that says that the URI must use the
file or http protocol. The spec says:

  "There may be implementation-defined restrictions on the form of
   absolute URI that may be used, but the implementation is not
   required to enforce any restrictions. Any legal relative URI must be
   accepted."

As I pointed out above, the URI is really just an identifier that the
implementation uses to label the result tree. The application
controlling the transformation can do what it likes with the result
trees, including posting them (via HTTP or email).

If the processor is in charge of serialising the result tree, the spec
says:

  "The location to which result trees are serialized (whether in
   filestore or elsewhere) is implementation-defined (which in practice
   may mean that it is controlled using an implementation-defined API).
   However, these locations must satisfy the constraint that when two
   result trees are both created (implicitly or explicitly) using
   relative URIs in the href attribute of the xsl:result-document
   instruction, then these relative URIs may be used to construct
   references from one tree to the other, and such references must
   remain valid when both result trees are serialized."

It's really up to the implementation what it does with the result. I
think it would be perfectly reasonable for an implementation that
recognises a mailto URI to email the tree as XML, or an implementation
that recognises an ftp URI to upload the tree as an XML document.
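
To illustrate the relative URI constraint quoted above, here is a small
sketch against the current WD syntax, which may of course change. The
book/chapter/title/para vocabulary is invented for the example. The
main result tree links to each chapter using the same relative URI that
labels the chapter's result tree, so the links should remain valid
wherever the implementation decides to serialise the trees.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumes a hypothetical input of book/chapter/(title, para*) -->
  <xsl:template match="/book">
    <html>
      <body>
        <ul>
          <xsl:for-each select="chapter">
            <!-- Relative URI that labels this chapter's result tree -->
            <xsl:variable name="uri"
                select="concat('chapter', position(), '.html')" />
            <!-- Reference from the main result tree to the chapter -->
            <li><a href="{$uri}"><xsl:value-of select="title" /></a></li>
            <!-- A separate result tree, identified by the same URI -->
            <xsl:result-document href="{$uri}">
              <html>
                <body>
                  <h1><xsl:value-of select="title" /></h1>
                  <xsl:copy-of select="para" />
                </body>
              </html>
            </xsl:result-document>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```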

By the way, I'm arguing about this here because these comments have
been posted to XSL-List. If you want to make a comment or suggestion
about the WD that will be read by the members of the XSL WG, you
should post it to public-qt-comments@xxxxxx, or let me know if you'd
like me to forward your message there.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list