Re: [xsl] Cannot write more than one result document to the same URI


Subject: Re: [xsl] Cannot write more than one result document to the same URI
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxx>
Date: Fri, 5 Apr 2013 09:18:13 -0400

Dan,

Your answer to your own question also points to another solution
to your problem: generate your results in a temporary tree, then
serialize that, taking care not to serialize the same results more
than once.

This is essentially Ken's solution (offered early in the thread)
except it determines whether results are being duplicated after
generating them (but before serializing them), not before. This may
make it a bit more robust in the face of complexity -- and it also
exposes the extra work you are asking the processor to do, which Ken's
heuristic approach avoids. (So I'd probably prefer Ken's way at least
until things got really complicated.)
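
A minimal sketch of the temporary-tree approach, assuming XSLT 2.0 and
the <a>/<b> input from Ken's example quoted below. The <out> wrapper
element, the "collect" mode, and the "out-{@id}.xml" naming scheme are
hypothetical, chosen only for illustration; they are not taken from
Dan's stylesheet.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Pass 1: build every candidate result in a temporary tree.
         Each result is wrapped in a hypothetical <out> element that
         records the URI it is intended for. -->
    <xsl:variable name="results">
      <xsl:apply-templates select="//b" mode="collect"/>
    </xsl:variable>

    <!-- Pass 2: group by target URI so that each URI is written
         exactly once, however many results were generated for it. -->
    <xsl:for-each-group select="$results/out" group-by="@href">
      <xsl:result-document href="{current-grouping-key()}">
        <!-- Keep only the first result generated for this URI. -->
        <xsl:copy-of select="current-group()[1]/node()"/>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Assumed naming scheme: two <b> elements with the same id
       would collide on the same URI. -->
  <xsl:template match="b" mode="collect">
    <out href="out-{@id}.xml">
      <xsl:copy-of select="."/>
    </out>
  </xsl:template>

</xsl:stylesheet>
```

Note that every duplicate is still built in pass 1 and then thrown
away in pass 2, which is the extra work mentioned above.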

Cheers, Wendell

On Thu, Apr 4, 2013 at 11:22 PM, Dan Vint <dvint@xxxxxxxxx> wrote:
> Thanks, I can see that argument. But now you open a can of worms for me. So
> say your template for b just writes output to the standard result tree (not
> separate files). Why doesn't the same argument for parallelism apply there?
>
> Probably answering my own question here: I'm guessing the difference is
> writing to a file more or less serially vs. writing to a tree. The processor
> can write to multiple locations in the tree and keep track of where multiple
> b outputs should be created, but writing to a single file from multiple
> sources is not as well supported (unless you're willing to live with jumbled
> results in the file).
>
> Thanks again.
>
> ..dan
>
>
> At 07:56 PM 4/4/2013, you wrote:
>>
>> At 2013-04-04 19:37 -0700, Dan Vint wrote:
>>>
>>> I can live with the rule, just would like to understand the logic.
>>
>>
>> Consider the following scenario.  An XML document has two elements <b>:
>>
>>   <a>
>>     <b id="1">...</b>
>>     <b id="2">...</b>
>>   </a>
>>
>> An XSLT stylesheet uses the built-in template rule for <a> and has a
>> template rule for <b>:
>>
>>    <xsl:template match="b">
>>      <xsl:result-document href="output.xml">
>>        <xsl:copy-of select="."/>
>>      </xsl:result-document>
>>    </xsl:template>
>>
>> If the specification allowed this, then without considering the
>> opportunities of parallelism, one might come to the conclusion that the file
>> "output.xml" would always contain:
>>
>>     <b id="1">...</b><b id="2">...</b>
>>
>> The problem is that the specification does not require the XSLT processor
>> to complete the processing of the first <b> before starting or even ending
>> the processing of the second <b>.  Sure, a single-process implementation "X"
>> likely would.  But a parallelized (is that a word?) implementation "Y"
>> running on multiple CPUs could very well fully process the second <b> before
>> the first <b> if it chose to do so.  Its only obligation is to arrange the
>> resulting tree with the result of processing the first <b> before the result
>> of processing the second <b>.  This obligation ensures that the result of
>> processing by "X" is identical to the result of processing by "Y".  But
>> there is no obligation on what the processor does to get to that result.
>>
>> When using <xsl:result-document> the processor is not building the result
>> tree.  It is creating a completely separate result.  If the instruction
>> required "re-opening" of the file for append, processor "X" likely would
>> produce the expected result, but processor "Y" in the situation above would
>> produce an unexpected result.  Two processors would produce two results.
>>
>> And this is also why one cannot assert that the writing to the file is
>> even finished before the next attempt to write to the file starts.  The file
>> handle could very well still be left open by one parallel process when the
>> other is ready to open it for itself.  So it can't be used even if the file
>> is opened for write and not for append.
>>
>> Note that some of my students have come to class thinking that you have to
>> fully complete an <xsl:result-document> before starting another one to
>> another URI, but I tell them that is not the case.  You can nest
>> <xsl:result-document> instructions to different URI target locations, and
>> the nested <xsl:result-document> will complete the nested file and resume
>> the "outer" file output when done, without having to close and re-open the
>> outer file.  This has been a very handy feature when fragmenting files.  And
>> this would be another reason not to allow the same URI to be used.
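>>
>> A minimal sketch of such nesting, assuming XSLT 2.0 and the <a>/<b>
>> input shown earlier. The file names "index.xml" and "part-{@id}.xml"
>> and the <index> and <part> elements are hypothetical, and the sketch
>> assumes every <b> has a distinct id so that no URI is written twice:
>>
>> ```xml
>> <xsl:stylesheet version="2.0"
>>                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>>
>>   <xsl:template match="a">
>>     <!-- Outer result: an index listing the fragments. -->
>>     <xsl:result-document href="index.xml">
>>       <index>
>>         <xsl:for-each select="b">
>>           <!-- Written to the outer file. -->
>>           <part href="part-{@id}.xml"/>
>>           <!-- Nested instruction: writes a separate file to a
>>                different URI; the outer result then resumes. -->
>>           <xsl:result-document href="part-{@id}.xml">
>>             <xsl:copy-of select="."/>
>>           </xsl:result-document>
>>         </xsl:for-each>
>>       </index>
>>     </xsl:result-document>
>>   </xsl:template>
>>
>> </xsl:stylesheet>
>> ```
>>
>> For the two-<b> input above, this should produce index.xml with two
>> <part> entries, plus part-1.xml and part-2.xml.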
>>
>>> I know the what now, even though I don't understand why the rule exists
>>> ;-)
>>
>>
>> I hope the explanation above has helped.
>>
>> . . . . . . . . . Ken
>>
>> --
>> Contact us for world-wide XML consulting and instructor-led training |
>> Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm |
>> Crane Softwrights Ltd.            http://www.CraneSoftwrights.com/s/ |
>> G. Ken Holman                   mailto:gkholman@xxxxxxxxxxxxxxxxxxxx |
>> Google+ profile: https://plus.google.com/116832879756988317389/about |
>> Legal business disclaimers:    http://www.CraneSoftwrights.com/legal |
>>
>
> ---------------------------------------------------------------------------
> Danny Vint
>
> Panoramic Photography
> http://www.dvint.com
>
> voice: 619-938-3610
>



-- 
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^

