
Re: [xsl] Duplicate Elimination


Subject: Re: [xsl] Duplicate Elimination
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxx>
Date: Thu, 13 Mar 2014 11:28:12 +0000

This is why I don't like to post code, because all I really needed
from this thread was confirmation that duplicate elimination only
applied to node identity rather than content.
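
For anyone reading the archive, here is a minimal sketch of that distinction. It assumes a hypothetical Root element wrapping the A and B elements; the Root name and the sample dates are made up for illustration, and only the select expression comes from this thread.

```xml
<?xml version="1.0"?>
<!-- Assumed input (hypothetical names and values):
     <Root>
       <A><Date>2014-03-12</Date><Date>2014-03-13</Date></A>
       <B><Date>2014-03-13</Date><Date>2014-03-14</Date></B>
     </Root>
-->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/Root">
    <!-- The union operator drops a node only if it is the very same
         node, so both 2014-03-13 elements are kept: 4 nodes. -->
    <xsl:text>plain union: </xsl:text>
    <xsl:value-of select="count(A/Date | B/Date)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Keep a B/Date only when no A/Date has the same string value,
         i.e. A union (B diff A) compared by content: 3 nodes. -->
    <xsl:text>filtered union: </xsl:text>
    <xsl:value-of
        select="count(A/Date | B/Date[not(current()/A/Date = .)])"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With that input the first count is 4 and the second is 3. Note that the predicate only removes B dates that repeat an A date; repeats within A itself, or within B itself, would still come through.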

If my client asked me (he's the one who highlighted the problem) how I
fixed the duplicates and I said Muenchian Grouping he wouldn't know
WTF I was talking about.

Client's happy with the solution and its performance. Until he says
otherwise it's filed away under solved.

On Thu, Mar 13, 2014 at 11:17 AM, David Carlisle <davidc@xxxxxxxxx> wrote:
> On 13/03/2014 10:59, Ihe Onwuka wrote:
> e for-each syntax is more verbose.
>>
>>
>> Muenchian grouping is not something I burden my short term memory
>> with because I hardly ever use it and it's a phrase that is
>> meaningless beyond a very select cognoscenti. What I posted can
>> literally be described to a layman - add all the B/Dates that aren't
>> in the set of A/Dates. Additionally, it translates literally into a
>> set-theoretic
>
>
> Not at all. In an internet age, "Muenchian grouping" might be an
> arbitrary label, but if you (or rather I) call it that rather than call
> it "the grouping idiom using keys that used to be needed in XSLT1"
> then you have a high probability of getting an exact description and
> code samples if you type it into Google.
>
>
>
>>
>> A union (B diff A)
>>
>> That's why I prefer it.
>
>
> But as I say if you have more than a dozen or so elements in your list
> you are likely to be able to observe the time difference.
>
>>
>>>
>>> this will (unless you have a very aggressively optimising XSLT
>>> engine) be quadratic in performance as the full A list is going to
>>> be searched for every B.
>>>
>>
>> good point. If and when the volumes warrant performance tuning I'll
>> know where to start.
>>
>>> Also of course using text() rather than
>>>
>>> <xsl:apply-templates select="A/Date | B/Date[not(current()/A/Date =
>>> .)]"/>
>>>
>>> means the code is very fragile and will break if comments split up
>>> the text nodes.
>>>
>>
>> I have been doing too much XQuery recently,
>>
>
> The same advice not to use text() applies equally to XQuery.
>
>
>
>> but is . robust against changes to the content model?
>
>
> Well, you were already using it for sorting, and if the content model
> changes then any of this could break. If Date became
> <Date><year>2014</year><month>03</month><day>13</day></Date>
> then . would keep working but text() would break.
>
>
> David

