[XSL-LIST Mailing List Archive Home] [By Thread] [By Date]

Re: [xsl] A compelling use case for employing binary trees in XML processing?


Subject: Re: [xsl] A compelling use case for employing binary trees in XML processing?
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Wed, 12 Dec 2012 06:39:27 -0800

Another good use of a balanced binary search tree -- with a little
additional work a BST can be used to produce in O(log(N)) the rank of
an item in a collection (regarded as sorted), or to access an item in
a collection (regarded as sorted) by index.

So if a balanced BST contains: 1, 3, 8, 12, 17, 23, 37  -- we can
determine that the rank of 17 is 5 and that the 4th item in this
sorted sequence is 12.

To do so, we need to store with each tree node, the count of nodes in
its left subtree.

Let's remember that accessing an item by its position in a sequence is
generally O(N) operation (some XSLT processors, like Saxon optimize
sequences by using vectors, but such optimizations can't be relied on
accross different XSLT processors). And of course, sorting is
O(N*log(N)). One can't have efficient binary search in a sequence,
because a sequence isn't an array, and expressions like $seq[$mid]
are O(N) -- not O(1).


Cheers,
Dimitre

On Sun, Dec 9, 2012 at 4:49 PM, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>> I understand that binary trees can be used to sort data. But XSLT already has <xsl:sort>, so using binary trees for
>> sorting isn't a particularly compelling use case.
>
> I guess you mean "binary search trees* -- other kinds of binary trees
> such as the heap data structure do not have the complete set of their
> items sorted.
>
> A binary *search* tree (and I mean *balanced* binary serch tree) has
> more useful properties than just that it one of its serializations
> presents the values in sorted order.
>
> A binary search tree  allows searches and insertions with efficiency
> of  O(log(N)). In contrast, an insertion in an array or in a sequence
> is O(N) -- the whole array needs to be copied (even the non-functional
> array) and the initial part of a sequence (the sub-sequence before the
> insertion point) of a persistent (functional sequence) needs also to
> be copied -- this is on average N / 2 -- so again we have O(N) here.
>
> The deletion in a balanced binary serch tree  is also O(log(N)).
>
> This makes the balanced binary search tree quite good even for
> implementing dictionaries and sets. A disadvantage for such binary
> search tree -based  implementations is that the value-space must have
> total ordering (while other implementations of a dictionary or set
> require only an equality (equivalence) operation).
>
>
> Cheers,
>
> Dimitre.
>
> On Sun, Dec 9, 2012 at 2:49 PM, Costello, Roger L. <costello@xxxxxxxxx> wrote:
>> Hi Folks,
>>
>> Dimitre has created a beautiful set of functions for building binary trees [1].
>>
>> I understand that binary trees can be used to sort data. But XSLT already has <xsl:sort>, so using binary trees for sorting isn't a particularly compelling use case.
>>
>> I am seeking a compelling use case for binary trees -- given XML document foo.xml and processing requirement P, a binary tree is ideally suited.
>>
>> Would you provide a simple, compelling use case please?
>>
>> /Roger
>>
>> [1] http://dnovatchev.wordpress.com/2012/01/09/the-binary-search-tree-data-structurehaving-fun-with-xpath-3-0/
>>



-- 
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
To avoid situations in which you might make mistakes may be the
biggest mistake of all
------------------------------------
Quality means doing it right when no one is looking.
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play
-------------------------------------
Facts do not cease to exist because they are ignored.
-------------------------------------
Typing monkeys will write all Shakespeare's works in 200yrs.Will they
write all patents, too? :)
-------------------------------------
I finally figured out the only reason to be alive is to enjoy it.


Current Thread
Keywords