Re: [xsl] Approach to transform 250GB xml data

Subject: Re: [xsl] Approach to transform 250GB xml data
From: "Hank Ratzesberger xml@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 10 Sep 2014 14:42:54 -0000

Well, it's unfortunate that this is a bit of a challenge, but it seems to
me there are several XML databases (several are open source or have free
trial periods) that would be able to store the elements in such a way that
you could query in "chunks" and, as a worst case, possibly concatenate the
results.

XQuery allows you to query sequences, and you can enclose those in some
other element when you concatenate.
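
To make the concatenation step concrete, here is a rough sketch in Python rather than XQuery. It assumes each query has already returned its chunk as a serialized XML string; the function name wrap_chunks and the wrapper element name "results" are made up for illustration and are not part of any database's API.

```python
import io


def wrap_chunks(chunks, out, wrapper="results"):
    """Write already-serialized XML chunks to `out` inside one enclosing element.

    `chunks` can be any iterable (for example a generator that runs one
    query per slice), so only one chunk needs to be in memory at a time.
    """
    out.write(f"<{wrapper}>\n")
    for chunk in chunks:
        out.write(chunk)
        out.write("\n")
    out.write(f"</{wrapper}>\n")


# Hypothetical usage with two tiny chunks standing in for query results.
buffer = io.StringIO()
wrap_chunks(["<item id='1'/>", "<item id='2'/>"], buffer)
```

For real data you would pass an open file instead of a StringIO, and the chunks would need to be well-formed fragments without their own XML declaration.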

Let us know how that goes. Happy to suggest some DBs, but off list,
because I don't want to show a preference and would likely miss some
that are completely capable of this.

Anyway, this gets me thinking about whether a completely different kind of
algorithm could be applied, such as, ahem, pointers to the start of
elements in the text file. An index to the key elements. I suppose that's
not too different from what the XML databases do.
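
A minimal sketch of that index idea, in Python, might look like the following. It is a naive byte scan, not a parser: it assumes the file is in an ASCII-compatible encoding such as UTF-8 and that the tag name never appears inside comments, CDATA sections, or attribute values. The function name build_offset_index and its parameters are hypothetical.

```python
import re


def build_offset_index(path, tag, chunk_size=1 << 20):
    """Return the byte offset of every '<tag' start tag in the file.

    The file is read in fixed-size blocks so memory use stays flat
    regardless of file size.
    """
    needle = b"<" + tag.encode("ascii")
    # Require whitespace, '>' or '/' after the name so that a search for
    # 'item' does not also match '<items>'.
    pattern = re.compile(re.escape(needle) + rb"[\s>/]")
    offsets = []
    with open(path, "rb") as f:
        base = 0   # file offset of buf[0]
        buf = b""
        while True:
            block = f.read(chunk_size)
            if not block:
                break
            buf += block
            for match in pattern.finditer(buf):
                offsets.append(base + match.start())
            # Keep one byte less than a full match so a start tag split
            # across two blocks is still found, and never counted twice.
            tail = buf[-len(needle):]
            base += len(buf) - len(tail)
            buf = tail
    return offsets
```

With the offsets in hand, a later pass can seek() straight to any element and hand just that slice to a parser or a transform, which is roughly the service an XML database provides with its own indexes.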

Good luck and let us know how you solved this.


On Wednesday, September 10, 2014, Imsieke, Gerrit, le-tex
gerrit.imsieke@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Out of curiosity: how do you intend to access/process the 250 GB once they
> are transformed?
> If it is a huge DB dump, maybe you can dump it in slices or, if it is an
> XML database with XSLT 2 capabilities, transform it in place.
> Gerrit

> [snip]

Hank Ratzesberger
