RE: Searchable XML


Subject: RE: Searchable XML
From: Mark Birbeck <Mark.Birbeck@xxxxxxxxxxxxx>
Date: Wed, 16 Jun 1999 14:02:01 +0100

Ben Robb wrote:
> Problem is, though, our client wants this to be searchable... going
> back a year (that's 15k x 365 = 3.67 M if you bundle it into one XML
> file - a little large).
> 
> Any thoughts?

It depends on what form the searches are to take. Should the documents
simply be searched for words, or do you want to search the hierarchy? If
you want to search for words, and perhaps for 'a word in an element',
then all you need is Microsoft Index Server (I suggest this because you
said you use IIS 4.0 and ASP). It's quite easy to write a little ASP
script that exports all the XML elements as META tags in a dummy HTML
file, and then index that. Alternatively, write another XSL
transformation file like the one you have done to convert Word to XML,
but this time convert to HTML with loads of META tags, and store this
stub in the same directory. To search what is, say, the author element
in the XML file, you search meta_author in Index Server. (Check some of
the HTML source on the Fact Finder pages at http://www.worldink.co.uk to
see how we even use META tags to store some XML so that the search
results can be displayed better.)
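
As a rough illustration of the stub idea above, here is a minimal
sketch in Python rather than ASP or XSL. It assumes each leaf element
with text becomes one META tag named meta_<element>; the function name
xml_to_meta_stub, the sample document and the title are all made up for
the example, and the exact tag layout Index Server wants is not
something this sketch claims to know.

```python
import xml.etree.ElementTree as ET
from html import escape


def xml_to_meta_stub(xml_text, title="stub"):
    """Return a stub HTML page with one META tag per leaf XML element.

    Hypothetical helper: the meta_<tag> naming follows the meta_author
    example in the text; everything else is an assumption.
    """
    root = ET.fromstring(xml_text)
    lines = ["<html>", "<head>", "<title>%s</title>" % escape(title)]
    for elem in root.iter():
        text = (elem.text or "").strip()
        # Only leaf elements that carry text are exported.
        if len(elem) == 0 and text:
            name = "meta_" + elem.tag.lower()
            lines.append(
                '<meta name="%s" content="%s">'
                % (escape(name, quote=True), escape(text, quote=True))
            )
    lines += ["</head>", "<body></body>", "</html>"]
    return "\n".join(lines)


# Made-up sample document, not taken from the original thread.
sample = """<article>
  <author>A. Writer</author>
  <headline>Searchable XML</headline>
</article>"""

stub = xml_to_meta_stub(sample, title="article-0001")
print(stub)
```

Writing one such stub per day's XML file into the same directory would
let the indexer see the element values as document properties, while
the XML file itself stays untouched.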

Getting hierarchy with XSL-type queries is more difficult. I tried to
mimic this with Index Server but found the performance too slow.
Admittedly, all I had done was add lots of extra information in META
tags to allow the hierarchy to be discerned; really Index Server needs a
proper XML extension, which I'm sure is being worked on. To get round
this I don't use static documents any more: I put everything in a
hierarchical database, importing and exporting as needed. (See
http://www.worldink.co.uk for an application using these 'XML database
documents'.)
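
The post does not say which database or schema is used, so the
following is only a sketch of the general idea, assuming a single
SQLite table in which every element is a row that records its parent
and its full path. The table name node, the helpers import_xml and
find, and the sample documents are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def import_xml(conn, doc_id, xml_text):
    """Store every element as a row that records its parent and full path."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "id INTEGER PRIMARY KEY, doc TEXT, parent INTEGER, "
        "path TEXT, tag TEXT, text TEXT)"
    )

    def walk(elem, parent_id, parent_path):
        path = parent_path + "/" + elem.tag
        cur = conn.execute(
            "INSERT INTO node (doc, parent, path, tag, text) "
            "VALUES (?, ?, ?, ?, ?)",
            (doc_id, parent_id, path, elem.tag, (elem.text or "").strip()),
        )
        # Children point back at the row just inserted.
        for child in elem:
            walk(child, cur.lastrowid, path)

    walk(ET.fromstring(xml_text), None, "")
    conn.commit()


def find(conn, path, word):
    """Return ids of documents whose element at `path` contains `word`.

    The word is used inside a LIKE pattern without escaping % or _.
    """
    rows = conn.execute(
        "SELECT DISTINCT doc FROM node "
        "WHERE path = ? AND text LIKE ? ORDER BY doc",
        (path, "%" + word + "%"),
    )
    return [r[0] for r in rows]


# Made-up sample documents, one per day.
conn = sqlite3.connect(":memory:")
import_xml(conn, "day-001",
           "<article><author>A. Writer</author>"
           "<body><para>XSL news</para></body></article>")
import_xml(conn, "day-002",
           "<article><author>B. Editor</author>"
           "<body><para>XML news</para></body></article>")

print(find(conn, "/article/body/para", "XSL"))
print(find(conn, "/article/author", "Editor"))
```

Because the path is stored on every row, a query such as 'a word in a
para inside body' becomes a plain indexed lookup rather than a scan of
a year's worth of XML files.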

Regards,

Mark Birbeck
Mark.Birbeck@xxxxxxxxxxxxx
http://www.iedigitial.net/

PS Although we've never met, I know Brian and Mike. Say hi to them for
me!


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list