
Re: [xsl] Efficient dictionary lookup

Subject: Re: [xsl] Efficient dictionary lookup
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 22 Mar 2012 21:44:37 +0000

On 22/03/2012 21:39, Martin Holmes wrote:
> Hi all,
> As part of a small pilot project, I'm implementing a set of spelling
> normalization rules applied through XSLT 2.0 using Saxon 9. One
> operation that happens extremely frequently is a dictionary lookup;
> basically I'm checking a word form to see if it appears in a
> spell-checker dictionary.
> The dictionary currently consists of a whitespace-separated text string
> (although it could be formatted any way I choose), and I've been using
> fn:matches() and fn:contains() to check whether or not the form appears
> in the dictionary:
> <xsl:function name="f:wordExists" as="xs:boolean">
>   <xsl:param name="inString" as="xs:string"/>
>   <xsl:sequence select="contains($dictModern, concat(' ',
>     lower-case($inString), ' '))"/>
> </xsl:function>
> <xsl:function name="f:wordExists" as="xs:boolean">
>   <xsl:param name="inString" as="xs:string"/>
>   <xsl:sequence select="matches($dictModern, concat('\s', $inString,
>     '\s'), 'i')"/>
> </xsl:function>
> Both options appear to be very costly in terms of time, and I'm
> wondering what the most efficient way to do this might be. Is there a
> faster way to do text lookups like this?
> Ultimately I guess I'll implement this as an external Java process, but
> for the moment I'm working with XSLT, and I'd like to get some speed
> improvement if I can.
> All help appreciated,
> Martin

Sounds like you want to use a key; the processor will then almost certainly create an efficient lookup index.

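If your dictionary is in a file, say dict.xml, with each entry in a word element, then

<xsl:key name="w" match="word" use="."/>

declares the index, and a call to key() with the key name 'w', the word being looked up, and the dictionary document (doc('dict.xml')) as its third argument will return the word if it is in the dictionary.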
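
Putting those pieces together, a minimal sketch of the lookup function might look like the following. Several things here are assumptions rather than anything taken from the original stylesheet: the layout of dict.xml (a root element containing one word element per entry), the namespace URI bound to the f prefix, the $dict variable, and that the dictionary is stored in lower case (as the contains() version in the question implies).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.org/functions"
    exclude-result-prefixes="xs f">

  <!-- Assumed (hypothetical) layout of dict.xml:
       <dict><word>colour</word><word>honour</word></dict> -->

  <!-- Load the dictionary once; the relative URI is resolved against
       the location of this stylesheet. -->
  <xsl:variable name="dict" select="doc('dict.xml')"/>

  <!-- Index every word element by its string value. -->
  <xsl:key name="w" match="word" use="."/>

  <!-- True if the lower-cased form has an entry in the dictionary.
       The namespace bound to f: is a placeholder. -->
  <xsl:function name="f:wordExists" as="xs:boolean">
    <xsl:param name="inString" as="xs:string"/>
    <xsl:sequence
        select="exists(key('w', lower-case($inString), $dict))"/>
  </xsl:function>

</xsl:stylesheet>
```

The third argument to key() matters here: inside xsl:function there is no context item, so the two-argument form has no document to search and would fail. A caller could then write something like test="f:wordExists($token)", where $token is whatever variable holds the word form being checked.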

