

Subject: Re: [xsl] Does 'Lecœur' occur in $text? Do you have a multi-factor XPath solution?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Fri, 18 Jan 2013 22:59:57 +0000

If you want to write queries that handle all the nuances of natural language text, I would strongly recommend using a text retrieval language rather than XPath. Many XQuery implementations have free text retrieval modules.

Michael Kay

On 18/01/2013 22:12, Costello, Roger L. wrote:
Hi Folks,

I want to determine if 'Lecœur' occurs in $text.

A naïve solution is this XPath expression:

contains($text, 'Lecœur')

However, that does not take into account many important factors:

1. Perhaps 'Lecœur' occurs, but in $text it is in uppercase

2. Perhaps instead of the 'œ' ligature, $text uses 'oe'

3. Perhaps in $text 'Lecœur' is split over two lines and thus is hyphenated

4. Perhaps 'Lecœur' is slightly misspelled in $text and therefore requires fuzzy matching

And there are many other important factors.

Do you have an XPath solution to this problem that takes into account the many important factors?
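
For illustration only, the four factors listed above can be sketched outside XPath. The following is a minimal Python sketch, not an XPath solution and not a substitute for a real text retrieval engine. The function names normalize and occurs and the 0.8 similarity threshold are assumptions made for this example; the fuzzy step uses the standard library's difflib.SequenceMatcher as a stand-in for whatever matching a retrieval module would provide.

```python
import difflib
import re


def normalize(s):
    """Fold the surface variations from the list above into one canonical form."""
    # Factor 3: rejoin a word hyphenated across a line break ("Lec-\nœur").
    s = re.sub(r"-\s*\n\s*", "", s)
    # Factor 2: expand the ligature explicitly; Unicode normalization
    # (NFKC/NFKD) does not decompose U+0153 into "oe".
    s = s.replace("œ", "oe").replace("Œ", "OE")
    # Factor 1: compare case-insensitively.
    return s.casefold()


def occurs(text, word, threshold=0.8):
    """Return True if `word` occurs in `text`, exactly or approximately.

    `threshold` is an assumed similarity cut-off, not a recommended value.
    """
    target = normalize(word)
    haystack = normalize(text)
    if target in haystack:
        return True
    # Factor 4: fuzzy-match each token of the text against the target.
    tokens = re.findall(r"\w+", haystack)
    return any(
        difflib.SequenceMatcher(None, target, tok).ratio() >= threshold
        for tok in tokens
    )
```

A call such as occurs(text, 'Lecœur') then covers uppercase, the 'oe' spelling, end-of-line hyphenation, and small misspellings. Note that the de-hyphenation rule also joins genuinely hyphenated compounds that happen to break at a line end, and a lower threshold admits more false positives; these are exactly the kinds of nuance that push toward a dedicated text retrieval tool.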

