Re: [xsl] Regular Expressions in XPath 2.0


Subject: Re: [xsl] Regular Expressions in XPath 2.0
From: "Rashmi Rubdi" <rashmi.sub@xxxxxxxxx>
Date: Mon, 23 Apr 2007 11:24:44 -0400

On 4/23/07, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>> I thought I'll share the above presentation as regular
>> expressions in XPath are based on Perl.
>
> It's an interesting article. But of course the statement that "regular
> expressions in XPath are based on Perl" has nothing to do with it: that's a
> statement about the syntax, not about the implementation or choice of
> algorithm.

Good point, thank you for clarifying.



> The article is about different ways of implementing regular expressions.
> The most interesting point it makes, I think, is that back-references
> turn regular expressions into non-regular expressions, and that this
> requires the use of backtracking implementation algorithms (which have
> worst-case performance that is exponential). But of course the vast
> majority of regexes do not use back-references, so any of the classical
> algorithms can be used.

Ok.
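
For anyone following along, the back-reference point can be tried out with a minimal Python sketch. Python's re module is a backtracking engine that supports back-references; the pattern and the function name below are made up for illustration and do not come from the article.

```python
import re

# \1 must repeat exactly what group 1 captured, so this pattern accepts
# strings of the form a^n b a^n. No finite automaton can recognise that
# language, because it would have to remember an unbounded count of a's.
SAME_RUN = re.compile(r"^(a+)b\1$")


def same_run_on_both_sides(text: str) -> bool:
    """Return True if text is some run of a's, a single b, then the same run."""
    return SAME_RUN.match(text) is not None


print(same_run_on_both_sides("aaabaaa"))  # True
print(same_run_on_both_sides("aaabaa"))   # False
```

A pattern without the \1 (for example ^a+ba+$) describes a regular language, so an automaton-based matcher could handle it without backtracking.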


> There are some interesting trade-offs between the time taken to compile a
> regular expression and the time taken to execute it. Determinizing an NFA
> can be an expensive operation. This is rarely discussed in the theory, as
> far as I can tell, though some of the papers do talk about incremental
> determinization. You see this in schema processors (which use the regular
> expression approach to validate an XML document against a grammar) - Saxon
> creates a deterministic FSA for this, which has excellent run-time
> performance, but in pathological cases creating the DFSA can be extremely
> slow.
>
> Michael Kay http://www.saxonica.com/
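
The cost of determinizing can be seen with a small, self-contained sketch of the textbook subset construction. This is only an illustration in Python, not how Saxon or any schema processor does it; the function names and the choice of example language ("the n-th symbol from the end is a") are assumptions made here because that language is a standard pathological case.

```python
from collections import deque


def nfa_nth_from_end(n):
    """NFA for (a|b)*a(a|b)^(n-1): the n-th symbol from the end is 'a'.

    States are 0..n; state 0 is the start and state n is the accepting one.
    """
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[(i, "a")] = {i + 1}
        delta[(i, "b")] = {i + 1}
    return delta


def count_dfa_states(delta, start=0, alphabet="ab"):
    """Run the subset construction and count the reachable DFA states."""
    first = frozenset({start})
    seen = {first}
    queue = deque([first])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # The DFA state reached is the set of all NFA states reachable
            # from any member of the current set on this symbol.
            target = frozenset(
                t for s in current for t in delta.get((s, symbol), ())
            )
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)


for n in (2, 4, 8, 12):
    dfa_states = count_dfa_states(nfa_nth_from_end(n))
    print(n + 1, "NFA states ->", dfa_states, "DFA states")
```

For this family the NFA has n + 1 states while the reachable DFA has 2^n, so the construction time grows exponentially with n even though matching with the finished DFA is a single pass over the input.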


Thank you,
Rashmi

