Re: [xsl] analyze-string gotcha/reminder
Subject: Re: [xsl] analyze-string gotcha/reminder
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Mon, 19 Nov 2012 09:12:33 +0000
I feel your pain. Many of us have lost a few hairs over this one. The good news is that you probably won't make the same mistake again, or if you do, you will spot it far more quickly.
It's a case where even in retrospect, it's hard to see how we could have avoided this problem in the language design. Perhaps two separate attributes, regex and regex-avt. But that feels very heavy-handed. Most languages have a few quirks like this where people just have to learn the hard way.
Michael Kay
Saxonica
On 18/11/2012 18:18, Ihe Onwuka wrote:
Below is a multiple match meant to extract 4-digit numbers from text:
<xsl:analyze-string select="$line" regex="(\D|^)(\d{4})(\D|$)">
  <xsl:matching-substring>
    <year><xsl:value-of select="regex-group(2)"/></year>
  </xsl:matching-substring>
</xsl:analyze-string>
It doesn't work. I tried exactly the same regex in XQuery using replace
xquery version "1.0";
replace('Accounting Items Dec.31,2005 Dec.31,2006 Dec.31,2007 Dec.31,2008','(\D|^)\d{4}(\D|$)','xxxx')
it worked and I got
Accounting Items Dec.31xxxx Dec.31xxxx Dec.31xxxx Dec.31xxxx
I thought maybe there was special syntax for the multiple match case - but no. Eventually I turned to the specification and found this.
Note: Because the regex attribute is an attribute value template, curly brackets within the regular expression must be doubled. For example, to match a sequence of one to five characters, write regex=".{{1,5}}". For regular expressions containing many curly brackets it may be more convenient to use a notation such as regex="{'[0-9]{1,5}[a-z]{3}[0-9]{1,2}'}", or to use a variable.
So I had to double up my curly braces.
There's an hour of my life that I won't get back.
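For anyone landing on this thread later, here is a minimal sketch of the two fixes the specification note describes, applied to the instruction from the post. The reason the original fails silently is that, in an attribute value template, {4} is evaluated as the XPath expression 4, so the processor ends up with the regex (\D|^)(\d4)(\D|$) and simply finds no matches. The wrapping stylesheet, the variable name year-regex, the years wrapper element, and the hard-coded $line value (borrowed from the XQuery test) are assumptions added only to make the example self-contained.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Sample input, taken from the XQuery test in the post (assumption) -->
  <xsl:variable name="line"
      select="'Accounting Items Dec.31,2005 Dec.31,2006 Dec.31,2007 Dec.31,2008'"/>

  <!-- select is an XPath expression, not an attribute value template,
       so the braces are written once here (hypothetical variable name) -->
  <xsl:variable name="year-regex" select="'(\D|^)(\d{4})(\D|$)'"/>

  <!-- Runs against any source document; the input itself is ignored -->
  <xsl:template match="/">
    <years>
      <!-- Fix 1: double the curly braces inside the regex attribute -->
      <xsl:analyze-string select="$line" regex="(\D|^)(\d{{4}})(\D|$)">
        <xsl:matching-substring>
          <year><xsl:value-of select="regex-group(2)"/></year>
        </xsl:matching-substring>
      </xsl:analyze-string>

      <!-- Fix 2: pass the regex through a variable reference -->
      <xsl:analyze-string select="$line" regex="{$year-regex}">
        <xsl:matching-substring>
          <year><xsl:value-of select="regex-group(2)"/></year>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </years>
  </xsl:template>

</xsl:stylesheet>
```

Both instructions should produce the same year elements, so in practice you would keep only one of them. The variable form tends to be easier to read once a regex contains more than one or two quantifiers.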