
[xsl] analyze-string gotcha/reminder


Subject: [xsl] analyze-string gotcha/reminder
From: Ihe Onwuka <ihe.onwuka@xxxxxxxxx>
Date: Sun, 18 Nov 2012 18:18:00 +0000

Below is a multiple match meant to extract 4-digit numbers from text:

    <xsl:analyze-string select="$line" regex="(\D|^)(\d{4})(\D|$)">
      <xsl:matching-substring>
        <year><xsl:value-of select="regex-group(2)"/></year>
      </xsl:matching-substring>
    </xsl:analyze-string>

It doesn't work. I tried exactly the same regex in XQuery using replace:

xquery version "1.0";
replace('Accounting Items                                Dec.31,2005
 Dec.31,2006    Dec.31,2007
Dec.31,2008','(\D|^)\d{4}(\D|$)','xxxx')

It worked and I got:

Accounting Items                                Dec.31xxxx
Dec.31xxxx   Dec.31xxxx   Dec.31xxxx

I thought maybe there was special syntax for the multiple match case - but no.
Eventually I turned to the specification and found this.

Note:
Because the regex attribute is an attribute value template, curly
brackets within the regular expression must be doubled. For example,
to match a sequence of one to five characters, write regex=".{{1,5}}".
For regular expressions containing many curly brackets it may be more
convenient to use a notation such as
regex="{'[0-9]{1,5}[a-z]{3}[0-9]{1,2}'}", or to use a variable.

So I had to double up my curly braces.
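
For the record, here is a minimal sketch of the corrected version. The
surrounding stylesheet, the "main" template name, the <years> wrapper and the
sample value of $line are only assumptions for illustration; the one real
change is the doubled braces in the regex attribute. With single braces the
attribute value template evaluates {4} as the XPath expression 4, so the
regex actually used was (\D|^)(\d4)(\D|$), which is why it failed silently
instead of raising an error.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical sample input; the real $line comes from the source data. -->
  <xsl:param name="line" select="'Dec.31,2005 Dec.31,2006'"/>

  <!-- Assumed entry point, e.g. invoked as a named initial template. -->
  <xsl:template name="main">
    <years>
      <!-- Doubled braces: {{4}} in the attribute becomes {4} in the regex. -->
      <xsl:analyze-string select="$line" regex="(\D|^)(\d{{4}})(\D|$)">
        <xsl:matching-substring>
          <year><xsl:value-of select="regex-group(2)"/></year>
        </xsl:matching-substring>
      </xsl:analyze-string>

      <!-- Alternative from the spec note: pass the regex as a string literal
           inside the attribute value template, so the braces stay single:
           regex="{'(\D|^)(\d{4})(\D|$)'}" -->
    </years>
  </xsl:template>

</xsl:stylesheet>
```

With that sample value, this should produce one <year> element each for 2005
and 2006.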

There's an hour of my life that I won't get back.

