Re: [xsl] Re: Re: Using XSLT to add markup to a document
Subject: Re: [xsl] Re: Re: Using XSLT to add markup to a document
From: Jeni Tennison <jeni@xxxxxxxxxxxxxxxx>
Date: Tue, 8 Jul 2003 10:56:05 +0100
Dimitre wrote:
> Another problem with this solution is that it finds the strings not
> strictly from left to right (when we search for words as opposed to
> generally strings this may not be a problem -- my knowledge of
> English does not allow me to make a strong conclusion).

All Dimitre's observations about the inadequacy, in the general case,
of the solution David and I were discussing are correct. Flexible,
general solutions to marking up a string using XSLT 1.0 are not
straightforward.

It's interesting to see what the regular expression processing in XSLT
2.0 can do to help here. With the words hard-coded into the
stylesheet, it would look like:

  <xsl:analyze-string select="$text" regex="relation|core">
    <xsl:matching-substring>
      <special><xsl:value-of select="." /></special>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <xsl:value-of select="." />
    </xsl:non-matching-substring>
  </xsl:analyze-string>

<xsl:analyze-string> is defined such that the first matching substring
gets picked up, so if you have:

  There is a strong corelation...

then you get:

  There is a strong <special>core</special>lation...

There's no definition in the spec about what happens if you have
overlapping matching substrings, that is, several alternatives that
match at the same position, for example:

  <xsl:analyze-string select="$text"
                      regex="relation|core|corelation">
    ...
  </xsl:analyze-string>

(Saxon 7 picks the one that appears first in the regex.) I think that
this is a bug in the spec, and I'll raise it as an issue; I think
probably it should select the longest match.

You can generate the regular expression that's used for the string
dynamically, with an attribute value template. So for example, you
could have:

  <xsl:template name="markup" as="node()*">
    <xsl:param name="text" as="xs:string" />
    <xsl:param name="replacements" as="item()*" />
    <xsl:variable name="regex" as="xs:string">
      <xsl:value-of select="$replacements" separator="|" />
    </xsl:variable>
    <xsl:analyze-string select="$text" regex="{$regex}">
      ...
    </xsl:analyze-string>
  </xsl:template>

in which case the markup template can be called with:

  <xsl:call-template name="markup">
    <xsl:with-param name="text"
                    select="'There is a strong corelation...'" />
    <xsl:with-param name="replacements"
                    select="('core', 'relation', 'corelation')" />
  </xsl:call-template>

though to be thorough, you'd need to make sure that you escaped any
regex-significant characters in the replacement strings.

To get the longest match first, at least using Saxon 7, you can sort
the replacements by length, with the longer ones first (or
alphabetically in reverse order will give you the same result):

  <xsl:variable name="regex">
    <xsl:for-each select="$replacements">
      <xsl:sort select="string-length(.)" order="descending" />
      <xsl:value-of select="." />
      <xsl:if test="position() != last()">|</xsl:if>
    </xsl:for-each>
  </xsl:variable>

Getting whole-word-only matches is much more complicated; in fact, I
can't think of a good approach right now, but perhaps someone else
can?

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
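
On the escaping point raised in the message, here is a minimal sketch of one way the replacement strings might be escaped and ordered longest-first before being joined into a regex. It is not from the original post. The `my` prefix, its namespace URI, and the function name `my:build-regex` are made up for illustration, and the sketch relies on the first listed alternative winning, as the message describes for Saxon 7. It also assumes at least one non-empty replacement string is supplied, since a regex that can match a zero-length string is not usable with <xsl:analyze-string>.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="http://example.com/my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper (not part of XSLT): builds an alternation
       regex from literal words. -->
  <xsl:function name="my:build-regex" as="xs:string">
    <xsl:param name="words" as="xs:string*"/>
    <xsl:variable name="escaped" as="xs:string*">
      <xsl:for-each select="$words">
        <!-- Longest first, so 'corelation' is listed before 'core'. -->
        <xsl:sort select="string-length(.)" order="descending"/>
        <!-- Put a backslash before each regex metacharacter. -->
        <xsl:sequence
            select="replace(., '([\\\|\.\?\*\+\(\)\{\}\[\]\^\$\-])', '\\$1')"/>
      </xsl:for-each>
    </xsl:variable>
    <!-- Join the escaped words into one alternation. -->
    <xsl:sequence select="string-join($escaped, '|')"/>
  </xsl:function>

  <!-- Same shape as the markup template in the message, but taking
       its regex from the helper above. -->
  <xsl:template name="markup" as="node()*">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="replacements" as="xs:string+"/>
    <xsl:analyze-string select="$text"
                        regex="{my:build-regex($replacements)}">
      <xsl:matching-substring>
        <special><xsl:value-of select="."/></special>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The template can be called with the same <xsl:call-template name="markup"> shown in the message. Keeping the escaping in a function means a replacement such as 'C++' or 'a.b' is treated as literal text rather than as regex syntax.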