Subject: RE: [xsl] Processing a Sequence of String Tokens in XSLT 2
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 6 Jun 2007 21:09:22 +0100

> In my specific case, I need to init-cap all the words in a 
> string (I don't need any sophistication like special case for 
> conjunctions or anything).

I'm really not sure I want to help anyone deliver such an aesthetic
monstrosity, but I suppose tastes vary from one side of the pond to the
other...
 
> Here's the XSLT 2 function I came up with:
> 
>    <xsl:function
>      name="func:normalizeTitleContent">
>      <!-- Normalizes the case of titles based on the 
> FASB-defined rules for title case -->
>      <xsl:param
>      name="titleElem"/>
>      <xsl:variable name="titleTokens"
> select="tokenize(string($titleElem), ' ')"/>
>      <xsl:variable name="resultString">
>        <xsl:for-each select="$titleTokens">
>          <xsl:sequence select="concat(upper-case(substring(., 
> 1,1)), substring(., 2))"/>
>        </xsl:for-each>
>      </xsl:variable>
>      <xsl:sequence select="$resultString"/>
>    </xsl:function>
> 
> Which seems reasonably compact and understandable but I 
> suspect that I'm not doing things as cleverly or as 
> "correctly" as I could.
> 
Well, for a start it would be a lot more readable and possibly more
efficient if you declared the types of the arguments and the results.

It's probably inefficient to create the temporary document for
$resultString (an xsl:variable with content and no "as" attribute builds a
document node), although of course it depends on the implementation. Saxon is
getting reasonably good at optimizing away unnecessary temporary documents,
but it's better to avoid creating them in the first place. A function like
this should work in terms of atomic values only; there is no need to create
any document nodes or text nodes.
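
As a rough sketch of that point, keeping the xsl:for-each style of the
original: adding as="xs:string*" to the variable makes it hold a plain
sequence of strings rather than a temporary document, and string-join() then
assembles the result. The namespace URI bound to "func" and the variable name
"words" are placeholders of mine, not anything from the original post.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:func="http://example.com/func">

  <xsl:function name="func:normalizeTitleContent" as="xs:string">
    <xsl:param name="titleElem" as="xs:string"/>
    <!-- as="xs:string*" keeps this a sequence of atomic values;
         without it the variable would be a temporary document node -->
    <xsl:variable name="words" as="xs:string*">
      <xsl:for-each select="tokenize($titleElem, ' ')">
        <!-- Upper-case the first character, leave the rest untouched -->
        <xsl:sequence
          select="concat(upper-case(substring(., 1, 1)), substring(., 2))"/>
      </xsl:for-each>
    </xsl:variable>
    <!-- Rejoin the words with the same single-space separator -->
    <xsl:sequence select="string-join($words, ' ')"/>
  </xsl:function>

</xsl:stylesheet>
```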

It's a matter of personal style, but I tend to code this kind of thing in
XPath rather than XSLT. Specifically,

<xsl:function
     name="func:normalizeTitleContent" as="xs:string">
  <!-- Normalizes the case of titles based on the 
       FASB-defined rules for title case -->
  <xsl:param name="titleElem" as="xs:string"/>
  <xsl:sequence select="
     string-join(
       for $x in tokenize($titleElem, ' ')
         return concat(upper-case(substring($x, 1,1)), substring($x, 2)),
       ' ')"/>
</xsl:function>
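
For completeness, here is a minimal sketch of how the function might be
called from a complete stylesheet. The "func" namespace URI and the "title"
element name are assumptions for illustration only. Because the parameter is
declared as xs:string, an element passed as the argument is atomized
automatically, so the explicit string() call is no longer needed; note that
passing an empty sequence would be a type error with this declaration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:func="http://example.com/func"
    exclude-result-prefixes="xs func">

  <xsl:output method="text"/>

  <xsl:function name="func:normalizeTitleContent" as="xs:string">
    <xsl:param name="titleElem" as="xs:string"/>
    <xsl:sequence select="
       string-join(
         for $x in tokenize($titleElem, ' ')
           return concat(upper-case(substring($x, 1, 1)), substring($x, 2)),
         ' ')"/>
  </xsl:function>

  <!-- Hypothetical input: <title>statement of cash flows</title> -->
  <xsl:template match="title">
    <!-- The element is atomized to a string by the function conversion rules -->
    <xsl:value-of select="func:normalizeTitleContent(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

With the hypothetical input shown in the comment, this should produce
"Statement Of Cash Flows".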

Michael Kay
http://www.saxonica.com/

