[xsl] Mistake in tokenizing under Saxon 8.2

From: "Nicholas Hemley" <Nicholas.Hemley@xxxxxxxxxxxxxxx>
Date: Fri, 21 Jan 2005 09:50:25 +0000


I presume that I have made a mistake somewhere in the stylesheet when
using the tokenize function under Saxon 8.2 - for some reason I am
losing the whitespace characters around the matched regular expression.

For example, the following pattern:
text text [link,alt,link_text] text

should be transformed to:

text text <a href="link" alt="alt">link text</a> text


I am losing the whitespace characters around the <a>, as follows:

text text<a href="link" alt="alt">link text</a>text

Why is this, please? All the other whitespace characters are copied OK, even
though I am tokenising on whitespace.

If I use a &nbsp; in the stylesheet to compensate for the loss, it adds
two spaces, not one, which is weird, so this is not currently a viable
option.

Any input appreciated!

Many thanks,

Appendix: Stylesheet Snippet

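  <xsl:template match="/html/body/P|p">
    <!-- copy node plus select contents -->
    <xsl:copy>
      <!-- split the paragraph text on runs of whitespace -->
      <xsl:variable name="tokens" select="tokenize(.,'\s+')"/>
      <xsl:for-each select="$tokens">
        <xsl:choose>
          <!-- a token of the form [a,b,c] becomes a link -->
          <xsl:when test='matches(.,"\[(.*),(.*),(.*)\]")'>
            <xsl:variable name="elValue" select="."/>
            <xsl:analyze-string select="$elValue"
                                regex="\[(.*),(.*),(.*)\]">
              <xsl:matching-substring>
                <a href="{regex-group(1)}">
                  <xsl:attribute name='alt'>
                    <xsl:value-of select='replace(regex-group(3), "_"," ")'/>
                  </xsl:attribute>
                  <xsl:value-of select='replace(regex-group(2), "_"," ")'/>
                </a>
              </xsl:matching-substring>
            </xsl:analyze-string>
          </xsl:when>
          <!-- any other token is copied through unchanged -->
          <xsl:otherwise>
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>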
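
For comparison, here is a minimal sketch of a variant that does not tokenize at all. It runs xsl:analyze-string over each whole text node, so the text on either side of a match, spaces included, is passed through by xsl:non-matching-substring. It is untested against Saxon 8.2, and several things in it are assumptions rather than part of the stylesheet above: the identity template, the match on text nodes directly inside p or P, the tighter regex that stops each field at a comma or closing bracket, and the group order, which follows the example at the top (second field as alt, third as link text).

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template (assumed): copy anything not handled below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Scan paragraph text for [link,alt,link_text] without tokenizing. -->
  <xsl:template match="p/text() | P/text()">
    <xsl:analyze-string select="."
                        regex="\[([^,\]]*),([^,\]]*),([^,\]]*)\]">
      <xsl:matching-substring>
        <!-- group 1 = href, group 2 = alt, group 3 = link text -->
        <a href="{regex-group(1)}"
           alt="{replace(regex-group(2), '_', ' ')}">
          <xsl:value-of select="replace(regex-group(3), '_', ' ')"/>
        </a>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- everything between matches, whitespace included -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Because nothing is split on whitespace in this version, no separator has to be put back afterwards, which would also make the &nbsp; workaround unnecessary.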

