
[xsl] Mistake in tokenizing under Saxon 8.2

Subject: [xsl] Mistake in tokenizing under Saxon 8.2
From: "Nicholas Hemley" <Nicholas.Hemley@xxxxxxxxxxxxxxx>
Date: Fri, 21 Jan 2005 09:50:25 +0000


I presume that I have made a mistake somewhere in the stylesheet when
using the tokenize function under Saxon 8.2: for some reason I am
losing the whitespace characters around the matched regular expression.

For example, the following pattern:
text text [link,alt,link_text] text

should be transformed to:

text text <a href="link" alt="alt">link text</a> text


Instead, I am losing the whitespace characters around the <a>, as follows:

text text<a href="link" alt="alt">link text</a>text

Why is this, please? All the other whitespace characters are copied OK,
even though I am tokenising on whitespace.

If I use a &nbsp; in the stylesheet to compensate for the loss, it adds
two spaces, not one, which is weird, so this is not currently a viable
workaround.

Any input appreciated!

Many thanks,

Appendix: Stylesheet Snippet

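  <xsl:template match="/html/body/P|p">
    <!-- copy node plus select contents -->
    <xsl:variable name="tokens" select="tokenize(.,'\s+')"/>
    <xsl:for-each select="$tokens">
      <xsl:choose>
        <xsl:when test='matches(.,"\[(.*),(.*),(.*)\]")'>
          <xsl:variable name="elValue" select="."/>
          <!-- regex assumed to be the same pattern as in the test above -->
          <xsl:analyze-string select="$elValue"
                              regex="\[(.*),(.*),(.*)\]">
            <xsl:matching-substring>
              <a href="{regex-group(1)}">
                <xsl:attribute name='alt'>
                  <xsl:value-of select='replace(regex-group(3), "_"," ")'/>
                </xsl:attribute>
                <xsl:value-of select='replace(regex-group(2), "_"," ")'/>
              </a>
            </xsl:matching-substring>
          </xsl:analyze-string>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>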
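
For comparison, here is a minimal sketch (an assumption for illustration, not part of the stylesheet above) that runs xsl:analyze-string over the whole paragraph text instead of tokenizing it first. tokenize(., '\s+') discards the separators it splits on, and when the resulting strings are written out, a single space is inserted only between adjacent strings, not between a string and an element such as <a>. That would explain why only the spaces next to the link go missing. In the sketch, everything outside the bracketed pattern, whitespace included, passes through xsl:non-matching-substring unchanged. The regex, the match pattern, and the group order (link, alt, link text, following the example in the message) are all assumptions.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical template: scans the paragraph's string value directly,
       so no whitespace is split away before matching.
       Note: string(.) flattens any child elements of the paragraph. -->
  <xsl:template match="p">
    <xsl:copy>
      <xsl:analyze-string select="string(.)"
                          regex="\[([^,\]]*),([^,\]]*),([^,\]]*)\]">
        <xsl:matching-substring>
          <!-- Groups assumed: 1 = link, 2 = alt, 3 = link text -->
          <a href="{regex-group(1)}" alt="{regex-group(2)}">
            <xsl:value-of select="replace(regex-group(3), '_', ' ')"/>
          </a>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <!-- Text outside the brackets, spaces included, is copied as-is -->
          <xsl:value-of select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Given an input of <p>text text [link,alt,link_text] text</p>, this should keep the space on either side of the generated <a> element, because those spaces belong to the non-matching substrings rather than being consumed as token separators.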

