
Re: [xsl] duplicate occurances within strings


Subject: Re: [xsl] duplicate occurances within strings
From: Geert Josten <Geert.Josten@xxxxxxxxxxx>
Date: Thu, 16 Dec 2004 22:43:40 +0100

Hi,

This is probably a lot easier in XSLT 2.0, but I'll leave that to others.
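For reference, a minimal sketch of what the 2.0 route could look like, assuming an XSLT 2.0 processor and the sample input shown further down. It only uses the standard tokenize() and distinct-values() functions; note that the spec leaves the order returned by distinct-values() implementation-dependent, so first-occurrence order is likely but not guaranteed.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Sketch only: assumes the comma-separated list is the text of <test> -->
  <xsl:template match="test">
    <xsl:copy>
      <!-- split on a comma plus optional whitespace, drop duplicate tokens,
           and join the survivors with ', ' again -->
      <xsl:value-of select="distinct-values(tokenize(., ',\s*'))"
                    separator=", "/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```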

In XSLT 1.0 this requires a recursive template that splits your string into an element structure; after that you filter the duplicates out of that element structure.

(If anyone has a better idea, tell me!)

You can do that with two stylesheets, or with a single one if you are prepared to use the node-set extension function.

<!-- input xml -->

<?xml version="1.0"?>
<test>Hello, Hello, Hello, test, dog, cat, cat</test>

<!-- first step -->

<xsl:template match="*" mode="split-value">
  <xsl:param name="value" select="." /> <!-- start with full content string -->
  <xsl:param name="separator" select="', '"/>

  <xsl:choose>
    <xsl:when test="string-length($value) = 0" /><!-- nothing to do -->

    <xsl:when test="contains($value, $separator)">
      <chunk>
        <xsl:value-of select="substring-before($value, $separator)"/>
      </chunk>
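      <!-- look for more chunks: recurse in the same mode, otherwise this
           template is not selected again -->
      <xsl:apply-templates select="." mode="split-value">
        <xsl:with-param name="value" select="substring-after($value, $separator)" />
        <xsl:with-param name="separator" select="$separator" />
      </xsl:apply-templates>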
    </xsl:when>

    <!-- last chunk -->
    <xsl:otherwise>
      <chunk>
        <xsl:value-of select="$value"/>
      </chunk>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
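
One thing the split template does not do by itself is wrap the chunks in a parent element, which the second step relies on when it matches *[chunk]. A minimal sketch of a wrapper for the first stylesheet follows. It is not part of the original post, and it assumes the recursive xsl:apply-templates inside the split template also carries mode="split-value".

```xml
<!-- hypothetical wrapper for the first stylesheet: copies the document
     element and fills it with the chunks produced in split-value mode -->
<xsl:template match="/*">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates select="." mode="split-value"/>
  </xsl:copy>
</xsl:template>

<!-- expected intermediate result for the sample input (whitespace aside):
     <test><chunk>Hello</chunk><chunk>Hello</chunk><chunk>Hello</chunk>
           <chunk>test</chunk><chunk>dog</chunk><chunk>cat</chunk>
           <chunk>cat</chunk></test>
-->
```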

<!-- second step -->

<!-- build an index that can return all chunks with a certain value -->
<!-- I added the id of the parent to the key, to localize the return values
     to only those chunks that have the same parent -->
<xsl:key name="chunks" match="chunk" use="concat(generate-id(parent::*), '-', .)"/>

<xsl:template match="*[chunk]" mode="unduplicate-and-rejoin">
  <xsl:param name="separator" select="', '" />

  <xsl:copy>
    <xsl:copy-of select="@*" />

    <!-- using generate-id to determine whether the chunk at hand is
         the same one as the first returned from the index -->
    <xsl:for-each select="chunk[generate-id(.)
                                = generate-id(key('chunks',
                                                  concat(generate-id(parent::*), '-', .))[1])]">
      <xsl:value-of select="." />
      <xsl:if test="not(position() = last())">
        <xsl:value-of select="$separator" />
      </xsl:if>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>

<!-- the above may contain typos -->

Try it with two stylesheets first. If that works, you can capture the result of the first step in a variable and pass that variable to xsl:apply-templates, wrapped in exsl:node-set(..).
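
As a rough sketch of that single-stylesheet variant: the following assumes a processor that supports the EXSLT common module (exsl:node-set is available in libxslt, Xalan and Saxon 6, among others; MSXML has its own msxsl:node-set instead). The wrap-chunks mode and the $split variable are names made up for illustration, and the two templates from above are repeated with the recursion kept in split-value mode.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <xsl:key name="chunks" match="chunk"
           use="concat(generate-id(parent::*), '-', .)"/>

  <!-- driver: run step one into a variable, then feed it to step two -->
  <xsl:template match="/">
    <xsl:variable name="split">
      <xsl:apply-templates select="*" mode="wrap-chunks"/>
    </xsl:variable>
    <xsl:apply-templates select="exsl:node-set($split)/*"
                         mode="unduplicate-and-rejoin"/>
  </xsl:template>

  <!-- gives the chunks a parent element, so that *[chunk] can match -->
  <xsl:template match="*" mode="wrap-chunks">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates select="." mode="split-value"/>
    </xsl:copy>
  </xsl:template>

  <!-- step one: split the string into chunk elements -->
  <xsl:template match="*" mode="split-value">
    <xsl:param name="value" select="."/>
    <xsl:param name="separator" select="', '"/>
    <xsl:choose>
      <xsl:when test="string-length($value) = 0"/>
      <xsl:when test="contains($value, $separator)">
        <chunk>
          <xsl:value-of select="substring-before($value, $separator)"/>
        </chunk>
        <xsl:apply-templates select="." mode="split-value">
          <xsl:with-param name="value"
                          select="substring-after($value, $separator)"/>
          <xsl:with-param name="separator" select="$separator"/>
        </xsl:apply-templates>
      </xsl:when>
      <xsl:otherwise>
        <chunk>
          <xsl:value-of select="$value"/>
        </chunk>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- step two: keep the first chunk per value and rejoin them -->
  <xsl:template match="*[chunk]" mode="unduplicate-and-rejoin">
    <xsl:param name="separator" select="', '"/>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:for-each select="chunk[generate-id(.)
                                  = generate-id(key('chunks',
                                      concat(generate-id(parent::*), '-', .))[1])]">
        <xsl:value-of select="."/>
        <xsl:if test="not(position() = last())">
          <xsl:value-of select="$separator"/>
        </xsl:if>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample input, this should produce . The key lookup works on the temporary tree because key() searches the document that contains the context node, which here is the node-set built from $split.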

Hope this helps for now, it is bed time for me...

Cheers,
Geert

Christopher Hansen wrote:

1.0


On Thu, 16 Dec 2004 22:07:36 +0100, Geert Josten <Geert.Josten@xxxxxxxxxxx> wrote:

With XSLT 1.0 or 2.0?


If I have a variable $thestring containing the following string:
"Hello, Hello, Hello, test, dog, cat, cat"

Is there a way to use the string-compare function to parse it and
check for duplicates within the string, and then possibly remove those
extra occurrences, resulting in the string "Hello, test, dog, cat"?

Thanks
Chris




-- Geert.Josten@xxxxxxxxxxx IT-consultant at Daidalos BV, Zoetermeer (NL)

http://www.daidalos.nl/
tel:+31-(0)79-3316961
fax:+31-(0)79-3316464

GPG: 1024D/12DEBB50

