Re: [xsl] How to filter characters from a string?


Subject: Re: [xsl] How to filter characters from a string?
From: Greg Faron <gfaron@xxxxxxxxxxxxxxxxxx>
Date: Thu, 28 Mar 2002 16:42:54 -0700

As per the third paragraph (quoted below) under the subheading "Decoding" on the page <http://www.kbcafe.com/articles/base64.html#decoding>, I am stripping out non-base64 characters from an already encoded base64 file.

      Another important thing to do is to ignore [non]
    base-64 characters in the stream.  During encoding I
    dropped carriage returns and line feeds into the stream
    to break up the lines.  It is also allowed to drop other
    non base-64 characters into the stream.  For these
    reasons, I scan and remove non base-64 characters from
    the stream before decoding.

I was simply asking how to filter() out non-specified characters from a string, rather than translate() specified characters to nothing. From Dimitre's post, I adapted the following code, which does the requested task, but I'm not sure about its efficiency. It checks every character in the source string against a string of known, valid characters. Any character not found in the valid string doesn't make the cut. I suppose there's no faster algorithm, unless one is built into the XSLT processor.

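  <!-- Begin Template: str:filter -->
  <!-- Outputs only those characters of $string that also occur in $validChars.
       The str: prefix must be bound to a namespace in the stylesheet. -->
  <xsl:template name="str:filter">
    <xsl:param name="string"/>
    <xsl:param name="validChars"/>
    <!-- Stop when the input is exhausted (or no valid characters were given) -->
    <xsl:if test="$string and $validChars">
      <xsl:variable name="first" select="substring($string, 1, 1)"/>
      <!-- Emit the first character only if it is in the valid set -->
      <xsl:if test="$first and contains($validChars, $first)">
        <xsl:value-of select="$first"/>
      </xsl:if>
      <!-- Recurse on the remainder of the string -->
      <xsl:call-template name="str:filter">
        <xsl:with-param name="string" select="substring($string, 2)"/>
        <xsl:with-param name="validChars" select="$validChars"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
  <!-- End Template: str:filter -->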
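
The same filtering can also be sketched without recursion by nesting two translate() calls. The inner call deletes every valid character from the string, leaving only the unwanted ones; the outer call then deletes those unwanted characters from the original string. This is only a sketch and is not taken from Dimitre's post. The template name str:filter-translate is hypothetical, and whether it actually runs faster than the recursive version depends on the processor.

  <!-- Hypothetical non-recursive variant of str:filter (XPath 1.0 only) -->
  <xsl:template name="str:filter-translate">
    <xsl:param name="string"/>
    <xsl:param name="validChars"/>
    <!-- Inner translate(): the characters of $string that are NOT valid.
         Outer translate(): remove exactly those characters from $string. -->
    <xsl:value-of select="translate($string, translate($string, $validChars, ''), '')"/>
  </xsl:template>

Called with the base64 alphabet, it might look like this ($encoded is an assumed variable holding the encoded text):

  <xsl:call-template name="str:filter-translate">
    <xsl:with-param name="string" select="$encoded"/>
    <xsl:with-param name="validChars"
      select="'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='"/>
  </xsl:call-template>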

As for the rest of your confusion below, I wasn't really discussing the encoding algorithm, just giving some exposition for clarification.

Greg Faron
Integre Technical Publishing Co.

At 04:12 PM 3/28/2002, you wrote:
Perhaps I misunderstand what you're trying to do.

You seem to be confusing the act of discarding information with
the act of encoding it.

Base64 is a means of encoding binary data as a sequence of ASCII
characters which are known to survive simpleminded text transmission
protocols like SMTP.  The Base64 encoding and decoding operations
are inverses of each other.  The original sequence of 8 bit bytes
is recovered as a result of decoding.

Taking a body of text and removing any character that is not
one of the output characters of the Base64 encoding operation is
not the same as encoding that text.

What are you really trying to do?



XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list