Re: [xsl] analyze-string question


Subject: Re: [xsl] analyze-string question
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Fri, 26 Oct 2012 09:17:05 +0200

On 2012-10-26 06:38, Birnbaum, David J wrote:
> Dear XSLT-list,
>
> For an up-conversion of a plain-text word-list with grammatical classification information to XML, I've been a file with lines like the following:

I hope you recovered from that condition (of being a file) ;)



> DRUG<OJ MO <MS-P <P 3V>>

See a solution below that assumes that an alt group doesn't start with a vowel.


What if an alt group starts with a vowel?
DRUG<OJ MO <AS-P <P 3V>>

If your strategy was to nest first, no matter what comes after the first <, the result might be:

DRUG<alt>OJ MO <alt>AS-P &lt;P 3V</alt></alt>


Starting from 5.10, Perl has some balanced text facilities: http://www.perlmonks.org/?node_id=660316


And maybe Dimitre Novatchev's LR parser (written in XSLT) might help.


Well, here's an outer-to-inner analyze-string solution that works for the test input:
<p>DRUG&lt;OJ MO &lt;MS-P &lt;P 3V&gt;&gt;</p>



<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="my"
  version="2.0"
  exclude-result-prefixes="my xs">

  <xsl:template match="p">
    <xsl:copy>
      <xsl:sequence select="my:wrap-nested(., '&lt;([^aeiou].+)&gt;')"/>
    </xsl:copy>
  </xsl:template>

  <!-- Wraps each bracketed group in <alt>, working from the outermost group inwards. -->
  <xsl:function name="my:wrap-nested" as="node()*">
    <xsl:param name="string" as="xs:string" />
    <xsl:param name="regex" as="xs:string" />
    <xsl:analyze-string select="$string" regex="{$regex}" flags="i">
      <xsl:matching-substring>
        <alt>
          <xsl:choose>
            <xsl:when test="matches($string, $regex)">
              <!-- Recurse into the group's content, i.e. the match without its outer brackets. -->
              <xsl:sequence select="my:wrap-nested(regex-group(1), $regex)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:sequence select="my:stress(.)"/>
            </xsl:otherwise>
          </xsl:choose>
        </alt>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text outside any group: only stress marks remain to be tagged. -->
        <xsl:sequence select="my:stress(.)"/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>


  <xsl:function name="my:stress" as="node()*">
    <xsl:param name="string" as="xs:string" />
    <xsl:analyze-string select="$string" regex="&lt;([aeiou])" flags="i">
      <xsl:matching-substring>
        <stress>
          <xsl:value-of select="regex-group(1)"/>
        </stress>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

</xsl:stylesheet>
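
For reference, here is a hand trace of what the stylesheet above should produce for the test input. This is a sketch traced by reading the code, not output captured from a processor, and the file name input.xml is only a placeholder.

```xml
<!-- input.xml (hypothetical file name): the test paragraph from this message -->
<p>DRUG&lt;OJ MO &lt;MS-P &lt;P 3V&gt;&gt;</p>

<!-- Result traced by hand, not machine-verified:
<p>DRUG<stress>O</stress>J MO <alt>MS-P <alt>P 3V</alt></alt></p>
-->
```

Because the regex uses a greedy .+, the first match runs from the first non-vowel group opener to the last closing bracket, and the recursion then peels off one nesting level per call. For the vowel-initial variant DRUG<OJ MO <AS-P <P 3V>>, the same trace suggests that <A is consumed as a stress mark, only the inner group is wrapped in alt, and one closing bracket is left over as text inside it.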


Gerrit


