
Re: [xsl] analyze-string question

Subject: Re: [xsl] analyze-string question
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Fri, 26 Oct 2012 09:17:05 +0200

On 2012-10-26 06:38, Birnbaum, David J wrote:
> Dear XSLT-list,
>
> For an up-conversion of a plain-text word-list with grammatical classification information to XML, I've been a file with lines like the following:

I hope you recovered from that condition (of being a file) ;)


See the solution below, which assumes that an alt group doesn't start with a vowel.

What if an alt group starts with a vowel?

If your strategy was to nest first, no matter what comes after the first &lt;, the result might be:

DRUG<alt>OJ MO <alt>AS-P &lt;P 3V</alt></alt>
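
To make that concrete, here is a minimal sketch that should reproduce the result above. It assumes the input was the test input with MS-P changed to AS-P. The function name my:nest-first and the named entry template main are made up for illustration and are not from this thread. Every bracketed span is wrapped first, greedily from the first opening bracket to the last closing one, and no stress handling happens at all:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:my="my"
  version="2.0" exclude-result-prefixes="my xs">

  <xsl:output omit-xml-declaration="yes"/>

  <!-- Hypothetical entry point; the input string is an assumed variant
       of the test input in which the alt group starts with a vowel. -->
  <xsl:template name="main">
    <xsl:sequence
      select="my:nest-first('DRUG&lt;OJ MO &lt;AS-P &lt;P 3V&gt;&gt;')"/>
  </xsl:template>

  <!-- Hypothetical nest-first function: wraps whatever lies between the
       first "<" and the last ">" in alt, then recurses into the content. -->
  <xsl:function name="my:nest-first" as="node()*">
    <xsl:param name="string" as="xs:string"/>
    <xsl:analyze-string select="$string" regex="&lt;(.+)&gt;">
      <xsl:matching-substring>
        <alt>
          <xsl:sequence select="my:nest-first(regex-group(1))"/>
        </alt>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Text without a complete bracket pair is copied unchanged. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

</xsl:stylesheet>
```

Started at the named template (with Saxon, the -it:main option), this should print the line shown above. The two outer matches use up both closing brackets, so the real innermost group is left with an opening bracket only and stays literal text.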

Starting from 5.10, Perl has some balanced text facilities: http://www.perlmonks.org/?node_id=660316

And maybe Dimitre Novatchev's LR parser (written in XSLT) might help.

Well, here's an outer-to-inner analyze-string solution that works for the test input:
<p>DRUG&lt;OJ MO &lt;MS-P &lt;P 3V&gt;&gt;</p>

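<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:my="my"
  version="2.0" exclude-result-prefixes="my xs">

  <xsl:template match="p">
    <xsl:sequence select="my:wrap-nested(., '&lt;([^aeiou].+)&gt;')"/>
  </xsl:template>

  <!-- Wraps the outermost match in alt, then recurses into its content. -->
  <xsl:function name="my:wrap-nested" as="node()*">
    <xsl:param name="string" as="xs:string" />
    <xsl:param name="regex" as="xs:string" />
    <xsl:analyze-string select="$string" regex="{$regex}" flags="i">
      <xsl:matching-substring>
        <alt>
          <xsl:choose>
            <xsl:when test="matches($string, $regex)">
              <xsl:sequence select="my:wrap-nested(regex-group(1), $regex)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:sequence select="my:stress(.)"/>
            </xsl:otherwise>
          </xsl:choose>
        </alt>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:sequence select="my:stress(.)"/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

  <!-- Marks a vowel preceded by "<" as stressed. The wrapper element was
       lost from this archived copy; the name "stress" is an assumption. -->
  <xsl:function name="my:stress" as="node()*">
    <xsl:param name="string" as="xs:string" />
    <xsl:analyze-string select="$string" regex="&lt;([aeiou])" flags="i">
      <xsl:matching-substring>
        <stress><xsl:value-of select="regex-group(1)"/></stress>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:function>

</xsl:stylesheet>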


