<html>

<head>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

<meta content="text/html; charset=utf-8">

</head>

<body>

<div dir="auto" style="direction:ltr; margin:0; padding:0; font-family:sans-serif; font-size:11pt; color:black">

Thanks for the quick reply, James, but doesn't your approach imply that the tokenisation into sentences has already been done? I'm trying to avoid a two-pass solution, as I expect to be doing this hundreds of times.<br>

<br>

</div>


<hr tabindex="-1" style="display:inline-block; width:98%">

<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> James Cummings &lt;james@blushingbunny.net&gt;<br>

<b>Sent:</b> Monday, November 5, 2018 1:10:02 PM<br>

<b>To:</b> Lou Burnard<br>

<b>Cc:</b> oxygen-user@oxygenxml.com<br>

<b>Subject:</b> Re: [oXygen-user] an xslt challenge</font>

<div> </div>

</div>

<div>

<div dir="ltr">Hi Lou,

<div><br>

</div>

<div>Would it make sense to use xsl:for-each-group to group the sentences into &lt;s&gt; units to make this easier? Then I'd probably recursively call a template or function, passing the current collection of &lt;s&gt; units as a variable of type item*, and testing whether its tokenised word count is above or below $maxWords.  </div>

<div><br>

</div>

<div>Not got time to write that out as a solution atm, and I'm sure it can be done without the recursion as well, but that is the approach that would have occurred to me at least.</div>
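
<div>A minimal sketch of the recursive idea, offered only as an illustration. It assumes XSLT 2.0, assumes the t: prefix is bound to the TEI namespace, and uses a hypothetical function f:take in a made-up namespace. It also swaps xsl:for-each-group for a plain tokenize() on ". ", so the sentences are passed around as strings rather than as &lt;s&gt; elements. The default of 50 for $maxWords is an arbitrary placeholder.</div>

<pre>
&lt;xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    xmlns:f="urn:example:local"
    exclude-result-prefixes="xs t f"&gt;

  &lt;!-- Assumption: word limit supplied as an integer parameter --&gt;
  &lt;xsl:param name="maxWords" as="xs:integer" select="50"/&gt;

  &lt;!-- Hypothetical helper: emits one s element per sentence and
       recurses on the rest until the running total reaches the limit --&gt;
  &lt;xsl:function name="f:take" as="element()*"&gt;
    &lt;xsl:param name="sentences" as="xs:string*"/&gt;
    &lt;xsl:param name="n" as="xs:integer"/&gt;
    &lt;xsl:param name="wordsSoFar" as="xs:integer"/&gt;
    &lt;xsl:if test="exists($sentences)"&gt;
      &lt;!-- Running total including the first remaining sentence --&gt;
      &lt;xsl:variable name="total" as="xs:integer"
          select="$wordsSoFar + count(tokenize(normalize-space($sentences[1]), ' '))"/&gt;
      &lt;xsl:if test="$total lt $maxWords"&gt;
        &lt;s n="{$n}"&gt;&lt;xsl:value-of select="$sentences[1]"/&gt;&lt;/s&gt;
        &lt;xsl:sequence select="f:take(subsequence($sentences, 2), $n + 1, $total)"/&gt;
      &lt;/xsl:if&gt;
    &lt;/xsl:if&gt;
  &lt;/xsl:function&gt;

  &lt;xsl:template match="t:p"&gt;
    &lt;xsl:sequence select="f:take(tokenize(normalize-space(.), '\.\s'), 1, 0)"/&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>

<div>Because the recursion stops at the first sentence that would reach the limit, later sentences are never counted. As with any split on ". ", the full stops are consumed for every sentence except the last.</div>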

<div><br>

</div>

<div>-James</div>

<div><br>

</div>

</div>

<br>

<div class="gmail_quote">

<div dir="ltr">On Mon, 5 Nov 2018 at 12:03, Lou Burnard &lt;<a href="mailto:lou.burnard@retired.ox.ac.uk">lou.burnard@retired.ox.ac.uk</a>&gt; wrote:<br>

</div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex; border-left:1px #ccc solid; padding-left:1ex">

<div bgcolor="#CCCCCC">

<p>I hope I am not abusing this list in asking occasionally for advice on the best way to hack something in XSLT.

<br>

</p>

<p>Today's problem is to output only the first x sentences (strings terminated by a full stop) of a paragraph such that the total number of words (space-delimited strings) is less than some limit (call it $maxWords). Since the sentences are of variable length, obviously I don't know what x is.<br>

</p>

<p>Here's where I got to so far:</p>

<p>&lt;xsl:template match="t:p"&gt;<br>

        &lt;xsl:variable name="pString"&gt;<br>

            &lt;xsl:value-of select="."/&gt;<br>

        &lt;/xsl:variable&gt;<br>

        &lt;xsl:for-each select="tokenize($pString, '\.\s')"&gt;<br>

            &lt;xsl:variable name="seq"&gt;<br>

                &lt;xsl:value-of select="string(position())"/&gt;<br>

            &lt;/xsl:variable&gt;<br>

            &lt;xsl:variable name="wordsSoFar"&gt;<br>

                &lt;xsl:value-of select="string-length(translate(normalize-space<br>

                (preceding-sibling::text()), ' ', '')) + 1"/&gt;<br>

            &lt;/xsl:variable&gt;<br>

          &lt;xsl:if test="$wordsSoFar &amp;lt; $maxWords"&gt;</p>

<p>            &lt;s n="{$seq}"&gt;<br>

                &lt;xsl:value-of select="."/&gt;<br>

            &lt;/s&gt;</p>

<p>          &lt;/xsl:if&gt;</p>

<p>       &lt;/xsl:for-each&gt;<br>

    &lt;/xsl:template&gt;<br>

</p>

<p>But this is not valid because preceding-sibling:: wants a node() not a string (even though "text()" *is* a node imho).

<br>

</p>

<p>Am I going about this entirely the wrong way?</p>
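
<p>For illustration only, here is one way the attempt above might be reworked. Inside xsl:for-each over the result of tokenize(), the context item is an atomic string rather than a node, so no axis (including preceding-sibling::) can be applied to it. The sketch below instead keeps the tokenised sequence in a variable and looks back by position. It assumes XSLT 2.0, assumes the t: prefix is bound to the TEI namespace, and uses an arbitrary default of 50 for $maxWords; none of these is stated in the message.</p>

<pre>
&lt;xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs t"&gt;

  &lt;!-- Assumption: word limit supplied as an integer parameter --&gt;
  &lt;xsl:param name="maxWords" as="xs:integer" select="50"/&gt;

  &lt;xsl:template match="t:p"&gt;
    &lt;!-- Tokenise once; holding the sequence in a variable keeps
         earlier sentences reachable from inside the loop --&gt;
    &lt;xsl:variable name="sentences" as="xs:string*"
        select="tokenize(normalize-space(.), '\.\s')"/&gt;
    &lt;xsl:for-each select="$sentences"&gt;
      &lt;xsl:variable name="pos" as="xs:integer" select="position()"/&gt;
      &lt;!-- Words in sentences 1 to $pos, counted on the joined string --&gt;
      &lt;xsl:variable name="wordsSoFar" as="xs:integer"
          select="count(tokenize(normalize-space(
                    string-join(subsequence($sentences, 1, $pos), ' ')), ' '))"/&gt;
      &lt;xsl:if test="$wordsSoFar lt $maxWords"&gt;
        &lt;s n="{$pos}"&gt;&lt;xsl:value-of select="."/&gt;&lt;/s&gt;
      &lt;/xsl:if&gt;
    &lt;/xsl:for-each&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;
</pre>

<p>This stays within a single template, but it recounts the earlier sentences on every iteration, so the cost grows roughly quadratically with the number of sentences in a paragraph. Here $wordsSoFar includes the current sentence, which may or may not be the intended reading of the limit.</p>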


</div>

_______________________________________________<br>

oXygen-user mailing list<br>

<a href="mailto:oXygen-user@oxygenxml.com" target="_blank">oXygen-user@oxygenxml.com</a><br>

<a href="https://www.oxygenxml.com/mailman/listinfo/oxygen-user" rel="noreferrer" target="_blank">https://www.oxygenxml.com/mailman/listinfo/oxygen-user</a><br>

</blockquote>

</div>

</div>

</body>

</html>