
Re: [xsl] Turning escaped mixed content back to XML


Subject: Re: [xsl] Turning escaped mixed content back to XML
From: Martin Holmes <mholmes@xxxxxxx>
Date: Fri, 28 Mar 2014 12:02:11 -0700

On 14-03-28 11:32 AM, Graydon wrote:
On Fri, Mar 28, 2014 at 11:12:37AM -0700, Martin Holmes scripsit:
[getting escaped text back into parsed content]
     <xsl:template match="text:p" exclude-result-prefixes="#all">
         <xsl:variable name="unparsed">
             <xsl:copy-of select="*|text()"/>
         </xsl:variable>

$unparsed is going to be item()* instead of a string if it's formed like that, and I don't think saxon:parse will work on item()* as input; it wants a single string.

That's why I'm trying to use saxon:serialize to feed into saxon:parse.


But even if I feed the string-joined text nodes directly into saxon:parse(), it fails; I get a "Content not allowed in prolog" error, presumably because there's no containing root element in the unparsed string. If I try to add that:

<xsl:template match="text:p" exclude-result-prefixes="#all">
    <xsl:variable name="unparsed" select="concat('&lt;p&gt;', string-join(//text(), ''), '&lt;/p&gt;')"/>
    <xsl:variable name="parsed" select="saxon:parse($unparsed)"/>
    <xsl:copy-of select="$parsed" exclude-result-prefixes="#all"/>
</xsl:template>

I get "The entity name must immediately follow the '&' in the entity reference," which is a bit puzzling...
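For what it's worth, both errors can be reproduced outside Saxon. The following is only a sketch in Python's standard xml.etree.ElementTree, not the Saxon API; the sample strings and the try_parse helper are made up for illustration, and the parser's own error wording differs from the messages quoted above. It suggests the second error would appear whenever the joined string contains a bare '&', for example if the source held '&amp;' rather than '&amp;amp;', or if //text() (which selects every text node in the document, not only those under the current text:p) pulls in text that was never escaped markup.

```python
import xml.etree.ElementTree as ET


def try_parse(text):
    """Return the parsed root element, or None if text is not well-formed XML."""
    try:
        return ET.fromstring(text)
    except ET.ParseError:
        return None


# String value of the escaped content: markup, but no single root element.
fragment = "Some <b>bold</b> text"

# Wrapping it in a container element makes it a well-formed document.
wrapped = "<p>" + fragment + "</p>"

# A bare ampersand makes even the wrapped string ill-formed.
with_bare_amp = "<p>Smith & Jones</p>"
with_escaped_amp = "<p>Smith &amp; Jones</p>"

no_root = try_parse(fragment)            # rejected: text before any root element
ok = try_parse(wrapped)                  # accepted
bad_amp = try_parse(with_bare_amp)       # rejected: '&' not followed by an entity name
good_amp = try_parse(with_escaped_amp)   # accepted
```

If this is indeed the cause, restricting the selection to the current element's own text nodes would be the first thing to check.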

If it's purely escaped markup text, normalize-space(.)
ought to work; if it's mixed content, you probably have to do the
string-join dance with saxon:serialize to get a single string to feed
saxon:parse, or figure out some way to return only the escaped text if
that's all you're interested in.

Might actually be a use case for //text()[normalize-space()] :)
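A rough sketch of that "string-join dance" for mixed content, written in Python's standard xml.etree.ElementTree rather than XSLT, so ET.tostring only stands in for saxon:serialize and ET.fromstring for saxon:parse. The function name reparse_mixed and the sample input are hypothetical. Real child elements are serialized, escaped text is taken as its raw string value, and the joined result is wrapped in a root element before parsing.

```python
import copy
import xml.etree.ElementTree as ET


def reparse_mixed(elem, wrapper="p"):
    """Rebuild elem's content as one string and parse it again.

    Text nodes contribute their raw string value (so escaped markup becomes
    real markup); real child elements are serialized back to markup.
    """
    parts = [elem.text or ""]
    for child in elem:
        # ET.tostring would also emit the tail text (re-escaped), so
        # serialize a copy without the tail and add the raw tail ourselves.
        clone = copy.deepcopy(child)
        clone.tail = None
        parts.append(ET.tostring(clone, encoding="unicode"))
        parts.append(child.tail or "")
    joined = "".join(parts)
    # A single root element is needed for the string to parse as a document.
    return ET.fromstring("<%s>%s</%s>" % (wrapper, joined, wrapper))


# Mixed content: one real element (<i>) and one escaped element (<b>).
source = "<p>See <i>real</i> and &lt;b&gt;escaped&lt;/b&gt; markup</p>"
result = reparse_mixed(ET.fromstring(source))
child_tags = [child.tag for child in result]
```

This sketch assumes the escaped text contains no bare ampersands or unbalanced tags; if it does, the final parse fails just as saxon:parse would.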

-- Graydon

