
Re: [xsl] WordML Question and normalize-space() question


Subject: Re: [xsl] WordML Question and normalize-space() question
From: Nigel Whitaker <nigel.whitaker@xxxxxxxxxxxx>
Date: Fri, 19 May 2006 16:10:44 +0100

Jordan Soet wrote:
First my normalize-space() question... I'm wondering if it's possible to
normalize all the space without removing the leading and trailing
whitespace?

One technique is to compare the normalized and unnormalized strings and see if the first and last characters differ. This can be used to detect leading and trailing whitespace.

The following is part of our space normalization filter.
It works well (in conjunction with an identity-like transform)
for XHTML in general and for the inline bold-italic case
you mention:


<b> bold text <i> bold and italic </i> just bold again </b>



<xsl:template match="text()">
  <xsl:variable name="normalized-text" select="normalize-space()"/>
  <xsl:variable name="input-text" select="."/>
  <xsl:choose>
    <xsl:when test="ancestor-or-self::*[@xml:space][1][./@xml:space='preserve']">
      <!-- if the nearest ancestor element carrying an xml:space
           attribute has the value 'preserve', output the text
           exactly -->
      <xsl:copy-of select="."/>
    </xsl:when>
    <xsl:when test="string-length($normalized-text) = 0">
      <!-- The PCDATA is all whitespace, just output a single space -->
      <xsl:text> </xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <!-- mixed whitespace and text in PCDATA which is normalized,
           but with leading and trailing (single) whitespace
           preservation -->
      <xsl:if test="substring($normalized-text, 1, 1) !=
                    substring($input-text, 1, 1)">
        <!-- If the first character of the non-normalized text is
             different from the first character of the normalized
             version, then the non-normalized one must have started
             with whitespace -->
        <xsl:text> </xsl:text>
      </xsl:if>
      <xsl:value-of select="$normalized-text"/>
      <xsl:if test="substring($normalized-text,
                              string-length($normalized-text), 1) !=
                    substring($input-text,
                              string-length($input-text), 1)">
        <!-- Likewise, if the last characters differ, the
             non-normalized text must have ended with whitespace -->
        <xsl:text> </xsl:text>
      </xsl:if>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
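
For context, here is a minimal sketch of an identity-like stylesheet that the text() template above could be dropped into. This wrapper is an assumption for illustration, not the actual filter. It leaves text() out of the identity pattern so that the two templates do not end up matching text nodes with the same default priority.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy attributes, elements, comments and processing instructions
       unchanged. Text nodes are deliberately not matched here; they are
       still reached through the apply-templates below. -->
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The match="text()" template shown above goes here. -->

</xsl:stylesheet>
```

With the text() template pasted in, an input such as
<b>  bold   text <i>bold and    italic </i> just bold again</b>
should come out as
<b> bold text <i>bold and italic </i> just bold again</b>
that is, inner runs collapse to one space and a single leading or trailing space survives only where the original text node had one.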


There are more efficient ways of doing this in Java with an
org.xml.sax.XMLFilter and probably with XSLT 2.0.
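
As a rough idea of what an XSLT 2.0 version might look like (a sketch only, untested against the real filter, and with no claim about how its speed compares), the first/last character comparison can be replaced by the regular-expression function matches():

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity-like copy for everything except text nodes. -->
  <xsl:template match="@*|*|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Normalize text unless the nearest ancestor with an xml:space
       attribute says 'preserve'. Text nodes excluded by the predicate
       fall through to the built-in rule, which copies them as-is. -->
  <xsl:template
      match="text()[not(ancestor::*[@xml:space][1]/@xml:space = 'preserve')]">
    <xsl:value-of select="
      if (normalize-space() = '') then ' '
      else concat(if (matches(., '^\s')) then ' ' else '',
                  normalize-space(),
                  if (matches(., '\s$')) then ' ' else '')"/>
  </xsl:template>

</xsl:stylesheet>
```

An all-whitespace text node still becomes a single space, and other text nodes keep at most one leading and one trailing space, as in the XSLT 1.0 template.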

Hope this helps,

Nigel
--
Nigel Whitaker,  DeltaXML: "Change control for XML, in XML"
nigel.whitaker@xxxxxxxxxxxx    http://www.deltaxml.com

