RE: [xsl] invalid xpath?


Subject: RE: [xsl] invalid xpath?
From: "Trevor Nicholls" <trevor@xxxxxxxxxxxxxxxxxx>
Date: Thu, 3 Jul 2008 01:19:45 +1200

Hi Abel

I don't want to post the whole stylesheet here because it is rather long and
complex, and my only problem is with this particular template which is
making a small adjustment to text() node children of elements which have a
preformatted attribute - or to the output of another template
(pre)processing such a text node. So we are only dealing with text.

The input XML may be coming from a variety of sources which pad out the
input with whitespace in different ways. The template I posted is part of an
included stylesheet which provides templates to try and normalise some of
this input. The text nodes are identified as "solitary", "initial", "final",
and "central", depending on the presence or absence of sibling elements to
one or both sides, and then whitespace may be handled differently. I have
WSfromL, WSfromR, KeepWS, and various other templates which are pipelined
together as appropriate.

Moving to the particular issue I have, we're looking at code samples where
spaces are significant, and line breaks are inserted into the text with
<nl/> elements. The following three XML fragments need to produce identical
output:

<code>
   abc<nl/>
   def<nl/>
</code>

<code>   abc<nl/>   def<nl/></code>

<code>   abc<nl/>
   def<nl/></code>


So the original piece of XSL I gave you is dropping an initial newline
a) from a preformatted text node which has a <nl> element as its preceding
sibling (lines 2 and 3 of the first example, line 1 of the third example);
b) from a preformatted text node which has no preceding sibling and which
commences with a newline (line 1 of the first example).

I do not want to translate *all* newlines to nothing, because newlines which
occur in the input because of line wrapping (and this happens) need to be
preserved as a real whitespace character (a later template in the chain
translates them to spaces).
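
To make the rule concrete, here is a minimal XSLT 1.0 sketch of the two
cases. It is for illustration only and is not the template from my
stylesheet: the name DropLeadingNL is made up, and it assumes the context
node is the text node while $Arg holds its (possibly preprocessed) string
value.

    <xsl:template name="DropLeadingNL">
      <xsl:param name="Arg" select="string(.)"/>
      <xsl:choose>
        <!-- (a) the nearest preceding sibling element is <nl/>, or
             (b) there is no preceding sibling element at all -->
        <xsl:when test="starts-with($Arg, '&#x0a;') and
                        (preceding-sibling::*[1][self::nl] or
                         not(preceding-sibling::*))">
          <!-- drop only the first character, i.e. the leading newline -->
          <xsl:value-of select="substring($Arg, 2)"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- any other newline is left for the later templates -->
          <xsl:value-of select="$Arg"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>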

A simple question is getting more and more complex! I still don't understand
why the original stylesheet, with its (..)[..] test, drops the newlines as I
want when run thru Saxon, while the modified stylesheet, with its
(..) and (..) test, turns them into space characters, which I don't want,
when run thru xsltproc/XMLSpy.

Cheers
Trevor


-----Original Message-----
From: Abel Braaksma [mailto:abel.online@xxxxxxxxx] 
Sent: Thursday, 3 July 2008 12:54 a.m.
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] invalid xpath?

Trevor Nicholls wrote:
> Thank you Abel
>
>   
>> [...snip...]
>>
>>    <xsl:when test="not(preceding-sibling::*)[starts-with($Arg,'&#x0a;')]">
>>     <xsl:call-template name="WS">
>>
>> [...snip...]
>>     
>
> OK, the foregoing is invalid 1.0. So I tried modifying it to this:
>
>  <xsl:when test="not(preceding-sibling::*) and starts-with($Arg,'&#x0a;')">
>   <xsl:call-template name="WS">
>
> Now there are no reported errors, but the test appears not to be working (at
> least, there is an extra leading space in the output document wherever this
> template has been called, compared with what Saxon was producing with the
> original test).

In all honesty, I haven't delved into your stylesheet logic. What you 
are testing above is whether the current node has no preceding sibling 
element and whether $Arg starts with a newline character.

You don't show how the original template is called. You select the 
current node into $Arg (which could contain any number of children) and 
then you use string functions on that node, which essentially normalizes 
that node into a string, giving you no way whatsoever to extract any 
elements from it (they will all be stringized).
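
A small illustration of that stringizing, using a made-up fragment rather
than anything from your stylesheet: suppose the context node is the
element <code>abc<nl/>def</code> and it is selected into $Arg.

    <xsl:variable name="Arg" select="."/>
    <!-- The string value of an element is the concatenation of its
         descendant text nodes, so string($Arg) is 'abcdef'; the
         <nl/> element contributes nothing to it. -->
    <xsl:value-of select="string($Arg)"/>
    <!-- starts-with() only ever sees that string, never the <nl/> -->
    <xsl:value-of select="starts-with($Arg, 'abc')"/>

If $Arg only ever holds a single text node, this does not matter.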

Is that what you want? Is that expected behavior?

If you want to remove the newlines you could make it easier on yourself 
by using:

    translate($Arg, '&#xA;', '')
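
As a sketch, assuming $Arg is the parameter of the named template, that
would sit in the template body like this:

    <!-- removes every newline in $Arg, not only a leading one -->
    <xsl:value-of select="translate($Arg, '&#xA;', '')"/>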


You also seem to have special cases. Why not use the template techniques 
for those cases? Let XSLT decide for you:

<xsl:template match="text()[following-sibling::nl]">
    ....

<xsl:template match="text()">
   ....

Then your KeepWS and WS named templates will become easier to program.
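
Filled in, that could look roughly like the sketch below. I am guessing at
the details: it assumes WS and KeepWS each take a parameter named Arg, and
which named template belongs to which case is only a placeholder.

    <!-- the predicate gives this pattern a higher default priority
         (0.5) than plain text() (-0.5), so it wins when both match -->
    <xsl:template match="text()[following-sibling::nl]">
      <xsl:call-template name="WS">
        <xsl:with-param name="Arg" select="."/>
      </xsl:call-template>
    </xsl:template>

    <!-- fallback for every other text node -->
    <xsl:template match="text()">
      <xsl:call-template name="KeepWS">
        <xsl:with-param name="Arg" select="."/>
      </xsl:call-template>
    </xsl:template>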

HTH,
Cheers,
-- Abel --

