
RE: [xsl] generating numerical character entities in html output

Subject: RE: [xsl] generating numerical character entities in html output
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 18 Jan 2005 10:32:07 -0000

> That is if I have, say,
> <nospam>me@xxxxxxxxxx</nospam>
> in the xml source, I want my xslt code to produce something like
> &#109;&#101;&#64;&#109;&#121;&#115;&#101;&#108;&#102;&#46;&#101;&#120;&#116;
> in the generated html source.

This is a serialization issue and not a transformation issue. That's why
it's difficult to handle in XSLT.

XSLT 2.0 allows you to control this aspect of serialization with character
maps, but it's not straightforward, because a character map applies to every
occurrence of the mapped character in the output, not just to the text you
want to obfuscate.
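
A minimal sketch of the character-map approach, assuming an XSLT 2.0
processor. The map name "obfuscate" is chosen here purely for illustration.
It shows the limitation mentioned above: every @ in the serialized output is
written as &#64;, not only the ones inside <nospam>.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Attach the character map to the serializer -->
  <xsl:output method="html" use-character-maps="obfuscate"/>

  <!-- "obfuscate" is an illustrative name. The string attribute holds
       the five characters & # 6 4 ; which are written out as they are -->
  <xsl:character-map name="obfuscate">
    <xsl:output-character character="@" string="&amp;#64;"/>
  </xsl:character-map>

  <!-- Copy the address as ordinary text; the map acts at serialization -->
  <xsl:template match="nospam">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Mapping every character of an address this way would need one
xsl:output-character entry per character, and each entry would again affect
the whole output.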

> I'm particularly puzzled by the fact that
> <xsl:text disable-output-escaping="yes">&amp;</xsl:text>
> correctly generates "&amp;" in the html code but
> <xsl:text disable-output-escaping="yes">&#64;</xsl:text>
> always generates "@", independently of the value of 
> disable-output-escaping. 

If d-o-e is supported by your processor, I would expect the first case to
produce "&amp;" in the normal case, and "&" if you suppress the escaping. In
the second case, an @ sign never needs to be escaped, so disabling escaping
has no effect. (Remember that writing &#64; in the input is exactly the same
as writing @: the XSLT processor sees no difference between the two.)

You don't want to disable escaping of characters that would normally be
escaped, you want to enable escaping of characters that wouldn't normally be
escaped.

If you want to write character references using d-o-e, then you can do it
like this:

<xsl:text disable-output-escaping="yes">&amp;#64;</xsl:text>

provided the processor supports d-o-e. Here you are outputting five
characters & # 6 4 ;. The first character, &, would normally be escaped as
&amp;, and d-o-e suppresses this.
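
The same trick can be extended to a whole string. The following is a sketch
only, assuming an XSLT 2.0 processor (string-to-codepoints() is not available
in XSLT 1.0) that still honours d-o-e, which is deprecated in 2.0. It writes
each character of <nospam> as a decimal character reference, so an @ in the
source comes out as &#64;.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Write each character of the element's string value as &#N; -->
  <xsl:template match="nospam">
    <!-- string-to-codepoints() yields one integer per character -->
    <xsl:for-each select="string-to-codepoints(.)">
      <!-- concat() builds "&#", the code point, and ";".
           d-o-e stops the leading & from being escaped as &amp; -->
      <xsl:value-of select="concat('&amp;#', ., ';')"
                    disable-output-escaping="yes"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If the processor ignores d-o-e, the output will contain &amp;#64; and the
browser will display the literal text "&#64;" instead of "@".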

However: are you sure what you are doing makes sense? No one reading an HTML
document is supposed to make any distinction between @ and &#64; and I would
have thought this included spammers.

Michael Kay
