
getting xsl to produce ill-formed xml?

Subject: getting xsl to produce ill-formed xml?
From: "Oren Ben-Kiki" <oren@xxxxxxxxxxxxx>
Date: Sun, 21 Mar 1999 14:35:16 +0200

Mark D. Anderson <mda@xxxxxxxxxxxxxx> asked:

>is there any way to get an xsl style sheet to produce
>something that isn't proper xml?

This should be in the FAQ, assuming we had one.

The following makes use of (some would say "abuses") a feature intended for
another purpose.
This feature is mentioned only in an "editorial note" in section 2.2 of the
current version of the standard (16-Dec-98).
Being just an editorial note, it isn't on firm ground. It might be dropped
due to widespread acceptance of XHTML, for example.
I've tested it in James Clark's XT and it seems to work. YMMV.

It would have been better if there were an explicit CDATA "namespace" which
would prevent the stylesheet from using <xsl:element> and <xsl:attribute>
and behave as if all the output of the stylesheet goes into a single CDATA
section. If you also think so, write to your representative in the W3C and
let him know.

Declare your stylesheet as follows:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"
                indent-results="no" result-ns="html">


This requests that the output be in the HTML namespace. HTML isn't XML - it
is an SGML application, which allows for the following loophole: Wrap all
your output within a <SCRIPT> element. HTML declares SCRIPT to be a CDATA
element, which means it can contain almost anything (including '<' and '>'
characters; strictly speaking, only "</" followed by a letter is off limits),
and a conforming XSL processor should be aware of this and not mangle your
output:

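<xsl:template match="/"
    ><xsl:text>#!/bin/perl or whatever
# </xsl:text
    ><SCRIPT><xsl:text>
</xsl:text
        >... the rest of the template ...<xsl:text>
# </xsl:text
    ></SCRIPT></xsl:template>

Which should emit:

#!/bin/perl or whatever
# <SCRIPT>
... the rest of the generated code ...
# </SCRIPT>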
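
The SCRIPT loophole is not peculiar to XSL processors. As a rough
illustration only (this uses Python's standard xml.etree.ElementTree
serializer, not XT, and is not part of the recipe above), the same text comes
out escaped under XML output rules and untouched under HTML output rules when
the element is named SCRIPT. The Perl line is just sample data:

```python
import xml.etree.ElementTree as ET

# A SCRIPT element holding one line of "generated" Perl (sample data).
script = ET.Element("SCRIPT")
script.text = 'print "smaller\\n" if $a < $b;'

# XML output rules: markup characters in text are escaped.
as_xml = ET.tostring(script, encoding="unicode", method="xml")

# HTML output rules: ElementTree writes script/style content verbatim.
as_html = ET.tostring(script, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

Whether a given XSL processor behaves the same way is something you have to
test, as noted above.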

In the rest of the templates, do not use <xsl:element> and <xsl:attribute>;
instead rely on <xsl:text> and <xsl:value-of> for generating output.

Handling whitespace will be painful - make liberal use of <xsl:text> and
stuff all extra whitespace in the stylesheet _within_ <...> pairs (as the
above example demonstrates).
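
To see why the line breaks have to live inside the tags, here is a small
sketch using Python's standard XML parser (an assumption for illustration; an
XSL processor may apply its own whitespace rules on top of this). The element
names t, a and b are made up. It only shows what the parser hands over as text
in each layout:

```python
import xml.etree.ElementTree as ET

# Line breaks *between* elements become text in the parsed tree ...
loose = "<t>\n  <a>x</a>\n  <b>y</b>\n</t>"
# ... but line breaks *inside* the tags themselves are not text at all.
tight = "<t\n  ><a>x</a\n  ><b>y</b\n></t>"

# Collect every piece of character data in document order.
loose_text = "".join(ET.fromstring(loose).itertext())
tight_text = "".join(ET.fromstring(tight).itertext())

print(repr(loose_text))
print(repr(tight_text))
```

The second layout is ugly to read, but it contributes no stray whitespace to
the output.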

In the stylesheet itself, make sure to use &lt; &gt; instead of '<' and '>'
(alternatively you can use <![CDATA[ ... ]]> sections). In the output you
should get the correct '<' and '>' characters.
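
Both spellings reach the processor as the same characters, because the XML
parser resolves them before the stylesheet is ever interpreted. A minimal
check with Python's standard parser (the element name t and the Perl snippet
are made up for the example):

```python
import xml.etree.ElementTree as ET

# The same Perl fragment written with entity references ...
escaped = ET.fromstring("<t>if ($a &lt; $b) { $n-&gt;run; }</t>")
# ... and inside a CDATA section.
cdata = ET.fromstring("<t><![CDATA[if ($a < $b) { $n->run; }]]></t>")

# After parsing, both hold the literal '<' and '>' characters.
print(escaped.text)
print(cdata.text == escaped.text)
```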

You can use <xsl:text> to write data before the <SCRIPT> and after the
</SCRIPT> but you must ensure that it doesn't contain '<' or '>' characters.
As demonstrated above, you would probably have to do so to comment the
SCRIPT tags out, depending on the target language. The above addresses Perl,
but the same idea should work for PostScript, TeX, or whatever.
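
As a sketch of what the finished output should look like for other targets,
the hypothetical helper below (wrap_generated and its prefix table are my own
names, not part of any XSL tool) builds the expected text for a given comment
leader. It also rejects any '<' or '>' in the text that sits outside the
SCRIPT element:

```python
# Comment leaders for a few target languages (an illustrative table).
COMMENT_PREFIX = {"perl": "# ", "postscript": "% ", "tex": "% "}


def wrap_generated(code: str, language: str, first_line: str = "") -> str:
    """Return the text a SCRIPT-wrapped stylesheet would be expected to emit."""
    prefix = COMMENT_PREFIX[language]
    # Text outside the SCRIPT element must not contain markup characters.
    for outside in (first_line, prefix):
        if "<" in outside or ">" in outside:
            raise ValueError("text outside SCRIPT must not contain '<' or '>'")
    head = first_line + "\n" if first_line else ""
    # The comment leader hides each SCRIPT tag from the target language.
    return f"{head}{prefix}<SCRIPT>\n{code}\n{prefix}</SCRIPT>\n"


print(wrap_generated('print "hello\\n";', "perl", "#!/bin/perl"))
```

Comparing the processor's real output against such an expected string is a
quick way to find out whether your processor honours the loophole.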

Good luck,

    Oren Ben-Kiki

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
