[oXygen-user] using automatic character conversion in XML to XML transformation
George Cristian Bina
Mon Oct 9 02:30:19 CDT 2006
Hi Paul,
When the output method is html then the XSLT processor uses a different
output serializer than the one used when you set the output method to
xml. For XML there are different rules than the ones for HTML.
In XSLT 1.0 you can set the output encoding to ASCII, for instance, or to
some other encoding that cannot represent the characters you want to
output as entities. Those characters will then be output as character
references: the copyright symbol will appear as &_#169; (added one
underscore _ to avoid the conversion of the character reference to the
actual character by some email clients).
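A minimal sketch of the XSLT 1.0 approach, assuming an identity transformation is all you need and that your processor supports the US-ASCII encoding name (the processors I know of do, but that is an assumption about your setup):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- US-ASCII cannot represent U+00A9, so the serializer has to
       write it as a numeric character reference -->
  <xsl:output method="xml" encoding="US-ASCII"/>
  <!-- identity template: copies every node and attribute unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Whether the reference is written in decimal or hexadecimal form depends on the processor. Both forms are equivalent for any XML parser.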
In XSLT 2.0 you can use character maps to output &_copy; (again I added
an _ ). You can find below a sample stylesheet that copies the source to
the output representing the copyright characters as &_copy;
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:character-map name="test">
<xsl:output-character character="©" string="&amp;copy;"/>
</xsl:character-map>
<xsl:output use-character-maps="test"/>
<xsl:template match="node() | @*">
<xsl:copy>
<xsl:apply-templates select="node() | @*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
This stylesheet applied to a document like:
<test>
<a> © </a>
<b> © </b>
<c> © </c>
</test>
will result in
<?xml version="1.0" encoding="UTF-8"?><test>
<a> &copy; </a>
<b> &copy; </b>
<c> &copy; </c>
</test>
But note that the result document is not well-formed, as the copy entity
is used but not declared. To have a well-formed result you need to
create a DTD like the one below
test.dtd
<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY copy "©">
and change the xsl:output to refer to this DTD
<xsl:output doctype-system="test.dtd" use-character-maps="test"/>
The result will now be:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE test
SYSTEM "test.dtd">
<test>
<a> &copy; </a>
<b> &copy; </b>
<c> &copy; </c>
</test>
which is well-formed (but not valid against the DTD). If you want the
output to be valid as well, you need to update the DTD to contain the
element and attribute declarations. For the above example that will be
<?xml version="1.0" encoding="UTF-8"?>
<!ENTITY copy "©">
<!ELEMENT test (a,b,c)>
<!ELEMENT a (#PCDATA)>
<!ELEMENT b (#PCDATA)>
<!ELEMENT c (#PCDATA)>
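For reference, here is a sketch of the complete stylesheet with both pieces combined, assuming test.dtd sits next to the result document and that you run it with an XSLT 2.0 processor (Saxon, for instance). The names test and test.dtd are just the ones used in this example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- character map strings are written to the output as they are,
       without escaping, so &amp;copy; here is serialized as the
       entity reference for copy -->
  <xsl:character-map name="test">
    <xsl:output-character character="&#169;" string="&amp;copy;"/>
  </xsl:character-map>
  <!-- doctype-system adds the DOCTYPE that points to the DTD
       declaring the copy entity -->
  <xsl:output method="xml" doctype-system="test.dtd"
      use-character-maps="test"/>
  <!-- identity template: copies every node and attribute unchanged -->
  <xsl:template match="node() | @*">
    <xsl:copy>
      <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Using the character reference &#169; in the character attribute, instead of the literal symbol, keeps the stylesheet independent of the encoding it is saved in.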
Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Dever, Paul (ELS) wrote:
> If I transform my XML document to HTML using the HTML output method
> oXygen automatically turns special characters into HTML character
> entities (e.g., a copyright symbol gets transformed into "&copy;").
>
> I want to use that functionality in a transformation from XML to XML
> using XML as an output method, but I can't figure out how to do it. Is
> there a way?
>
> Thanks,
> --Paul
> ___________________________
> *Paul Dever* *::* Manager, Electronic Workflows *::* EPD-US
> (T) +1 314-995-3291 *:: *(E)
> <mailto:>
> 11830 Westline Industrial Drive, St. Louis, MO 63146
> *ELSEVIER*