Generating Numeric Character Reference in HTML

Wed Mar 09, 2016 3:31 am


I am trying to generate an HTML document with XSLT that can contain a numeric character reference in the output. In many cases, the XSLT works with 'normal' characters, e.g., the non-breaking space character (Š), since all browsers can successfully interpret them when they are output as literal values.

However, my documents contain certain characters that lie outside the Basic Multilingual Plane. When XSLT outputs these as literal characters, many browsers have trouble rendering them, even though the font called out in the CSS does contain a glyph for the applicable code point. Consequently, for these special characters, I need to generate a numeric character reference to ensure that the HTML renders satisfactorily.

My current method for outputting the character is to place a character reference inside <xsl:text>, e.g.,

<xsl:text>&#x10198;</xsl:text>

XSLT, however, resolves this character reference to its literal value, which, as I said, is not a reliable means of rendering the desired character in HTML.

How do I guarantee that the numeric character reference is output instead of the literal value?

Thank you.

I am using OxygenXML 17.1 on Windows 7 (64-bit). My stylesheets are XSLT v2.0.

Re: Generating Numeric Character Reference in HTML

Wed Mar 09, 2016 1:45 pm


A simple way to force all non-ASCII characters to be serialized as numeric character references in the output is to set encoding="us-ascii" on xsl:output in the stylesheet. To escape only the characters that cannot be represented in Latin-1 (code points above U+00FF), you can use "ISO-8859-1" instead, e.g.


<xsl:output method="html" encoding="ISO-8859-1"/>
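
For context, a minimal sketch of a complete stylesheet using this approach might look like the one below. The template body and the title text are placeholders made up for illustration, and whether the reference comes out in hexadecimal or decimal form depends on the serializer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- With a us-ascii output encoding, every character the encoding
       cannot represent has to be written as a character reference. -->
  <xsl:output method="html" encoding="us-ascii"/>

  <!-- Placeholder template: emits one non-BMP character (U+10198). -->
  <xsl:template match="/">
    <html>
      <head><title>NCR test</title></head>
      <body>
        <p>Symbol: <xsl:text>&#x10198;</xsl:text></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that this affects every non-ASCII character in the document, not just the ones outside the Basic Multilingual Plane.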

If you want to control this on a per-character basis, you can set disable-output-escaping="yes" on xsl:text, but apply it only to the ampersand character; the rest of the reference is then written as ordinary text, e.g.


<xsl:text disable-output-escaping="yes">&amp;</xsl:text>#x10198;
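
If the affected characters can appear anywhere in the text, the same trick could probably be applied automatically. The following is only a sketch, assuming an XSLT 2.0 processor that supports disable-output-escaping and that the result is serialized directly by the processor; the identity template and the regex-based text template are my own illustration, not something taken from your stylesheet. It writes decimal references, so U+10198 would come out as &amp;#65944; in the stylesheet's terms.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8"/>

  <!-- Identity template: copy everything through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes: rewrite each character outside the BMP as a decimal
       numeric character reference. The explicit priority avoids a
       conflict with the identity template. -->
  <xsl:template match="text()" priority="1">
    <xsl:analyze-string select="." regex="[&#x10000;-&#x10FFFF;]">
      <xsl:matching-substring>
        <!-- Each match is a single character, so string-to-codepoints
             returns exactly one integer. -->
        <xsl:value-of disable-output-escaping="yes"
            select="concat('&amp;#', string-to-codepoints(.), ';')"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Note that this only covers text nodes; attribute values would need different handling, since disable-output-escaping does not apply to them.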

If there's a limited set of characters, you could use xsl:character-map to map the character to the representation you want serialized.

BTW, the encoding problem in web browsers may be that your HTML document doesn't specify explicitly the encoding it is using (e.g. UTF-8 or ISO-8859-1). Sometimes that's all that's needed so that the web browser understands the encoding of the document. e.g.


<meta charset="UTF-8">
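
When the HTML is produced by XSLT, the serializer can usually add this declaration for you. As a rough sketch, assuming method="html" and a head element in the result (include-content-type="yes" is the default in XSLT 2.0 and is spelled out here only for clarity; the template content is a placeholder):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" encoding="UTF-8" include-content-type="yes"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- The serializer is expected to insert a meta element here
             that declares the UTF-8 encoding. -->
        <title>Encoding test</title>
      </head>
      <body>
        <p>Symbol: <xsl:text>&#x10198;</xsl:text></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

It is worth checking the generated file to confirm that the declared encoding matches the encoding the file is actually saved in.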

Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger

Re: Generating Numeric Character Reference in HTML

Wed Mar 09, 2016 2:13 pm


Thank you for contacting us.

I understand that you want to emit the numeric character reference '&#x10198;' in the HTML output. This can be done in two ways:

1. By setting the disable-output-escaping="yes" attribute for the xsl:text instruction


<xsl:text disable-output-escaping="yes">&amp;#x10198;</xsl:text>

2. By declaring a character map in the XSLT and referencing it in xsl:output:


<xsl:character-map name="cm">
    <xsl:output-character character="&#x10198;" string="&amp;#x10198;"/>
</xsl:character-map>
<xsl:output use-character-maps="cm"/>
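
Put together in a full stylesheet, the character map approach might look roughly like the sketch below. The identity template and the second mapped character (U+1D11E) are hypothetical additions for illustration only; you would list the characters your documents actually use.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Each listed character is replaced by the given string at
       serialization time, and that string is written without escaping. -->
  <xsl:character-map name="cm">
    <xsl:output-character character="&#x10198;" string="&amp;#x10198;"/>
    <!-- Hypothetical second entry, shown only to illustrate the pattern. -->
    <xsl:output-character character="&#x1D11E;" string="&amp;#x1D11E;"/>
  </xsl:character-map>

  <xsl:output method="html" encoding="UTF-8" use-character-maps="cm"/>

  <!-- Identity template: copy the input through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Unlike disable-output-escaping, a character map is applied by the serializer to both text nodes and attribute values, so no changes to individual templates should be needed.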
Radu Pisoi
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
