Generating Numeric Character Reference in HTML

Posted: Wed Mar 09, 2016 3:31 am
by sarcanon
Hello.

I am trying to generate an HTML document with XSLT that can contain a numeric character reference in the output. In many cases, the XSLT works fine with 'normal' characters, e.g., the non-breaking space (U+00A0), since all browsers can successfully interpret them when they are output as literal values.

However, my documents contain certain characters that lie outside the normal Basic Multilingual Plane. When XSLT transforms these as literal characters, many browsers have trouble interpreting them, even though the font called out in the CSS does contain a glyph for the applicable codepoint. Consequently, for these special characters, I need to generate a numeric character reference in order to ensure satisfactorily rendered HTML.

My current method for outputting the character is using <xsl:text>, e.g.,

Code:

<xsl:text>&#x10198;</xsl:text>

XSLT, however, resolves this character reference to its literal value, which, as I said, is not a reliable way to render the desired character in HTML.

How do I guarantee that the numeric character reference is output instead of the literal value?

Thank you.

I am using OxygenXML 17.1 on Windows 7 (64-bit). My stylesheets are XSLT v2.0.

Re: Generating Numeric Character Reference in HTML

Posted: Wed Mar 09, 2016 1:45 pm
by adrian
Hi,

A simple way to force all non-ASCII characters to be serialized as numeric character references in the output is to set xsl:output/@encoding="us-ascii" in the XSL. To escape only the characters that fall outside Latin-1, you can use "ISO-8859-1" instead, e.g.

Code:

<xsl:output method="html" encoding="ISO-8859-1"/>
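For comparison, the US-ASCII variant mentioned above might look like the sketch below. It is an illustration rather than a tested configuration: with this encoding the serializer has to escape every character above U+007F, because none of them can be represented in the output encoding. Whether it writes a decimal reference, a hexadecimal reference, or a named entity such as &nbsp; is up to the processor.

```xml
<!-- Sketch: every non-ASCII character is written as a character reference -->
<xsl:output method="html" encoding="US-ASCII"/>
```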

If you want to control this on a per-character basis, you can use xsl:text/@disable-output-escaping="yes", but only on the ampersand character, e.g.

Code:

<xsl:text disable-output-escaping="yes">&amp;</xsl:text>#x10198;

If there's a limited set of characters, you could use xsl:character-map to map each character to the representation you want serialized.

BTW, the encoding problem in web browsers may be that your HTML document doesn't specify explicitly the encoding it is using (e.g. UTF-8 or ISO-8859-1). Sometimes that's all that's needed so that the web browser understands the encoding of the document. e.g.

Code:

<head>
<meta charset="UTF-8">
</head>

Regards,
Adrian

Re: Generating Numeric Character Reference in HTML

Posted: Wed Mar 09, 2016 2:13 pm
by radu_pisoi
Hi,

Thank you for contacting us.

I understand that you want to emit the numeric character reference '&#x10198;' in the HTML output. This can be done in two ways:

1. By setting the disable-output-escaping="yes" attribute on the xsl:text instruction:

Code:

<xsl:text disable-output-escaping="yes">&amp;#x10198;</xsl:text>

2. By declaring a character map in the XSLT and referencing it in xsl:output:

Code:

<xsl:character-map name="cm">
<xsl:output-character character="&#x10198;" string="&amp;#x10198;"/>
</xsl:character-map>
<xsl:output use-character-maps="cm"/>
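
Put together, option 2 could be sketched as a complete XSLT 2.0 stylesheet along these lines. The map name cm comes from the snippet above; the template, the HTML skeleton, and the UTF-8 encoding are assumptions added purely for illustration. The string attribute is written with &amp; so that the XML parser leaves the reference unexpanded and the serializer writes it out verbatim.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Apply the character map named "cm" when serializing the result -->
  <xsl:output method="html" encoding="UTF-8" use-character-maps="cm"/>

  <!-- Map the literal character U+10198 to the text "&#x10198;" -->
  <xsl:character-map name="cm">
    <xsl:output-character character="&#x10198;" string="&amp;#x10198;"/>
  </xsl:character-map>

  <!-- Hypothetical template: emits a small page containing the character -->
  <xsl:template match="/">
    <html>
      <head>
        <title>Character map demo</title>
      </head>
      <body>
        <p><xsl:text>&#x10198;</xsl:text></p>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
```

A character map only takes effect when the XSLT processor serializes the result itself. If the result tree is handed to another component (a DOM, for instance), the map is not applied.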