[xsl] 16-bit chars rendered as "?" in UTF-8?


Subject: [xsl] 16-bit chars rendered as "?" in UTF-8?
From: John English <john.foreign@xxxxxxxxx>
Date: Mon, 13 Aug 2012 13:59:48 +0300

I am filtering documents through an XSLT stylesheet to generate HTML.
The documents can contain 16-bit characters (Hebrew in this case).
The stylesheet begins like this:

  <?xml version='1.0'?>
  <xsl:stylesheet version='1.0'
       xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
  <xsl:output method="html" encoding="UTF-8" indent="no"/>
  <xsl:output doctype-public="-//W3C//DTD HTML 4.01//EN"/>

I then have templates for two different root elements, shown below,
which are very similar in outline. Both have the text content wrapped
in <span>...</span>, with a rule for <span> like this:

  <xsl:template match='span'>
    <xsl:copy-of select="."/>
  </xsl:template>

Both give me back an HTML page which begins like this:

  <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
  <html><head>
  <META http-equiv="Content-Type" content="text/html; charset=UTF-8">

However, although in "type1" documents Hebrew is correctly rendered
so that &#1488; appears as א, in "type2" documents all Hebrew content
is rendered as "?". Can anyone tell me why this might happen, bearing
in mind that the two document types go through the same stylesheet,
both have Hebrew content enclosed in the same element below the root
and both come out with charset=UTF-8?
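
For reference, here is a minimal Python sketch of the symptom. It is
purely illustrative and not part of the stylesheet or the pipeline
described here. It shows what &#1488; (U+05D0, HEBREW LETTER ALEF) looks
like as UTF-8 bytes, and how a literal "?" can appear if some stage
encodes the text into a charset with no Hebrew letters. The helper name
encode_lossy and the choice of ISO-8859-1 as the "wrong" charset are
assumptions made for the example. Whether anything like this happens to
the "type2" output is not established.

```python
# Illustrative check only; not part of the stylesheet or its pipeline.
ALEF = chr(1488)  # the character that &#1488; refers to (U+05D0)


def encode_lossy(text, charset):
    """Encode text, substituting '?' for anything the charset cannot represent."""
    return text.encode(charset, errors="replace")


# UTF-8 can represent the character: it becomes two bytes.
utf8_bytes = encode_lossy(ALEF, "utf-8")

# ISO-8859-1 (an assumed example of a non-Hebrew charset) cannot,
# so the encoder substitutes a single '?' byte.
latin1_bytes = encode_lossy(ALEF, "iso-8859-1")

print(utf8_bytes.hex())  # d790
print(latin1_bytes)      # b'?'
```

If the bytes D7 90 are present in the generated page, the character
survived serialization as UTF-8. If a 3F byte ("?") is there instead,
the character was already replaced before or during output, regardless
of what the META charset declaration says.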

Here are the templates for the root elements:

  <xsl:template match='/type1'>
    <html>
      <head>
        <title><xsl:value-of select='title'/></title>
        <xsl:for-each select='//script'>
          <xsl:copy-of select='.'/>
        </xsl:for-each>
      </head>
      <xsl:element name='body'>
        <xsl:attribute name='onLoad'>init()</xsl:attribute>
        <xsl:if test='$text.dir'>
          <xsl:attribute name='dir'>
             <xsl:value-of select='$text.dir'/>
          </xsl:attribute>
        </xsl:if>
        <xsl:apply-templates select='heading'/>
        <xsl:apply-templates select='help'/>
        <xsl:apply-templates select='tabs'/>
        <h1><xsl:value-of select='title'/></h1>
        <xsl:apply-templates select='content'/>
        <xsl:call-template name='footer'/>
        <xsl:apply-templates select='//message'/>
      </xsl:element>
    </html>
  </xsl:template>

  <xsl:template match='/type2'>
    <html>
      <head>
        <title><xsl:value-of select='@title'/></title>
        <xsl:for-each select='//script'>
          <xsl:copy-of select='.'/>
        </xsl:for-each>
      </head>
      <xsl:element name='body'>
        <xsl:if test='$text.dir'>
          <xsl:attribute name='dir'>
            <xsl:value-of select='$text.dir'/>
          </xsl:attribute>
        </xsl:if>
        <xsl:apply-templates select='*[not(self::script)]'/>
      </xsl:element>
    </html>
  </xsl:template>
--
John English

