
Re: [xsl] character entities


Subject: Re: [xsl] character entities
From: Tony Graham <Tony.Graham@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 03 Nov 2008 09:50:11 +0000

On Mon, Nov 03 2008 07:32:17 +0000, jbar@xxxxxxxx wrote:
...
> Latest effort: I tried using encoding="utf-8" for all levels: my
> original xml, my xsl output, and the input to ZSL's index, & I also
> saved my xml file as utf-8 format, and used the Spanish n inside my
> xml, i.e. ñ rather than &#241;. Doing that, the Spanish n was
> preserved through the xsl output, but ZSL stores it as: Ã±, & that's
> also how my browser displays it.

That is UTF-8, but your browser thinks it's ISO-8859-1.

The generalisation is that if a character from the Latin-1 Supplement
block comes out as two characters where the first character is an
accented "A", then you are probably reading UTF-8 as ISO-8859-1.
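
That rule of thumb can be checked with a short Python 3 sketch. It is
only an illustration, not part of the original setup: it encodes every
code point in the Latin-1 Supplement block (U+0080 to U+00FF) as UTF-8,
misreads the bytes as ISO-8859-1, and collects the first character of
each result. The variable names are made up for the example.

```python
# Encode each Latin-1 Supplement character as UTF-8, then misread the
# bytes as ISO-8859-1 and record the first character that appears.
first_chars = set()
for code_point in range(0x80, 0x100):
    utf8_bytes = chr(code_point).encode("utf-8")
    # Every character in this block takes exactly two bytes in UTF-8.
    assert len(utf8_bytes) == 2
    misread = utf8_bytes.decode("iso-8859-1")
    first_chars.add(misread[0])

print(sorted(first_chars))  # ['Â', 'Ã']
```

The lead byte is always C2 or C3, which ISO-8859-1 shows as "Â" or "Ã".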

If you go to Richard Ishida's excellent Unicode Code Converter [1] and
enter 241 in the "Decimal code points" box, you'll see that it's "C3 B1"
in UTF-8.

If you then go to Richard Ishida's excellent UniView [2], you can suss
out that "C3 B1" as two ISO-8859-1 characters would be "Ã±".
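
If you'd rather confirm this without the web tools, a minimal Python 3
sketch along these lines should reproduce both steps. It is a sketch
only; the variable names are invented for the example.

```python
# Code point 241 (U+00F1) is the Spanish n with tilde.
n_tilde = chr(241)

# Step 1: its UTF-8 encoding is the two bytes C3 B1.
utf8_bytes = n_tilde.encode("utf-8")
hex_form = " ".join(f"{byte:02X}" for byte in utf8_bytes)
print(hex_form)  # C3 B1

# Step 2: reading those same two bytes as ISO-8859-1 yields two characters.
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # Ã±

# Reversing the misreading recovers the original character, which
# supports the diagnosis that the stored data is valid UTF-8.
assert misread.encode("iso-8859-1").decode("utf-8") == n_tilde
```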

Regards,


Tony Graham                         Tony.Graham@xxxxxxxxxxxxxxxxxxxxxx
Director                                  W3C XSL FO SG Invited Expert
Menteith Consulting Ltd
XML, XSL and XSLT consulting, programming and training
Registered Office: 13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
Registered in Ireland - No. 428599   http://www.menteithconsulting.com
  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
xmlroff XSL Formatter                               http://xmlroff.org
xslide Emacs mode                  http://www.menteith.com/wiki/xslide
Unicode: A Primer                               urn:isbn:0-7645-4625-2


[1] http://rishida.net/scripts/uniview/conversion
[2] http://rishida.net/scripts/uniview/uniview.php?codepoints=F1

