When to Use Character Codes


Post by jbzech »

I'm pretty new to XML and coding generally, but picking it up quickly.

One question I can't seem to figure out is when I should use character codes for characters (such as &#8212; for em-dash) instead of using the character itself in the XML.

For example, if I paste in content from Word to a DocBook page in Author View, my em-dashes and curly quote marks all look fine in both Author & Text view. Should I replace them with the appropriate codes in text view? Or leave them?

I'm working in DocBook documents with encoding="UTF-8" and I'm outputting the XML content as PDF, HTML, and ePub, and I think the output looks the same whether I used character codes or the original characters pasted in from Word. I'm publishing work for consumers, so it is important to have appropriate curly quotes, etc.

Can someone point me to a layman resource or give me a quick rundown on this issue? Right now I'm searching out single and double quotes, em-dash, en-dash, and ellipsis. Are there other characters I should be replacing with code? Am I wasting my time?

Any help is appreciated.

Re: When to Use Character Codes

Post by adrian »

Hello,

A numeric character reference (such as &#8212;) lets you include characters in XML content that the document's encoding cannot represent.
For example, you would need them to use Japanese characters in an XML document with a Latin (ISO-8859-1) encoding.

Since you are using UTF-8, which is a very comprehensive encoding, they will rarely be necessary.
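To illustrate the point, here is a minimal Python sketch (Python and its standard xml.etree.ElementTree module are my choice for the demonstration, not something Oxygen uses internally; the sample strings are made up). It suggests that a parser sees the same text whether you type the character or its numeric reference, and that references only become necessary when the target encoding cannot hold the character.

```python
import xml.etree.ElementTree as ET

# Two hypothetical DocBook fragments: one with a literal em dash (U+2014),
# one with the numeric character reference &#8212; for the same character.
literal_doc = "<para>Wait\u2014what?</para>"
reference_doc = "<para>Wait&#8212;what?</para>"

# After parsing, both fragments carry identical text.
text_literal = ET.fromstring(literal_doc).text
text_reference = ET.fromstring(reference_doc).text
same_text = text_literal == text_reference

# With a Latin-1 encoding, Japanese characters cannot be stored directly,
# so they have to be written as numeric character references.
japanese = "\u65e5\u672c\u8a9e"
latin1_bytes = japanese.encode("iso-8859-1", errors="xmlcharrefreplace")

# A Latin-1 document built from those bytes still parses back to the
# original characters.
latin1_doc = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><para>'
    + latin1_bytes
    + b"</para>"
)
round_trip = ET.fromstring(latin1_doc).text

print(same_text)
print(latin1_bytes)
print(round_trip == japanese)
```

So in a UTF-8 document, replacing pasted curly quotes or dashes with codes should not change what the transformation sees.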

Note, however, that some key characters ('<', '&', and quotes in attribute values) are forbidden in XML content and must be replaced with a numeric character reference or a character entity reference (for example &lt; and &amp;) for the XML document to be well-formed. If you are working in Author mode, Oxygen replaces these automatically. In Text mode, however, you have full control, so you must take care of these yourself.
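As a rough sketch of what that replacement looks like when done by hand or by a script, the following uses Python's standard xml.sax.saxutils helpers (again my choice for illustration; the element name, attribute name, and sample strings are hypothetical). Note that the curly quotes pass through untouched; only the markup characters are escaped.

```python
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Hypothetical raw content containing characters that are special in XML.
raw_text = "if a < b && c then \u201cdone\u201d"
raw_attr = 'He said "hi" & left'

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped_text = escape(raw_text)

# quoteattr() escapes the value and wraps it in suitable quote marks,
# so it can be dropped straight into an attribute.
quoted_attr = quoteattr(raw_attr)

# Build a small document and parse it back to check it is well formed.
doc = f"<para role={quoted_attr}>{escaped_text}</para>"
elem = ET.fromstring(doc)

print(doc)
print(elem.text == raw_text)
print(elem.get("role") == raw_attr)
```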

http://en.wikipedia.org/wiki/List_of_XM ... references

Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com