Opening and Saving Unicode Documents

When loading a document, Oxygen XML Editor reads the document prolog to determine the declared encoding. That encoding is then used to instruct the Java Encoder to load support for the corresponding code chart and to save the document using it. If the encoding cannot be determined, Oxygen XML Editor prompts you with the Available Java Encodings dialog box, which lists all encodings supported by the Java platform.
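The prolog lookup described above can be sketched in Java as follows. This is only an illustration, not the editor's actual implementation: the class name PrologEncodingSniffer and its methods are hypothetical, and the sketch assumes an ASCII-compatible encoding with no byte order mark before the XML declaration (a UTF-16 document would need BOM handling first). Only standard java.nio.charset and java.util.regex calls are used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Oxygen XML Editor.
final class PrologEncodingSniffer {
    // Matches the encoding pseudo-attribute of an XML declaration
    // that starts at the very beginning of the input.
    private static final Pattern ENCODING = Pattern.compile(
            "^<\\?xml\\s[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    // Returns the encoding name declared in the prolog, if any.
    static Optional<String> declaredEncoding(byte[] head) {
        // ISO-8859-1 maps every byte to one char, so the ASCII prolog
        // can be inspected before the real encoding is known.
        String text = new String(head, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(text);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // Resolves the declared name to a Charset, or returns empty when the
    // prolog has no declaration or the Java platform lacks the encoding.
    static Optional<Charset> resolve(byte[] head) {
        return declaredEncoding(head)
                .filter(Charset::isSupported)
                .map(Charset::forName);
    }

    // Names of all encodings the running Java platform supports; a list
    // like this could back a chooser dialog when resolve() is empty.
    static Set<String> availableEncodingNames() {
        return Charset.availableCharsets().keySet();
    }
}
```

When resolve returns an empty result, a caller would fall back to asking the user, which corresponds to the dialog box mentioned above.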

If the opened document contains an unsupported character, Oxygen XML Editor applies the policy specified for handling such errors. If the policy is set to REPORT, Oxygen XML Editor displays an error dialog box with a message about the character not allowed by the encoding. If the policy is set to IGNORE, the character is removed from the document displayed in the editor panel. If the policy is set to REPLACE, the character is replaced with a standard replacement character for that encoding.
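The three policy names match the constants of the standard Java class java.nio.charset.CodingErrorAction. Assuming the editor's policies behave like those constants (the source does not say how they are implemented), the following sketch shows the effect of each one when decoding bytes that are not valid in the chosen encoding. The class name DecodePolicyDemo is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical demo class: not part of Oxygen XML Editor.
final class DecodePolicyDemo {
    // Decodes bytes with the given charset, applying one policy to both
    // malformed byte sequences and unmappable characters.
    static String decode(byte[] bytes, Charset charset, CodingErrorAction policy)
            throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(policy)
                .onUnmappableCharacter(policy)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }
}
```

For the bytes 'a', 0xFF, 'b' decoded as UTF-8 (0xFF is never valid in UTF-8), CodingErrorAction.REPORT makes decode throw a CharacterCodingException, IGNORE drops the offending byte, and REPLACE substitutes the decoder's replacement character (U+FFFD for UTF-8).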

Although UTF-8 is used in most cases, you can switch to another encoding simply by changing the encoding name; the application then saves the file using the new encoding.

When saving a document edited in the Text, Grid, or Design modes, if it contains characters not included in the encoding declared in the document prolog, Oxygen XML Editor detects the problem and reports it. You are responsible for resolving the conflict before saving the document.

When saving a document edited in the Author mode, all characters that fall outside the detected encoding are automatically converted to hexadecimal character entities (for example, &#x65E5;).
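The conversion to hexadecimal character entities can be illustrated with a short Java sketch. It is an assumption about how such a conversion might work, not the editor's own code: the class HexEntityEscaper is hypothetical, and it relies on the standard CharsetEncoder.canEncode method to decide which characters the target encoding can represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Locale;

// Hypothetical helper: not part of Oxygen XML Editor.
final class HexEntityEscaper {
    // Replaces every code point the charset cannot encode with a
    // hexadecimal character reference of the form &#xHHHH;.
    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // Work on whole code points so surrogate pairs stay together.
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            CharSequence unit = text.subSequence(i, i + length);
            if (encoder.canEncode(unit)) {
                out.append(unit);
            } else {
                out.append("&#x")
                   .append(Integer.toHexString(codePoint).toUpperCase(Locale.ROOT))
                   .append(';');
            }
            i += length;
        }
        return out.toString();
    }
}
```

For instance, with ISO-8859-1 as the target, the accented letter in "café" is kept because the encoding contains it, while a Japanese character such as U+65E5 is written as &#x65E5;.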

To edit documents written in Japanese or Chinese, change the font to one that supports the specific characters (a Unicode font). For the Windows platform, Arial Unicode MS or MS Gothic is recommended. Do not expect WordPad or Notepad to handle these encodings. Use Internet Explorer or Word to examine XML documents.

When a document with a UTF-16 encoding is edited and saved in Oxygen XML Editor, the saved document has a byte order mark (BOM) that specifies the byte order of the document content. The default byte order is platform-dependent. This means that a UTF-16 document created on a Windows platform (where the default byte order is UnicodeLittle) has a different BOM than a UTF-16 document created on a Mac OS platform (where the default byte order is UnicodeBig). The byte order and the BOM of an existing document are preserved when the document is edited and saved. This behavior can be changed in Oxygen XML Editor from the Encoding preferences panel.
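To make the BOM behavior concrete, the sketch below detects the byte order of UTF-16 content from its first two bytes and writes text with an explicit BOM. It is a minimal illustration under stated assumptions: the class Utf16Bom is hypothetical, the BOM bytes (FE FF for big-endian, FF FE for little-endian) come from the Unicode standard, and the code does not distinguish a UTF-16 little-endian BOM from the longer UTF-32 one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: not part of Oxygen XML Editor.
final class Utf16Bom {
    // Returns the byte order announced by a leading BOM, if present.
    static Optional<Charset> detect(byte[] bytes) {
        if (bytes.length >= 2) {
            int b0 = bytes[0] & 0xFF;
            int b1 = bytes[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) return Optional.of(StandardCharsets.UTF_16BE);
            if (b0 == 0xFF && b1 == 0xFE) return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    // Encodes text in the given UTF-16 byte order and prepends the BOM.
    // Java's UTF_16LE and UTF_16BE charsets emit no BOM on their own.
    static byte[] encodeWithBom(String text, Charset order) {
        byte[] bom;
        if (StandardCharsets.UTF_16LE.equals(order)) {
            bom = new byte[] { (byte) 0xFF, (byte) 0xFE };
        } else if (StandardCharsets.UTF_16BE.equals(order)) {
            bom = new byte[] { (byte) 0xFE, (byte) 0xFF };
        } else {
            throw new IllegalArgumentException("Expected UTF-16LE or UTF-16BE: " + order);
        }
        byte[] body = text.getBytes(order);
        byte[] result = new byte[bom.length + body.length];
        System.arraycopy(bom, 0, result, 0, bom.length);
        System.arraycopy(body, 0, result, bom.length, body.length);
        return result;
    }
}
```

Preserving the byte order of an existing document then amounts to calling detect on the loaded bytes and passing the result back to encodeWithBom when saving.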