BOM Handling Question

Posted: Tue Jun 23, 2020 6:28 pm
by Jamil
I am noticing that when I execute an XSLT transformation with the encoding set to UTF-8, the BOM is not written to the output file. I just checked Encoding under Preferences, and UTF-8 BOM handling is set to "keep". For a new file, I want the BOM to always be added to the output file. My transformer is set to Saxon-EE 9.9.1.7.

How can I force the BOM to be written for new output files?

This is <oXygen/> XML Editor 22.1, build 2020061102

Thanks.

Re: BOM Handling Question

Posted: Mon Jun 29, 2020 9:09 am
by Radu
Hi Jamil,

That UTF-8 BOM handling setting in Oxygen is used only when opening and saving XML documents in the application. It does not control how the Saxon XSLT processor saves the result of an XSLT transformation. I added an internal issue to see if we can use our setting to control the BOM used when serializing the result of the XSLT processing.
In general, adding BOM bytes to UTF-8 files is useless and is also not recommended:

https://stackoverflow.com/questions/222 ... ithout-bom

Usually the BOM makes sense when saving to UTF-16, but it can also be missing from UTF-16 files, in which case a default byte order is assumed.

Regards,
Radu

Re: BOM Handling Question

Posted: Mon Jun 29, 2020 9:54 pm
by Jamil
Hi Radu.

The issue I face is that the file is not interpreted correctly as UTF-8, which results in character loss. My XSLT output is text containing UTF-8. There is no indicator for this other than the BOM. Since it is missing, the data gets interpreted incorrectly, resulting in data loss.
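
As a rough illustration of that failure mode, here is a small Python sketch. It assumes the consuming application falls back to the Windows-1252 code page when it sees no BOM, which may not match every consumer; the sample string and the decode_with_sniffing helper are made up for the example.

```python
import codecs

# Sample text with non-ASCII characters (hypothetical data).
text = "café €"

without_bom = text.encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom  # the BOM is the three bytes EF BB BF

# A consumer that assumes a Windows code page misreads the multi-byte sequences.
misread = without_bom.decode("cp1252")


def decode_with_sniffing(data: bytes, fallback: str = "cp1252") -> str:
    """Decode as UTF-8 if the data starts with the UTF-8 BOM, else use the fallback."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(fallback)
```

With the BOM present, decode_with_sniffing(with_bom) gives back the original text; without it, the same bytes come out garbled under the assumed fallback.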

For UTF-8, it may be considered useless if, and only if, all characters are eight-bit. In the event of 16-bit characters under Windows, it should be required.

Re: BOM Handling Question

Posted: Tue Jun 30, 2020 6:10 am
by Radu
Hi,

If you are outputting XML, the XML default encoding according to the specification is UTF-8.
If you are outputting some other text content, I found an attribute on the xsl:output element which you could try to set like:

<xsl:output method="text" byte-order-mark="yes"/>
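
To confirm that the attribute took effect, one option is to check whether the output file starts with the three UTF-8 BOM bytes (EF BB BF). A minimal Python sketch, run outside of Oxygen and Saxon; the helper name has_utf8_bom is made up for illustration:

```python
import codecs


def has_utf8_bom(path: str) -> bool:
    """Return True if the file at `path` starts with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        # Read only the first three bytes and compare them with the BOM signature.
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8
```

Calling has_utf8_bom on the transformation result (for example a hypothetical output.txt) should return True once byte-order-mark="yes" is applied by the processor.
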
Regards,
Radu

Re: BOM Handling Question

Posted: Wed Jul 01, 2020 1:18 am
by Jamil
Thanks. I was not even aware this attribute existed. This solved the issue.