Display Bengali characters instead of its html equivalent

Fri Jun 02, 2017 9:32 pm


I'm creating an ebook from a document written in Bengali. As a test, I first created a document in MS Word using the on-screen keyboard and saved it as an .htm file (Save As Webpage - Filtered). When I open this file in Oxygen, I see the HTML equivalents (&#xxxx;) of the Bengali characters instead of the characters themselves.

I am able to type in Bengali in the editor window, so the issue is probably not font related. In fact, I see the same thing in Notepad++, so I'm fairly certain it is not a problem with how Oxygen is set up. Perhaps it has something to do with the file's encoding?

Re: Display Bengali characters instead of its html equivalent

Mon Jun 05, 2017 9:15 am

Dear Nabodita,

The &#xxxx; notations that appear in the HTML document are numeric character references (often loosely called character entities). For a web browser they are equivalent to the corresponding characters.
If the HTML document is well-formed, you can switch to the "Author" visual editing mode (there is a Text/Grid/Author switch at the bottom of the opened document) to see the HTML in a what-you-see-is-what-you-get format. The Author mode should present the character entities as characters.
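
To illustrate the equivalence between a &#xxxx; notation and its character, here is a minimal Python sketch. It is only an illustration and not something Oxygen or Word runs; the helper name to_numeric_refs is hypothetical, and the assumption that the exporting tool escapes every non-ASCII character is a simplification.

```python
import html

def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with its decimal &#NNNN; notation.

    Hypothetical helper: it imitates what an exporter may do when the
    target encoding cannot represent a character directly.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)

# U+0995 is the Bengali letter KA; its decimal code point is 2453.
escaped = to_numeric_refs("\u0995")
print(escaped)

# html.unescape (Python standard library) maps the notation back to the
# character, which is what a browser does when it renders the page.
print(html.unescape(escaped) == "\u0995")
```

Both forms describe the same character; only the way it is written in the file differs.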

You can also convert all of those character entities to the corresponding characters: select the entire content in the Text editing mode, right-click, choose Source->Unescape Selection, and choose to unescape characters only, unchecking everything else.
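
If you prefer to do the same conversion outside the editor, the following Python sketch shows the idea. It is not how Oxygen implements Unescape Selection; the function name unescape_numeric_refs is hypothetical. It converts only the numeric &#xxxx; notations and deliberately leaves markup-significant characters (<, >, &, quotes) escaped, which roughly corresponds to unchecking everything except characters.

```python
import re

# Matches decimal (&#2453;) and hexadecimal (&#x995;) numeric notations.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

# Characters that should stay escaped so the markup remains well-formed.
_KEEP_ESCAPED = {"<", ">", "&", '"', "'"}

def unescape_numeric_refs(text: str) -> str:
    """Replace numeric notations with the characters they stand for."""
    def replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            char = chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid Unicode code point: leave the notation untouched.
            return match.group(0)
        return match.group(0) if char in _KEEP_ESCAPED else char
    return _NUMERIC_REF.sub(replace, text)

sample = "<p>&#2476;&#2494;&#2434;&#2482;&#2494; &#60;test&#62;</p>"
print(unescape_numeric_refs(sample))
```

When applying this to a whole file, read it with the encoding it was saved in and write the result as UTF-8; the charset declared in the HTML head would then likely need to say UTF-8 as well, otherwise the literal Bengali characters may not display correctly.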

