Oxygen XML Forum

Posted: **Thu Mar 24, 2011 2:18 am**

I just used this forum to figure out how to preserve the hexadecimal Unicode character references for special characters in our XML, but I find that the leading zeros are removed. Is this something that can be fixed? I'd rather make as few changes as possible to the markup, so that it's easier for me to compare my before and after files and make sure I didn't change anything I wasn't expecting to change.

Example:

Before transform:

`&#x0201c;`

After transform:

`&#x201c;`

The zero before the 201c is gone.

Thanks for the advice. I'm trying to figure out encoding, but currently it's a weak spot for me!

Posted: **Thu Mar 24, 2011 6:40 am**

Entities are not part of the data model, so basically what happens is that they are converted to characters by the parser, processed by XSLT, and then the result is serialized. If a character cannot be represented in the output encoding, the serializer will output it as a character reference.
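As a rough illustration of why the leading zero cannot survive, here is a minimal Python sketch. It uses the standard library's `xml.etree.ElementTree` as a stand-in for the parser and serializer; this is an assumption made for the example, not what Oxygen or its XSLT processors use internally, and the `<q>` element is made up.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same character reference (U+201C, left double quote).
with_zero = ET.fromstring("<q>&#x0201c;</q>").text
without_zero = ET.fromstring("<q>&#x201c;</q>").text

# After parsing, both are the same single character; the original
# spelling of the reference is not stored anywhere in the tree.
print(with_zero == without_zero)
print(len(with_zero))

# Serializing to an encoding that cannot represent the character forces
# the serializer to write a character reference in a format of its own choosing.
serialized = ET.tostring(ET.fromstring("<q>&#x0201c;</q>"), encoding="us-ascii")
print(serialized)
```

The serializer here happens to pick a decimal reference, which shows that the output form depends only on the serializer and not on how the input was written.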

So, basically there is no link between how character references are written in your input and how they appear in the output; you can expect anything in the output as long as it is well-formed XML. To preserve the original format you need either a post-processing step or, if it is easier, both a pre-processing and a post-processing step. For instance, in the pre-processing step you can replace `&` with `&amp;`, so `&#x0201c;` becomes `&amp;#x0201c;`, and in the post-processing step apply the reverse: replace `&amp;` with `&`, getting from `&amp;#x0201c;` back to `&#x0201c;`.
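The pre-processing and post-processing idea can be sketched in Python as follows. This is a narrower variant that escapes only numeric character references rather than every `&`, so that ampersands produced by the transformation itself are left alone. The function names `protect`, `restore` and `stand_in_transform` are hypothetical, and the parse-and-serialize round trip merely stands in for the real XSLT step.

```python
import re
import xml.etree.ElementTree as ET

# Numeric character references such as &#x0201c; or &#8220;
CHAR_REF = re.compile(r"&(#(?:x[0-9A-Fa-f]+|[0-9]+);)")
# The escaped form that protect() produces
ESCAPED_CHAR_REF = re.compile(r"&amp;(#(?:x[0-9A-Fa-f]+|[0-9]+);)")


def protect(xml_text: str) -> str:
    """Pre-processing: turn &#x0201c; into &amp;#x0201c; so the parser
    sees literal text instead of a character reference."""
    return CHAR_REF.sub(r"&amp;\1", xml_text)


def restore(xml_text: str) -> str:
    """Post-processing: turn &amp;#x0201c; back into &#x0201c;."""
    return ESCAPED_CHAR_REF.sub(r"&\1", xml_text)


def stand_in_transform(xml_text: str) -> str:
    """Placeholder for the XSLT step (assumption): parse, then serialize."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding="unicode")


source = "<q>&#x0201c;quoted&#x0201d;</q>"
result = restore(stand_in_transform(protect(source)))
print(result == source)
```

Two caveats apply to this sketch: while protected, the stylesheet sees the literal text of the reference rather than the character, so any template that tests or manipulates those characters will behave differently; and text that already contained an escaped reference such as `&amp;#x201c;` in the source would be unescaped by `restore`.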

Best Regards,
George

Posted: **Thu Mar 24, 2011 4:55 pm**

Thanks...now to figure out how to do that from within Oxygen. I'm hoping to avoid Perl scripts!

unicode characters in transformation
