unicode characters in transformation
Posted: Thu Mar 24, 2011 2:18 am
by rorris
I just used this forum to figure out how to preserve unicode encoding in hex of special characters in our XML, but I find that the leading zeros are removed. Is this something that can be fixed? I'd rather make as few changes as possible to the coding, so that it's easier for me to check my before and after files to make sure I didn't change anything I wasn't expecting to change.
Example:
Before transform:
&#x0201c;
After transform:
&#x201c;
The zero before the 201c is gone.
Thanks for the advice. I'm trying to figure out encoding, but currently it's a weak spot for me!
Re: unicode characters in transformation
Posted: Thu Mar 24, 2011 6:40 am
by george
Entities are not part of the data model so basically what happens is that they are converted to characters, processed by XSLT and then the result is serialized. If a character cannot be represented in the output encoding then the serializer will output that as a character entity.
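To illustrate the point, here is a minimal sketch using Python's standard xml.etree.ElementTree rather than an XSLT processor (that substitution is an assumption made for illustration; reserialize is a hypothetical helper name). The parser resolves the reference to a plain character, and the serializer alone decides how to write it back out:

```python
import xml.etree.ElementTree as ET


def reserialize(xml_text: str, encoding: str = "us-ascii") -> bytes:
    """Parse XML text and serialize it again in the given encoding."""
    root = ET.fromstring(xml_text)
    return ET.tostring(root, encoding=encoding)


source = "<p>&#x0201c;quoted&#x0201d;</p>"

# After parsing, only the characters remain; the "&#x0201c;" spelling is gone.
root = ET.fromstring(source)
print(repr(root.text))

# ASCII cannot represent the curly quotes, so the serializer emits
# character references in whatever form it prefers.
print(reserialize(source, "us-ascii"))

# UTF-8 can represent them, so they are written as literal characters.
print(reserialize(source, "utf-8"))
```

With this particular library the ASCII output uses decimal references, so the original hex form with its leading zero does not survive the round trip here either.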
So, basically there is no link between the input representation of character entities and the output: you can expect anything in the output as long as it is well-formed XML. To preserve the entity format you need either a post-processing step or, if it is easier, both a pre-processing and a post-processing step. For instance, in the pre-processing step you can replace & with &amp; so &#x0201c; becomes &amp;#x0201c;, and in the post-processing step apply the reverse, replacing &amp; with & to get from &amp;#x0201c; back to &#x0201c;.
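The pre-processing and post-processing steps could be sketched roughly like this in Python. The function names protect and restore are hypothetical. As a deliberate narrowing of the plain "replace every &" approach, this version only touches numeric character references, so existing entities such as &amp; are left alone:

```python
import re

# Numeric character references, hex or decimal, e.g. &#x0201c; or &#8220;
CHAR_REF = re.compile(r"&(#x[0-9A-Fa-f]+|#[0-9]+);")
# The escaped form that protect() produces, e.g. &amp;#x0201c;
PROTECTED_REF = re.compile(r"&amp;(#x[0-9A-Fa-f]+|#[0-9]+);")


def protect(xml_text: str) -> str:
    """Pre-processing: turn &#x0201c; into &amp;#x0201c; so the XML parser
    sees ordinary text instead of a character reference."""
    return CHAR_REF.sub(r"&amp;\1;", xml_text)


def restore(xml_text: str) -> str:
    """Post-processing: turn &amp;#x0201c; back into &#x0201c;."""
    return PROTECTED_REF.sub(r"&\1;", xml_text)


src = "<p>&#x0201c;Hi&#x0201d; &amp; bye</p>"
protected = protect(src)   # run the transformation on this text
print(protected)
print(restore(protected))  # apply to the transformation result
```

One caveat with this sketch: if the source already contains a literal &amp;#x...; sequence, restore will turn that into a character reference as well, so it is worth checking the input for that pattern first.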
Best Regards,
George
Re: unicode characters in transformation
Posted: Thu Mar 24, 2011 4:55 pm
by rorris
Thanks...now to figure out how to do that from within Oxygen. I'm hoping to avoid Perl scripts!