Re: [xsl] How can I preserve ASCII Encoding Character Sets?

Subject: Re: [xsl] How can I preserve ASCII Encoding Character Sets?
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Tue, 06 Nov 2012 16:27:58 -0500

At 2012-11-06 14:05 -0500, Philip Vallone wrote:
This question is related to a question I asked a few months ago about finding a way to distribute a set of stylesheets that makes it difficult for the stylesheets to be modified or stolen (e.g. protect the intellectual property). See http://markmail.org/thread/qv3z7yzdlr5ht7rw

What I ended up doing is creating a stylesheet that combines my stylesheets into one xsl file. I then encrypt the file and store it as a binary. I have created a "wrapper" application that basically decrypts the file back to xsl and passes the xsl string to Saxon to do its work. Everything works great except when I compile my stylesheets into a single xsl file, some ASCII Encoding Character Sets are not preserved:

<xsl:when test="$prefix = 'pf01'">&#x00A0;</xsl:when>

How can I preserve these ASCII Encoding Character Sets?

I've not heard the term "ASCII Encoding Character Set" before, so I'm unsure exactly what it is you are asking.

XSLT works in Unicode.

Your template for your <xsl:when> consists of a single Unicode character, the non-breaking space. This one character is expressed using the markup of a numeric character reference, which is composed of ASCII characters. There is no "encoding" at the XSLT level ... the XML processor in the XSLT processor has determined the Unicode character from the stream of markup characters, and it is that stream which is expressed in a particular encoding. You don't say what encoding you are using in your stylesheet, but the expression of the numeric character reference is unlikely to be affected by whichever encoding you chose.
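To illustrate the point about the XML processor resolving the reference, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree parser. It is only a stand-in for the XML processor inside an XSLT processor, and the variable names are made up for the example. Once the markup has been parsed, the content of the element is one Unicode character, not a run of ASCII markup characters.

```python
import xml.etree.ElementTree as ET

# The <xsl:when> from the question, with the namespace declared so that the
# fragment parses on its own (the declaration is added for this example).
XSL_NS = "http://www.w3.org/1999/XSL/Transform"
source = (
    '<xsl:when xmlns:xsl="' + XSL_NS + '" '
    "test=\"$prefix = 'pf01'\">&#x00A0;</xsl:when>"
)

# Parsing resolves the numeric character reference into the character itself.
element = ET.fromstring(source)
text = element.text

print(len(text))       # number of characters in the parsed content
print(hex(ord(text)))  # code point of that single character
```

After parsing, the reference no longer exists as markup; only the NBSP character (U+00A0) remains in the tree.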

I expect it is your encryption/decryption process, likely using XML libraries instead of using a native XML processing language, that is messing you up.

If you are using a non-XML-based processing language and simply treating the stylesheet as a string and not as XML, then "&#x00A0;" should simply be eight characters of text in the string. Your encrypt/decrypt should then produce the same eight characters of text in a string. There shouldn't be a problem.
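A minimal sketch of that string-only round trip, in Python. The XOR routine and the key are hypothetical placeholders for whatever encryption the wrapper application really uses; they are not a recommendation, and nothing here is taken from the original poster's code. The point is only that, when the stylesheet is never parsed as XML, the reference comes back as the same literal text.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy reversible transform standing in for real encryption (assumption)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# The stylesheet fragment is handled purely as text; no XML parser is involved.
stylesheet = "<xsl:when test=\"$prefix = 'pf01'\">&#x00A0;</xsl:when>"
key = b"not-a-real-key"  # hypothetical key for the example

# Encode to bytes with one known encoding, transform, and reverse both steps
# with the same key and the same encoding.
encrypted = xor_bytes(stylesheet.encode("utf-8"), key)
decrypted = xor_bytes(encrypted, key).decode("utf-8")

print(decrypted == stylesheet)   # the text is unchanged by the round trip
print("&#x00A0;" in decrypted)   # the reference is still literal markup
```

Because the reference is made only of ASCII characters, it survives unchanged as long as the same encoding is used on both sides of the round trip.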

So if you are supplying the XML processor inside of Saxon with the string to be parsed as XML, and you have an XML declaration at the start of your stylesheet to tell the XML processor about the encoding of the characters in your string, I cannot see where the problem might be.

But if I were to guess, having no evidence from your post to go on, I would guess that you are treating the stylesheet as an XML stream and not as a simple string. That means the XML processor in your environment is converting the numeric character reference into a Unicode character, and that character is then expressed somehow in the string that you encrypt and decrypt. What comes out the other end is not a numeric character reference but some encoding of the Unicode character dictated by your string library, and that encoding is inconsistent with the XML declaration at the top of the file. When Saxon sees your decrypted string, the string library's choice of encoding for the NBSP character (the numeric character reference is long gone) does not match the encoding implied or stated at the start of the string.
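The guessed failure can be sketched in Python as follows. This is an assumption about what might be happening, not a diagnosis: xml.etree.ElementTree stands in for whatever XML library combines the stylesheets, and Latin-1 stands in for whatever encoding ends up mismatched with UTF-8. The element is simplified to <when> to keep the fragment short.

```python
import xml.etree.ElementTree as ET

source = "<when>&#x00A0;</when>"

# Parsing and re-serializing as XML replaces the reference with the raw
# NBSP character; the reference itself is not written back out.
element = ET.fromstring(source)
reserialized = ET.tostring(element, encoding="unicode")
print("&#x00A0;" in reserialized)  # False: the reference is gone

# Case 1: bytes written as UTF-8 but read back as Latin-1 gain a stray
# character in front of the NBSP.
utf8_bytes = reserialized.encode("utf-8")
misread = utf8_bytes.decode("latin-1")
print(misread == reserialized)     # False: the text has changed

# Case 2: bytes written as Latin-1 but read back as UTF-8 are not valid
# UTF-8 at all, so a strict decoder rejects them.
latin1_bytes = reserialized.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)               # True
```

In either case the bytes that reach the parser no longer agree with the encoding stated or implied at the start of the string, whereas the original all-ASCII reference would have been immune to the mismatch.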

But I don't have much to go on in your description.

I hope this helps you find your problem.

. . . . . . . . Ken

Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd.            http://www.CraneSoftwrights.com/s/
G. Ken Holman                   mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers:    http://www.CraneSoftwrights.com/legal