
[xsl] encoding problem while reading the content in


Subject: [xsl] encoding problem while reading the content in
From: "Pramodh Peddi" <peddip@xxxxxxxxxxxxxxxx>
Date: Tue, 11 Nov 2003 16:49:41 -0500

Hi,
I am not sure whether this problem concerns transformation or general Java
basics.
I am reading an XML file from an sftp location and passing it through the
Transformer (using the Java 1.4.1 API). The XML file declares "windows-1252"
as its encoding. It contains special characters such as ® typed directly into
the file. It also contains character references such as &#8482;, which is TM.

When I read the bytes in from the sftp location, the ® characters are turned
into "?" symbols (the &#8482; references are retained as they are, which is
good and expected). I know this because I am printing the String built from
the bytes.
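As a minimal sketch of how the decoding step could be checked independently of the logger, the following prints the code points of the decoded String instead of the characters themselves. The class and method names (DecodeCheck, decode, codePoints) are hypothetical and not part of the application; the two-byte input stands in for the real file content.

```java
import java.nio.charset.Charset;

public class DecodeCheck {
    // Decode raw bytes with the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Render each code point as U+XXXX so the result does not depend on
    // how the console or log file encodes non-ASCII characters.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: 'A' followed by the byte 0xAE, which is the
        // registered-trademark sign in both windows-1252 and iso-8859-1.
        byte[] raw = { 0x41, (byte) 0xAE };
        System.out.println(codePoints(decode(raw, "windows-1252")));
        System.out.println(codePoints(decode(raw, "iso-8859-1")));
    }
}
```

If the decoded String shows U+00AE where the log shows "?", the replacement is happening when the text is written out, not when the bytes are read. If it shows U+003F, the bytes were already damaged before decoding.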

This is what I am doing in the application:
**************************************************************
ByteArrayOutputStream rawfileOutputStream = new ByteArrayOutputStream();

if (filePath != null) {
    // read the file from the sftp server into memory
    sftp.get(filePath, rawfileOutputStream);
    rawfileOutputStream.close();
}

byte[] rawData = rawfileOutputStream.toByteArray();

// also tried: 1. new String(rawData, "windows-1252") and 2. specifying no encoding
String tempStr = new String(rawData, "iso-8859-1");

log.info("\n\n\n\n\ntempStr is: " + tempStr + "\n\n\n\n\n");

// perform the transformation; the OutputStreamWriter has no explicit
// encoding here, so it uses the platform default
transformer.transform(
    new StreamSource(new StringReader(tempStr)),
    new StreamResult(new OutputStreamWriter(out)));

***********************************************************************
I actually tried more than what is in the code snippet above. I tried
building the String using different encodings such as "windows-1252", and I
tried specifying no encoding. I also tried not converting the bytes to a
String at all, instead preparing an InputStream and passing it to the
transformer (and changing the encoding given to the InputStreamReader and
OutputStreamWriter used for the transformation), like this:
*****************************************************************
ByteArrayInputStream rawfileInputStream = new
ByteArrayInputStream(rawfileOutputStream.toByteArray());

transformer.transform(

new StreamSource(new InputStreamReader(rawfileInputStream, "windows-1252")),

 new StreamResult(new OutputStreamWriter(out, "windows-1252")));

*****************************************************************
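For comparison, here is a minimal sketch of a variant that hands the transformer byte streams rather than a Reader and a Writer. With a byte-stream StreamSource the XML parser reads the encoding from the XML declaration itself, and with a byte-stream StreamResult the serializer encodes the output according to the output encoding property. The class name, the helper method and the sample document are hypothetical, and an identity transformer stands in for the real stylesheet.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class ByteStreamTransform {
    // Transform raw XML bytes to raw output bytes without ever going
    // through a String, Reader or Writer in application code.
    static byte[] transform(byte[] rawXml, String outputEncoding) {
        try {
            // Identity transformer; a real stylesheet would be loaded here.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.ENCODING, outputEncoding);

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            transformer.transform(
                new StreamSource(new ByteArrayInputStream(rawXml)),
                new StreamResult(out));
            return out.toByteArray();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample document, encoded the way the file declares.
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1252\"?>"
                   + "<name>Acme\u00AE&#8482;</name>";
        byte[] raw = xml.getBytes(Charset.forName("windows-1252"));

        byte[] result = transform(raw, "UTF-8");
        // Decode with the same encoding that was requested for the output.
        System.out.println(new String(result, Charset.forName("UTF-8")));
    }
}
```

The point of this design is that no step depends on the platform default encoding: the input encoding comes from the document and the output encoding is set explicitly. How the printed result looks still depends on the console's own encoding.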

Even wrapping the stream in a Reader with an explicit encoding didn't work. I
also tried specifying the encoding in the toByteArray() method. Nothing works.
I am not sure whether I am missing something basic or whether this is beyond
our control. The big problem is that it works fine on one Solaris machine (it
outputs the correct reg mark characters) but fails on another Solaris machine
(it outputs ? for the reg marks). It also works fine on a Windows machine
(maybe because the encoding is windows-1252).
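Since the behaviour differs between machines, one thing worth comparing is each JVM's default encoding, because an OutputStreamWriter created without an explicit encoding uses that default. The sketch below is only an illustration under that assumption; the class and method names are hypothetical. It prints the defaults and shows how the same character is encoded under two explicitly named charsets.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

public class DefaultEncodingCheck {
    // Write text through an OutputStreamWriter with an explicit charset
    // and return the bytes that were actually produced.
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Values to compare between the working and the failing machine.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // The reg mark is one byte (0xAE) in windows-1252, but it cannot be
        // represented in US-ASCII, so the writer substitutes '?' (0x3F).
        byte[] cp1252 = encode("\u00AE", "windows-1252");
        byte[] ascii = encode("\u00AE", "US-ASCII");
        System.out.println(String.format("windows-1252: 0x%02X", cp1252[0] & 0xFF));
        System.out.println(String.format("US-ASCII:     0x%02X", ascii[0] & 0xFF));
    }
}
```

If the failing machine reports an ASCII-only default (for example under the C or POSIX locale), any Writer, log appender or console that relies on the default could produce the "?" symbols even when the String itself is correct.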

I don't know what else to do. I desperately need to get this done ASAP. Many
components are depending on this fix.

Does anyone have any suggestions? I would greatly appreciate any help!

Thanks in advance,

pramodh.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


