Re: [xsl] 16-bit chars rendered as "?" in UTF-8?


Subject: Re: [xsl] 16-bit chars rendered as "?" in UTF-8?
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Wed, 15 Aug 2012 16:32:53 +0200

You can catch non-ASCII XML characters in an od hex dump using this regexp
(each test file below is named after the single byte it holds):

$ od -tx1 00 | grep -E " (0[^9ad]|[189a-f]|7[^f])"
0000000 00
$ od -tx1 0D | grep -E " (0[^9ad]|[189a-f]|7[^f])"
$ od -tx1 7E | grep -E " (0[^9ad]|[189a-f]|7[^f])"
0000000 7e
$ od -tx1 7F | grep -E " (0[^9ad]|[189a-f]|7[^f])"
$ od -tx1 80 | grep -E " (0[^9ad]|[189a-f]|7[^f])"
0000000 80
$
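
As the transcript shows, the pattern as posted reports 7e and lets 7f
through. If you only need to know whether a file contains any such bytes,
a rough alternative is to delete every acceptable byte and count what is
left. This is a minimal sketch, not a tested tool: the function name
count_non_xml_ascii is made up for illustration, and it assumes the
acceptable bytes are tab, LF, CR and 0x20-0x7f, i.e. the ASCII part of the
XML 1.0 Char production.

```shell
# count_non_xml_ascii (hypothetical helper): reads stdin and prints the
# number of bytes that are NOT tab (011), LF (012), CR (015) or in the
# range 0x20-0x7f (octal 040-177).
count_non_xml_ascii() {
    # tr -d removes every acceptable byte; wc -c counts the remainder.
    # LC_ALL=C makes tr work on raw bytes rather than locale characters.
    # The final tr strips the padding some wc implementations add.
    LC_ALL=C tr -d '\11\12\15\40-\177' | wc -c | tr -d ' '
}

# Example: "café" with é as the single ISO-8859-1 byte 0xe9 (octal 351).
printf 'caf\351\n' | count_non_xml_ascii
```

A result of 0 means the input is clean by that definition. Anything else
means it is worth going back to od to find out where the bytes are.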


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/



  From:       John English <john.foreign@xxxxxxxxx>
  To:         David Carlisle <davidc@xxxxxxxxx>
  Cc:         xsl-list@xxxxxxxxxxxxxxxxxxxxxx
  Date:       08/15/2012 04:07 PM
  Subject:    Re: [xsl] 16-bit chars rendered as "?" in UTF-8?

On 15/08/2012 16:58, David Carlisle wrote:
> On 15/08/2012 14:38, John English wrote:
>> I then search the file with "od -b | grep ' [4-7][0-7][0-7]'",
>
> a file containing é in iso-8859-1 would have octal byte 351 which would
> not match that grep. You need to check no characters bigger than 127
> (octal 177 if you prefer)

Ooops, quite right! However, still nothing shows up...

--
John English

