[oXygen-user] How to I get Oxygen to identify invalid characters? [SEC=UNCLASSIFIED]
Eliot Kimber
Thu Mar 31 08:45:56 CDT 2011
If you're trying to track down encoding issues (which is what this must be),
the SC Unipad tool is an excellent resource for Windows users:
www.unipad.org.
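If no such tool is to hand, a few lines of script can do a similar job. What follows is only a rough sketch in Python, not something oXygen or UniPad provides. The function name `find_non_ascii` and the sample string are made up for illustration, and reading the bytes as Latin-1 is just one assumed way they might be misread. It lists every non-ASCII character with its position and Unicode name, and shows that the bytes \342\200\224 quoted further down are a single character when read as UTF-8.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, char, code point, name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Control characters have no Unicode name, so supply a fallback.
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((line_no, col_no, ch, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical sample: the octal bytes \342\200\224 from this thread,
# embedded in a fragment of the abstract text.
raw = b"March Quarter\342\200\224first week of April"

# Read as UTF-8 (what the XML declaration promises): one character.
utf8_hits = find_non_ascii(raw.decode("utf-8"))

# Read as Latin-1 (an assumed misreading, for illustration): three characters.
latin1_hits = find_non_ascii(raw.decode("latin-1"))

if __name__ == "__main__":
    for line_no, col_no, _, code_point, name in utf8_hits:
        print("utf-8  :", line_no, col_no, code_point, name)
    for line_no, col_no, _, code_point, name in latin1_hits:
        print("latin-1:", line_no, col_no, code_point, name)
```

To check a real file, read it as bytes and pass `data.decode("utf-8")` to the function. A decode error at that point would itself indicate that the file is not the UTF-8 it claims to be.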
Cheers,
E.
On 3/31/11 7:16 AM, "George Cristian Bina" wrote:
> Dear John,
>
> There is nothing wrong with the em dash (U+2014) character in an XML
> document; it is allowed, so oXygen cannot report an error when there is none.
> If you want to check whether your document contains such characters, you
> can pass it through a Schematron validation using a Schematron schema
> like the one below:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <schema xmlns="http://www.ascc.net/xml/schematron">
>   <pattern name="testInvalidCharacter">
>     <rule context="*">
>       <assert test="not(exists(text()[contains(., '&#x2014;')]))">
>         The em dash (U+2014) character is not allowed!
>       </assert>
>     </rule>
>   </pattern>
> </schema>
>
> Best Regards,
> George
> --
> George Cristian Bina
> <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
> http://www.oxygenxml.com
>
> On 3/31/11 9:01 AM, wrote:
>> Hi George,
>>
>> I notice that the content below doesn't include the M$ characters. I expect
>> it is because I have sent the email in plain text. I have temporarily loaded
>> the document up onto the following URL:
>>
>> http://asdd.ga.gov.au/asdd/work/OESRexmple.xml
>>
>> Uploading the file seems to have changed the characters from â~@~T to
>> \342\200\224 or my editor is showing it differently on a different machine.
>>
>> I hope that this helps.
>>
>> Thanks.
>>
>>
>> John
>>
>>> -----Original Message-----
>>> From:
>>> [mailto:] On Behalf Of Hockaday John
>>> Sent: Thursday, 31 March 2011 4:24 PM
>>> To:
>>> Cc:
>>> Subject: Re: [oXygen-user] How to I get Oxygen to identify
>>> invalid characters? [SEC=UNCLASSIFIED]
>>>
>>> Hi George,
>>>
>>> Here is a snippet of the offending content:
>>>
>>> <gmd:abstract>
>>> <gco:CharacterString>The Housing Rental Vacancy Rates
>>> Brief summarises data from a quarterly
>>> survey of Queensland real estate agencies. The data
>>> presented relates to vacancy rates of
>>> residential rental detached houses and units, and
>>> is broken down by 5 regions: Inner
>>> Brisbane, Remainder of Brisbane LGA, Brisbane
>>> Surrounds, Gold Coast and Rest of
>>> Queensland. Information on each quarter is posted
>>> on the website according to the
>>> following schedule: March Quarter-first week of
>>> April June Quarter-first week of July
>>> September Quarter-first week of October December
>>> Quarter-first week of January . Surveys
>>> have been conducted in 2002-2003, 2003-2004,
>>> 2004-2005, 2005-2006, and 2006-2007.
>>> </gco:CharacterString>
>>> </gmd:abstract>
>>>
>>> Note that the hyphens are of the format: â~@~T when viewed
>>> in a vi editor on a Solaris platform. Our XML declaration is:
>>>
>>> <?xml version="1.0" encoding="UTF-8"?>
>>>
>>> Other validators report these as invalid characters. So I
>>> would like to set Oxygen so that it detects and reports these
>>> characters.
>>>
>>> Thanks.
>>>
>>>
>>> John
>>>
>>>> -----Original Message-----
>>>> From: George Cristian Bina [mailto:]
>>>> Sent: Thursday, 31 March 2011 10:22 AM
>>>> To: Hockaday John
>>>> Cc:
>>>> Subject: Re: [oXygen-user] How to I get Oxygen to identify
>>>> invalid characters? [SEC=UNCLASSIFIED]
>>>>
>>>> Dear John,
>>>>
>>>> Can you please provide a cut-down sample file and a short description
>>>> of the exact steps we should follow to reproduce this issue?
>>>>
>>>> Thank you,
>>>> George
>>>> --
>>>> George Cristian Bina
>>>> <oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
>>>> http://www.oxygenxml.com
>>>>
>>>> On 3/31/11 1:50 AM, wrote:
>>>>> Hi All,
>>>>>
>>>>> I have an ISO 19139 metadata record where someone has cut and
>>>>> pasted M$ Word content into one of the CharacterString fields.
>>>>> According to the Validome XML online validator these
>>>>> characters (â~@~T or M$ long hyphens) are invalid. Also, when
>>>>> we use SAXON to translate this metadata record into XHTML,
>>>>> SAXON says that there is a validation error.
>>>>>
>>>>> When I load this document into Oxygen, version 10.0 build
>>>>> 2008102212, it validates OK. Is there some way for me to
>>>>> catch these invalid characters during validation?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> John Hockaday
>>>>> Spatial Standards Group (OSDM)
>>>>> http://www.osdm.gov.au/
>>>>> john.hockaday\@osdm.gov.au
>>>>> _______________________________________________
>>>>> oXygen-user mailing list
>>>>>
>>>>> http://www.oxygenxml.com/mailman/listinfo/oxygen-user
>>>>
--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 512.554.9368
www.reallysi.com
www.rsuitecms.com