Berkeley DB XML is messing up W3C standards (Doc.asString())
I posted this message some time ago and got some reactions saying this was impossible.
It is NOT. The character "&lt;" is handled differently from "&gt;":
one is translated and the other is not.
As these characters are XML structure, the whole XML is messed up!
This problem occurs when one uses document.asString
instead of document.tocontent.
Clearly different code is used for the same functionality,
and clearly asString does bad and inconsistent translations.
I think people want to know.

I'm running Berkeley DB XML.
I need to use < > and other XML special characters in my XML structure.
W3C standards tell me to use &lt; and &gt; as encoding.
This is fine for me ...
When I load the XML record into the DB
as: <test var="&lt;&gt;" />
the XML record is returned as <test var="&lt;>" />
It seems to interpret the &gt; but not the &lt;.
As a result the XML string is corrupted.
Whow another day in testing
Hi,
As I explained in the previous post, from the XML point of view the > may be represented either way, that is, either as > or as &gt;. The relevant part of the XML specification is here:
http://www.w3.org/TR/2004/REC-xml-20040204/#syntax
The ampersand character (&) and the left angle bracket (<) MUST NOT appear in their literal form, except when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. If they are needed elsewhere, they MUST be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) MAY be represented using the string "&gt;", and MUST, for compatibility, be escaped using either "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section.
So as you see < MUST be escaped while > MAY be escaped.
Best Regards,
George
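The rule quoted above can be checked with any conforming parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module rather than Berkeley DB XML itself, so it only illustrates the specification's rule and says nothing about how asString is implemented. The variable names and the helper `is_well_formed` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The record as it was loaded, and the record as asString reportedly returns it.
escaped_both = '<test var="&lt;&gt;" />'
escaped_lt_only = '<test var="&lt;>" />'

# Both forms parse, and both yield the same attribute value.
value_both = ET.fromstring(escaped_both).get("var")
value_lt_only = ET.fromstring(escaped_lt_only).get("var")
same_value = value_both == value_lt_only == "<>"


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A literal "<" inside an attribute value is the case the spec forbids.
raw_lt_ok = is_well_formed('<test var="<>" />')

print(same_value, raw_lt_ok)
```

With this parser, the two escaped forms are interchangeable as far as the attribute value goes, and only the unescaped "<" is rejected.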
Messing up of W3C characters in Berkeley DB XML
Sorry, I do not agree with your remark.
XML and W3C standards have some purpose.
The XML record as you mentioned is correct: <test var="&lt;&gt;" />
though the record <test var="&lt;>" /> is structurally incorrect.
You as a human can see there is no difference, but the parser does not think so.
For this reason, I guess, the character ">" is banned from the content of the attribute var.
A parser sees it at least as an indication of the end of the XML line.
When <test var="&lt;&gt;" /> is entered in the database using the latest Berkeley DB
and the document is displayed with a DOC.asString() statement, the record is returned as
<test var="&lt;>" />, making the information completely corrupted and so unusable.
There is a workaround using DOC.getContent, which shows the record as <test var="&lt;&gt;" />
as it should be.
To be reliable as a DB structure for professional data, < and > should at least be treated the same way. That is NOT the case.
Best regards, but I had to contradict your reply.
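If output with both characters escaped is needed regardless of which method produced the string, one option is to build the attribute with a serializer that escapes "<" and ">" alike. This is only a sketch using Python's standard library, not the Berkeley DB XML API; the function name `serialize_test` is hypothetical and the element and attribute names are taken from the example record above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def serialize_test(value: str) -> str:
    """Build a <test> element whose var attribute has &, < and > all escaped."""
    # quoteattr() escapes &, < and > and wraps the result in quotes.
    return f"<test var={quoteattr(value)} />"


record = serialize_test("<>")
print(record)

# Round trip: parsing the record gives back the original value.
round_trip = ET.fromstring(record).get("var")
print(round_trip == "<>")
```

Comparing parsed attribute values, as in the round trip at the end, avoids depending on which of the two allowed spellings of ">" a given serializer happens to emit.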
Whow another day in testing
George,
I think I misread your answer; I think we both agree on what should be.
The problem is not that W3C or the standards are wrong; they are NOT.
Berkeley DB XML implements this W3C standard badly and alters the definition
&gt; to > in its internal interpretation.
Theoretically a parser could find its way out of
<TEST var=">" /> but it does NOT; the record is flagged as an error, as it should be.
A correct record
<test var="&gt;" /> is mishandled and turned into bad syntax, <TEST var=">" />, which it detects itself as being bad. This is for all intents and purposes a corruption.
A BUG to be corrected!!!!
Whow another day in testing