Problem with whitespaces when parsing with Xerces
Post by Guest »
Hi,
I'm using Xerces DOM Parser as below:
import org.apache.xerces.parsers.DOMParser;
DOMParser parser = new DOMParser();
parser.setFeature("http://apache.org/xml/features/validation/schema",true);
parser.setFeature("http://xml.org/sax/features/validation", true);
parser.setErrorHandler(new errHandler());
parser.parse(xmlFile);
The xml file being parsed looks like this:
<CCphysical xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="MySchema.xsd">
<root name="Name">
<directory name="doc" />
<directory name="doc2" />
<directory name="XML" />
</root>
</CCphysical>
Extract from the schema:
<xsd:element name="root">
<xsd:complexType>
<xsd:sequence>
<xsd:element maxOccurs="unbounded" minOccurs="0" ref="directory" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="optional" />
</xsd:complexType>
</xsd:element>
When parsing the file I don't get 3 child nodes for root but 6, because the parser also counts the whitespace text nodes. Even after setting the feature
parser.setFeature("http://apache.org/xml/features/dom/incl ... ace",false);
it does not work. What's wrong, and what do I have to do so that whitespace is ignored?
Thanks for your help,
Fabian
Workaround
Yes, indeed: besides the directory nodes, the root node also has the line-break text nodes as children.
The safe thing to do (it works for any XML file, no matter how much whitespace there is between tags) is to check the following condition for each node while iterating through the root's children, and to take only the element nodes into consideration:
n.getNodeType() == Node.ELEMENT_NODE
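As a rough sketch of that check in context, the following collects only the element children of a node. It uses the standard JAXP DocumentBuilderFactory rather than the Xerces DOMParser from the question, purely so that it runs without extra jars; the class name ElementChildren and the helper methods parse and elementChildren are made up for this illustration, and schema validation is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of Xerces or of the original post.
public class ElementChildren {

    // Parses an XML string into a DOM document (no validation, assumption for brevity).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Returns only the element children of parent, skipping whitespace
    // text nodes, comments and any other non-element nodes.
    static List<Element> elementChildren(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<root name=\"Name\">\n"
                + "<directory name=\"doc\" />\n"
                + "<directory name=\"doc2\" />\n"
                + "<directory name=\"XML\" />\n"
                + "</root>";
        Element root = parse(xml).getDocumentElement();
        // The raw child list includes the line-break text nodes.
        System.out.println("all child nodes: " + root.getChildNodes().getLength());
        // The filtered list contains the directory elements only.
        for (Element e : elementChildren(root)) {
            System.out.println(e.getTagName() + " name=" + e.getAttribute("name"));
        }
    }
}
```

The same loop should work unchanged on the Document returned by the Xerces parser, since both expose the org.w3c.dom interfaces.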