Re: RE: RE: [xsl] DOM and XML parser
Subject: Re: RE: RE: [xsl] DOM and XML parser
From: Mike Brown <mike@xxxxxxxx>
Date: Sat, 17 Aug 2002 10:11:30 -0600 (MDT)
ashu t wrote:
> One thing I asked that who creates the tree like structure (even
> on conceptual level)XML Parser or XSLT Processor?

Both.

The parser reads the raw file(s) that comprise the XML document, decoding bytes into characters, condensing character references (e.g., a reference such as "&#33;" becomes the 1 character "!"), normalizing whitespace in attribute values, using the DTD to fill in default attribute values and resolve entities, and checking for well-formedness.

The parser passes along the 'important' information about the XML document to the application (the XSLT processor). The information it passes is pretty much exactly what the processor needs in order to model the XPath/XSLT node tree. For example, the parser says things like "there is an element named 'stylesheet' in namespace 'http://www.w3.org/1999/XSL/Transform', its lexical name is 'xsl:stylesheet', it has an attribute named 'version' with value '1.0', it contains an element named 'template'..." and so on. SAX and DOM parsers do this in very different ways, but the idea is the same.

The parser does not report lexical differences. For example,

    <foo a1="one" a2="two">1 &amp; 2 are &lt; 3</foo>

and a mess like

    <foo a1 = "one" a2 = "two" ><![CDATA[1 & 2 are < 3]]></foo>

mean exactly the same thing and are reported the same; the XSLT processor will never know the original looked one way or the other.
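To make that concrete, here is a minimal sketch using the SAX parser in Python's standard library. The choice of Python and the names `Recorder` and `report` are assumptions for illustration only; nothing here is specific to any particular XSLT processor. It records what the parser reports for each of the two documents and compares the results.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class Recorder(ContentHandler):
    """Hypothetical handler that records the logical events a SAX parser reports."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attribute order is not significant, so sort for a stable comparison.
        self.events.append(("start", name, sorted(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver one run of text in several chunks; merge them.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))


def report(xml_text):
    """Parse xml_text and return the list of recorded events."""
    handler = Recorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


tidy = '<foo a1="one" a2="two">1 &amp; 2 are &lt; 3</foo>'
messy = '<foo a1 = "one" a2 = "two" ><![CDATA[1 & 2 are < 3]]></foo>'

tidy_events = report(tidy)
messy_events = report(messy)

for event in tidy_events:
    print(event)
print("same report for both:", tidy_events == messy_events)
```

Both inputs should yield one start event for 'foo' with attributes a1='one' and a2='two', one run of character data, and one end event. Whatever application sits behind the parser, such as an XSLT processor, receives the same report either way.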
The processor just knows that the following logical information items exist and have this relationship to each other:

    element type 'foo' in no namespace
      |
      \__attribute name 'a1', value character data 'one'
      |
      \__attribute name 'a2', value character data 'two'
      |
      |__character data '1 & 2 are < 3'

The processor is required to treat this information as if it were structured according to the XPath/XSLT node tree model, like this:

    element node named 'foo' in no namespace
      |
      \__namespace node binding prefix 'xml' to name
      |  'http://www.w3.org/XML/1998/namespace'
      |
      \__attribute node named 'a1', value character data 'one'
      |
      \__attribute node named 'a2', value character data 'two'
      |
      |__text node encapsulating '1 & 2 are < 3'

A DOM parser uses a similar kind of tree of nodes that is implicit through the interfaces it provides. However, this tree is not entirely compatible with an XPath/XSLT tree, and it requires more memory than it should, so AFAIK most XSLT processors, if they take a DOM document as input, walk the DOM tree and build their own XPath/XSLT tree from it, so they can discard the DOM. This is slow, too, so most XSLT processors prefer to use a SAX parser when possible.

A SAX parser is event-based and just zips through the document once, reporting what it finds along the way, by calling methods that the application has implemented to handle the reported events.

   - Mike
____________________________________________________________________________
  mike j. brown                   |  xml/xslt: http://skew.org/xml/
  denver/boulder, colorado, usa   |  resume: http://skew.org/~mike/resume/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list