Subject: RE: [xsl] xmllint -format, xsltproc and CDATA section
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 28 Aug 2003 15:32:00 -0400
At 10:55 AM 8/28/2003, Taro wrote (to Mike):
Your book talks about canonical documents, and what I want is a tool that indents a document without changing its semantics.
But that's the nub of the problem.
Define the "semantics" of white space, and we're done. :->
In SGML, in which the DTD is required for a document to be processed, it is possible to define "insignificant" whitespace by reference to an element's content model. (If #PCDATA appears, whitespace is significant.)
In XML, in which a DTD may or may not be processed, it's impossible in the general case to define what whitespace is significant (and must therefore be left alone) and what is insignificant (and may be munged and remunged without damage). To ameliorate this, XSLT gives you xsl:preserve-space and xsl:strip-space, which allow you some control by element type. This makes the problem more tractable in XSLT, providing you're willing to hand-wire the semantics in at that level.
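To make the hand-wiring concrete, here is a minimal sketch of an indenting stylesheet. The element names "pre" and "line" are hypothetical stand-ins for whatever elements carry significant whitespace in your own vocabulary; only you (or your DTD) can say which ones those are.

```xml
<?xml version="1.0"?>
<!-- Sketch only: "pre" and "line" are hypothetical element names. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Declare whitespace-only text nodes insignificant by default ... -->
  <xsl:strip-space elements="*"/>

  <!-- ... except in elements where we assert that whitespace matters.
       A named element takes precedence over the "*" wildcard. -->
  <xsl:preserve-space elements="pre line"/>

  <!-- Ask the serializer to indent. A processor may still add whitespace
       inside mixed content, so this is not safe for every document type. -->
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity transform: copy everything through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Saved under a name of your choosing (say, indent.xsl), it could be run as `xsltproc indent.xsl input.xml`. Note that xsl:strip-space only removes text nodes consisting entirely of whitespace; whitespace inside a text node that also contains other characters is left as it is.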
It'd be nice of xmllint to leave CDATA marked sections alone, but that's just the tip of the iceberg. (Think of how much poetry on the web is marked up with <pre> to control whitespace. Ugh.)
Cheers, Wendell
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list