Subject: XMLish code in HTML and Microsoft bashing [long] (was: Re: efficient filtering...)
From: Mike Brown <mike@xxxxxxxx>
Date: Tue, 4 Apr 2000 23:10:39 -0600 (MDT)

David Carlisle wrote:
> well formed????????
> 
> <!--[if gte mso 9]><xml>
> <![endif]

It's all just character data inside an HTML comment, which is perfectly
valid HTML. Which parts of that character data are intended to be
interpreted as XML is something known only to the application that is
designed to recognize and interpret those comments in a special way.
It would be no different to have in an HTML document:

<!-- <?! mike brown has mustard on his chin :-O>
<!here comes some xml!>
<myData>
  <foo>bar</foo>
</myData>
<!!!there was some xml!!!>
-->

...so long as the application interpreting that comment knew that anything
between <!here comes some xml!> and <!!!there was some xml!!!> in an HTML
comment was to be interpreted as XML. 
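
To make that concrete, here is a minimal Python sketch of such an
application. The marker strings are the made-up ones from the example
above; the function name extract_embedded_xml and the regex-based scan for
comments are assumptions for illustration only, not a description of how
any Microsoft product actually reads its comments.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical markers, taken from the made-up example in this message.
START = "<!here comes some xml!>"
END = "<!!!there was some xml!!!>"


def extract_embedded_xml(html: str) -> List[ET.Element]:
    """Parse whatever sits between START and END inside HTML comments."""
    fragments = []
    # Roughly: a comment is everything between "<!--" and the next "-->".
    for comment in re.findall(r"<!--(.*?)-->", html, flags=re.DOTALL):
        pattern = re.escape(START) + r"(.*?)" + re.escape(END)
        for body in re.findall(pattern, comment, flags=re.DOTALL):
            # Only this slice is treated as XML; the rest of the comment
            # stays plain character data.
            fragments.append(ET.fromstring(body.strip()))
    return fragments


html = """<html><body><p>Hello</p>
<!-- <?! mike brown has mustard on his chin :-O>
<!here comes some xml!>
<myData>
  <foo>bar</foo>
</myData>
<!!!there was some xml!!!>
-->
</body></html>"""

data = extract_embedded_xml(html)
print(data[0].tag, data[0].find("foo").text)
```

An HTML parser that knows nothing about the markers sees one comment and
moves on; only the code that was told about them finds any XML in it.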

I agree, though, that if they were going to put something that looks like
XML into a document, they should have followed the XML 1.0 rule that names
beginning with 'xml' (in any mix of upper and lower case) are reserved.
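
For reference, XML 1.0 reserves names that begin with "xml" in any
combination of upper and lower case. A check for that could be sketched
as follows; is_reserved_xml_name is a hypothetical helper name, and it
looks only at the name string, not at a whole document.

```python
def is_reserved_xml_name(name: str) -> bool:
    """True if the name starts with 'xml' in any mix of upper/lower case."""
    return name[:3].lower() == "xml"


# The <xml> element from the quoted Office output trips the check;
# an ordinary application-defined name does not.
for candidate in ("xml", "XMLdata", "myData", "foo"):
    print(candidate, is_reserved_xml_name(candidate))
```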

> It's the fact that Microsoft systems produce this kind of rubbish that
> makes many people suspicious of Microsoft's actions and intentions in
> the XML (and other) areas.

Since there were and still are no standards for putting XML data into HTML
comments, I think it would be an overreaction to add this particular
feature to our lengthy list of Microsoft's crimes against humanity.

Other than what is either an erroneous DOCTYPE declaration or the
misplacement of xmlns attributes in the <html> element, they haven't
violated HTML 4.0. They haven't declared their embedded XML impersonation
as actually being XML 1.0 (which it could never truly be, since it's not a
separate entity). This is hardly subversion of standards.

What it is symptomatic of is a corporation's unwillingness to tell people
that an HTML document is not a good place to store complex,
application-specific formatting information. Instead, they "made it fit"
in such a way that nothing breaks, all the formatting data is there for
the applications that want it, and the HTML version of the document is
virtually indistinguishable from the proprietary .doc or .rtf versions.

It kept somebody employed, sure, but it also accomplished some arguably
useful goals ("don't lose anything when you Save As HTML, but produce a
usable HTML document"). Some programmer came up with a way to use both XML
and HTML and do it 98% within the standards. Now that XHTML is an option,
it would be almost trivial to make it 100%. I don't feel the fact that a
company would put someone up to this is a reason to hate that company.
There are plenty of much more legitimate reasons.

And if you're going to complain, offer some constructive suggestions about
what they should have done instead. What would your solution have been?

<!-- //begin MSO formatting data that is unreadable crap
and isn't meant to resemble XML, Base64, or anything else
that has a standard you can say is being undermined //
ADLF7T%*@%GBNBKS:TU3875DAF@$G@($*!FAFAFADJGUR@$
//end -->
?

That would be ms-tnef, something far more worthy of despising Microsoft
for. :)

My constructive suggestion would be for MS to not be afraid to put out a
product that tells people "look, you can't do that because the specs
aren't robust enough for what you want to do, and we intend to respect the
guidelines set forth by the standards bodies". It would make my life
trying to process HTML form data (esp. XML docs sent as form data) much,
much easier...

   - Mike
___________________________________________________________
Mike J. Brown, software engineer, Webb Interactive Services
XML/XSL stuff: http://www.skew.org/    http://www.webb.net/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


