
RE: [xsl] What is a good way to style and show tabular data [snip]


Subject: RE: [xsl] What is a good way to style and show tabular data [snip]
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Mon, 18 Aug 2003 12:28:25 -0400

Abhishek,

Your line of thought is interesting, because you are treading into the midst of an area where the common XML dogma "separation of format from content" breaks down....

Peter made the classic argument for semantically meaningful names, etc. (with which I entirely agree), but your counterexample shows that the whole issue of "tabular data" is just the tip of a very large, deep iceberg, which threatens to split our ocean liner in two. One half tends towards modeling our content semantically, because we know that's a Good Idea (robust, scalable, long-lived, versatile, etc.). The other half tends towards giving up and modeling for presentation, because a generalized semantic model proves to be very difficult, whereas just modeling two-dimensional rows and columns is a clear way forward to our immediate goal. Why does a generalized semantic model prove to be difficult? One reason is that XML's "natural" tree structure is not capable, without enhancement, of modeling the n-dimensional matrices which often prove to underlie tabular renditions.

(And yes, folks, look at printed materials of any complexity and you will not uncommonly find 3-, 4- or more-dimensional matrices presented as tables.)

In view of these challenges, you might consider how the classic solution -- for example, the OASIS (CALS) table (which Peter mentions) -- goes about it. Model a conventional rows-and-columns (2-dimensional) table; annotate cells with attributes to identify their roles and semantic relations.

This is kind of a split-the-difference solution, which "solves" the problem for presentation, but not necessarily or entirely for generalized processing. This is because semantic relations of critical importance to the integrity of the data set are pushed out into the attribute names and values, thereby requiring special kinds of checks, validation etc. to keep everything together.
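A minimal sketch, in Python, of the kind of extra check this implies. The markup and the attribute names `row-head` and `col-head` are hypothetical, chosen only for illustration; they are not taken from the CALS model. The point is that a cell's relation to its headers lives in attribute values, so nothing in the tree structure itself stops a reference from dangling.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: a plain rows-and-columns table whose data cells point
# at their headers through "row-head" and "col-head" attributes.
# These names are illustrative only; they are not CALS attributes.
TABLE = """
<table>
  <row><cell/><cell id="q1">1-9</cell><cell id="q10">10+</cell></row>
  <row><cell id="bolt">Bolt</cell>
       <cell row-head="bolt" col-head="q1">0.50</cell>
       <cell row-head="bolt" col-head="q10">0.40</cell></row>
  <row><cell id="nut">Nut</cell>
       <cell row-head="nut" col-head="q1">0.20</cell>
       <cell row-head="nut" col-head="q100">0.15</cell></row>
</table>
"""

def dangling_references(table):
    """Return (attribute, value) pairs that point at no header cell."""
    ids = {c.get("id") for c in table.iter("cell") if c.get("id")}
    problems = []
    for cell in table.iter("cell"):
        for attr in ("row-head", "col-head"):
            ref = cell.get(attr)
            if ref is not None and ref not in ids:
                problems.append((attr, ref))
    return problems

table = ET.fromstring(TABLE)
print(dangling_references(table))  # the last cell refers to a missing "q100"
```

The table above renders perfectly well as rows and columns, yet one cell has lost its column semantics; only a purpose-built check of this sort would notice.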

The other approach is to forget completely how something is to be presented, and simply (heh) model everything to its own proper semantics. In the domain of publishing, for which XML was (at least partly) designed, this is often prohibitively difficult, since each table has its own semantics (prices for parts by quantities here, populations by region and age group over there, etc.), and this threatens indefinite extension of the tag set. If, however, you are working in a problem space where such data structures are regular and predictable, this is the preferred solution. Model it for what it is (a set of name-value pairs, an object hierarchy, whatever), and worry about how to display it later.
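To make "model it for what it is, display it later" concrete, here is a small Python sketch for the parts-and-prices case. The element and attribute names (`price-list`, `part`, `price`, `min-qty`) and the figures are invented for the example. The markup says nothing about rows or columns; the tabular view is derived afterwards.

```python
import xml.etree.ElementTree as ET

# Hypothetical semantic markup for "prices for parts by quantity".
# Nothing here says "row" or "column".
PRICES = """
<price-list currency="USD">
  <part name="Bolt">
    <price min-qty="1">0.50</price>
    <price min-qty="10">0.40</price>
  </part>
  <part name="Nut">
    <price min-qty="1">0.20</price>
    <price min-qty="10">0.15</price>
  </part>
</price-list>
"""

def as_rows(price_list):
    """Derive one possible tabular view: a header row, then one row per part."""
    quantities = sorted({int(p.get("min-qty")) for p in price_list.iter("price")})
    rows = [["Part"] + [f"{q}+" for q in quantities]]
    for part in price_list.iter("part"):
        by_qty = {int(p.get("min-qty")): p.text for p in part.iter("price")}
        # A part with no price at some quantity break gets an empty cell.
        rows.append([part.get("name")] + [by_qty.get(q, "") for q in quantities])
    return rows

rows = as_rows(ET.fromstring(PRICES))
for row in rows:
    print("\t".join(row))
```

The cost is visible too: `as_rows` only works for this one vocabulary, which is the "indefinite extension" problem when every table has its own semantics.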

As for the generalized n-dimensional solution, supporting querying, transposition etc. to get the tabular "views" -- that's still an interesting area for research.
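A toy sketch of what such a "view" might look like, written in Python rather than in any XML technology, and not a proposal for the general solution. It assumes the data is held as facts keyed by one coordinate per dimension; the dimension names and population figures are invented. A table is then a projection: pick one dimension for rows, one for columns, and fix the rest.

```python
# Each fact is keyed by one coordinate per dimension; the data are invented.
DIMENSIONS = ("region", "age_group", "year")
POPULATION = {
    ("North", "0-17", 2000): 120,
    ("North", "18+", 2000): 340,
    ("South", "0-17", 2000): 95,
    ("South", "18+", 2000): 410,
    ("North", "0-17", 2003): 130,
    ("South", "18+", 2003): 425,
}

def view(facts, rows, cols, fixed):
    """Project n-dimensional facts onto a 2-D grid.

    rows and cols name the dimensions to lay out; fixed maps every other
    dimension to the single value selected for this view.
    """
    r, c = DIMENSIONS.index(rows), DIMENSIONS.index(cols)
    selected = {
        key: value for key, value in facts.items()
        if all(key[DIMENSIONS.index(d)] == v for d, v in fixed.items())
    }
    row_labels = sorted({key[r] for key in selected})
    col_labels = sorted({key[c] for key in selected})
    grid = [[""] + col_labels]
    for rl in row_labels:
        line = [rl]
        for cl in col_labels:
            match = [v for k, v in selected.items() if k[r] == rl and k[c] == cl]
            line.append(match[0] if match else None)  # None marks an empty cell
        grid.append(line)
    return grid

by_region = view(POPULATION, "region", "age_group", {"year": 2000})
by_age = view(POPULATION, "age_group", "region", {"year": 2000})  # transposed
```

Transposition here is just a different choice of `rows` and `cols`; neither orientation is stored in the data.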

Cheers,
Wendell




At 10:57 AM 8/18/2003, you wrote:
Peter,

I totally agree with you. I truly believe it would be better to have
"rich" and "well defined" markup than to try and go crazy in the
stylesheet.

I identify with the "list" example you mentioned and have done a similar
thing. But that is when the "cells or data holders" were simply
associative in ONLY "one" direction.

When there is "cell associativity" in two dimensions (or directions) it
needs to be in a table form (more precisely, it needs to be in a
two-dimensional matrix).

But I have realized that there is a trade-off when trying to add markup
to the XML to make XSLs easier. So, I have tried to keep certain aspects
of the markup "open". For example, the list is a good way of eliminating
an unnecessary "2 column table".

Now, to the specifics:

This is how I have currently modeled the XML for the table/matrix.

<Matrix HeadFlow="Vertical | Horizontal">
        <!-- (optional) -->
        <MatrixHeadArray>
                <MatrixHeadCell>a</MatrixHeadCell>
                <MatrixHeadCell>b</MatrixHeadCell>
                <MatrixHeadCell>c</MatrixHeadCell>
        </MatrixHeadArray>
        <MatrixBodyArray>
                <MatrixBodyCell>d</MatrixBodyCell>
                <MatrixBodyCell>e</MatrixBodyCell>
                <MatrixBodyCell>f</MatrixBodyCell>
        </MatrixBodyArray>
</Matrix>

I initially had a <td>, <th> kind of structure, but I realized that it
was locked into a head that flowed horizontally only, so I added the
idea of a matrix that could flow in V or H directions and could be
transposed at will. Of course, in this structure a "sparse" matrix (one
with very few "filled" cells) will need a large number of "empty" cell
holders to position things properly.
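A minimal Python sketch of how a consumer might honour HeadFlow when laying out the Matrix markup shown above. The interpretation is an assumption: "Horizontal" is taken to mean each array is one row, and "Vertical" to mean each array is one column.

```python
import xml.etree.ElementTree as ET

MATRIX = """
<Matrix HeadFlow="Vertical">
  <MatrixHeadArray>
    <MatrixHeadCell>a</MatrixHeadCell>
    <MatrixHeadCell>b</MatrixHeadCell>
    <MatrixHeadCell>c</MatrixHeadCell>
  </MatrixHeadArray>
  <MatrixBodyArray>
    <MatrixBodyCell>d</MatrixBodyCell>
    <MatrixBodyCell>e</MatrixBodyCell>
    <MatrixBodyCell>f</MatrixBodyCell>
  </MatrixBodyArray>
</Matrix>
"""

def layout(matrix):
    """Return the display grid as a list of rows of cell text."""
    # One list of cell texts per head or body array, in document order.
    arrays = [[cell.text or "" for cell in array] for array in matrix]
    if matrix.get("HeadFlow") == "Vertical":
        # Each array becomes a column, so transpose arrays into rows.
        return [list(column) for column in zip(*arrays)]
    return arrays  # Horizontal (assumed default): each array is already a row

grid = layout(ET.fromstring(MATRIX))
print(grid)  # [['a', 'd'], ['b', 'e'], ['c', 'f']]
```

Because `zip` stops at the shortest array, every array must carry the same number of cells, which is exactly why a sparse matrix needs the empty placeholder cells.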

The other way to "enrich" the data would be to add attributes of "i and
j" to define positions of the cells so that position information is not
dependent on the "actual" placement in the XML.
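A sketch of the "i and j" variant in Python, assuming zero-based `i` (row) and `j` (column) attributes on each cell; the exact attribute semantics are an assumption, not part of the markup above. Only filled cells are stored, and the grid is rebuilt from the coordinates.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the Matrix markup: each cell carries its own
# row (i) and column (j), so document order no longer matters and empty
# cells are simply absent.
SPARSE = """
<Matrix>
  <MatrixBodyCell i="2" j="3">x</MatrixBodyCell>
  <MatrixBodyCell i="0" j="0">y</MatrixBodyCell>
</Matrix>
"""

def to_grid(matrix, transpose=False):
    """Rebuild a dense grid from positioned cells; "" fills the gaps."""
    cells = {}
    for cell in matrix.iter("MatrixBodyCell"):
        i, j = int(cell.get("i")), int(cell.get("j"))
        # Transposing is just swapping the two coordinates.
        cells[(j, i) if transpose else (i, j)] = cell.text or ""
    if not cells:
        return []
    n_rows = 1 + max(i for i, _ in cells)
    n_cols = 1 + max(j for _, j in cells)
    return [[cells.get((i, j), "") for j in range(n_cols)] for i in range(n_rows)]

matrix = ET.fromstring(SPARSE)
grid = to_grid(matrix)
flipped = to_grid(matrix, transpose=True)
```

The trade-off is the one discussed elsewhere in this thread: the positions are now data in attribute values, so duplicates or out-of-range coordinates have to be caught by a separate check.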

So I am wondering what would be a good way to go about this. Please do
let me know.

Thanks,

Abhishek Sanwal
HP - Houston Campus
abhishek.sanwal@xxxxxx


-----Original Message-----
From: Peter Flynn [mailto:peter@xxxxxxxxxxx]
Sent: Sunday, August 17, 2003 4:23 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] What is a good way to style and show tabular data [snip]

On Fri, 2003-08-15 at 23:41, SANWAL, ABHISHEK (HP-Houston) wrote:
> What is a good way to style and show tabular data

In a table, usually.

> and what is a good way to store Tabular data in XML

There is only one "good" way to store data in XML -- any data -- and
that is in a well-designed structure which preserves as much information
about the data as is needed for you to work with it successfully. Storing
data (or text) in a suboptimal structure, with misleading names, poor
content models, ambiguous attributes, failing to take advantage of some of
the obvious features of XML, perpetrating Tag Abuse, creating Pernicious
Mixed Content, and all the other nasties in the corner; and in a manner
which loses some metadata as the data is stored, is a recipe for tears and
grief, as we see daily here and elsewhere :-)

Get the data model right first, and the rest will follow.
Get the data model wrong, and you will spend excessive time
having to undo it or cope with it while it remains wrong.

But in a production situation this is not always possible: data from
elsewhere in a silly format, management insisting white is black when you
know it's actually green, client or vendor insistence on unsuitable, often
proprietary, formats for political reasons, etc.

For tabular data, there are many solutions, and without seeing an example
any advice has to be general. My personal preference is to use meaningful
names (ie not TR, TD, and TH :-) and to ensure that sufficient metadata
is stored to enable the original structure to be recreated (the "round
trip" test). But in the pressure to get stuff done, you may choose to do
otherwise.

But there is one classical case where data is often stored as a table
quite wrongly: the labelled list. Consider:

        USA     You pay insurance for healthcare, only people
                below a (very low) income level get state-funded care

        UK      You get healthcare free at the point of usage in
                most cases, paid for via income deductions, but
                insurance-funded ("private") healthcare is available

(Forgive me USA and UK if I have the facts wrong :-). To most
wordprocessor users this looks like a table, because it's the only
way their wordprocessor has of formatting it. But it's not a table:
it's a list whose format resembles a columnar layout. Storing it as
a list in XML lets you choose how to format it for output very easily.
Storing it as a table risks making it uneditable and unformattable
except with much greater difficulty.
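A small Python sketch of that point. The element names (`labelled-list`, `item`, `label`, `body`) are hypothetical and the body text is abbreviated from the example above. Stored as a list, the same data can be laid out in columns or as run-in paragraphs with equal ease.

```python
import xml.etree.ElementTree as ET

# Hypothetical list markup; the element names are illustrative only.
SYSTEMS = """
<labelled-list>
  <item><label>USA</label><body>Insurance-funded, with state care below an income level</body></item>
  <item><label>UK</label><body>Free at the point of use, with private care available</body></item>
</labelled-list>
"""

def pairs(labelled_list):
    """Extract (label, body) pairs in document order."""
    return [(item.findtext("label"), item.findtext("body"))
            for item in labelled_list.iter("item")]

def as_columns(labelled_list, width=8):
    """Columnar layout: label padded to a fixed width, then the body."""
    return [f"{label:<{width}}{body}" for label, body in pairs(labelled_list)]

def as_paragraphs(labelled_list):
    """Run-in layout: 'Label: body'."""
    return [f"{label}: {body}" for label, body in pairs(labelled_list)]

doc = ET.fromstring(SYSTEMS)
print("\n".join(as_columns(doc)))
print("\n".join(as_paragraphs(doc)))
```

Had the data been stored as table rows and cells, the second layout would first have to rediscover that column one holds labels.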

> especially when the only purpose for that data is to be styled by
> stylesheets into XSL-FO (PDF) and HTML - such that the stylesheets
> are efficient, extensible and not too cumbersome.

There are three common table models: CALS, SASOUT, and HTML (see
Chapter 2, section 3.7 of my book on SGML and XML Tools for a
detailed description). Another one, ISO/IEC TR 9573 I have lost
track of -- maybe it still exists. CALS is huge, but lets you store
all kinds of fine detail about layout and appearance; SASOUT tries
to let you define the relationships between rows, columns, and cells,
but I don't think it ever really caught on; HTML is simplistic to the
point of crudity, but better supported in browsers than anything else
(but harder to use for good quality print).

The better the quality of XML markup, the easier it is to work with.
If your markup allows you to record what the data is, and why it is
stored in this way, it's usually much easier to write a stylesheet
to format it than having to spend large amounts of time coding large
nests of conditionals to try and deduce aspects of the nature of the
data which ought to have been stored explicitly. But as I said, this
is the ideal: in practice most people stuff it into TR, TD, and TH
and hope for the best.

///Peter




XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list





======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================





