Re: [xsl] Generic stylesheet to flatten XML hierarchy
Subject: Re: [xsl] Generic stylesheet to flatten XML hierarchy
From: "C. M. Sperberg-McQueen" <cmsmcq@xxxxxxxxxxxxxxxxx>
Date: Fri, 4 Dec 2009 19:35:21 -0700
On 4 Dec 2009, at 12:37 , Sara Mitchell wrote:
...
With input like this: <rss ...some attributes> ... </rss>
I would like XML output like this:
<root>
<row>
<rss-attr1>value</rss-attr1>
...
</row>
<row>...again rss attributes, channel attributes, non-repeating children of channel followed by fields for second item </row>
...more rows ...
</root>
I'm having trouble seeing exactly what should be going on here, because I can't see anything in your sample input (elided here without loss of generality) that gives rise to the name 'rss-attr1'. It's hard to correlate input with output if all the values are spelled 'value' and some details in one half of the input / output pair correspond to ellipses in the other.
This example is for a single level of repeating descendants, but my solution has to be able to handle any level of repeating descendants. Moreover, the stylesheet has no knowledge of the structure of the input document.
My very strong gut reaction here is to suspect that such an absolutely generic transformation is unlikely to produce helpful (or: meaningful) output in some unknown but possibly large percentage of cases.
Perhaps the transformation you have in mind is intended to work generically on all XML documents that follow certain conventions in structuring the information they represent? Can you say what those conventions are?
Perhaps you have a very clear understanding of the transform you want, but so far this discussion has not elicited a clear description from you. The following questions are intended to try to elicit some more clarity.
In a generic XML document, there are elements with parents, left and right siblings, children, descendants, and attributes.
In a generic table, there are rows and columns. Each row but the first or last has a predecessor and a successor, and ditto each column but the first or last.
What is the relationship between the elements, attributes, containment and sibling relations in the input, and the rows and columns and their sequence relations in the output?
Given your output table, should I expect to have all the information present in the XML? Can I recreate the XML from your table?
Do all your rows have the same number of columns? (I suppose they must, or it's not much of a table, but perhaps I'd better check?)
When does an XML document give rise to a single row in the output table? When does it give rise to exactly three rows? When does the resulting table have exactly one column?
What information do the labels of columns convey?
What tables would you want to produce for the documents
(1) <e/>
(2) <e><e n="23"/><e n="45">Pax</e></e>
(3) <table>
<row a="1" b="2" c="34">998</row>
<row a="2" b="22" c="34">999</row>
<row a="3" b="2" c="3">1000</row>
<row a="4" b="24" c="">1001</row>
<row a="5" x="Viva Villa!" c="34">998</row>
</table>
(4) <p>This isn't mixed content, because the schema says I'm a string.</p>
?
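To make the column questions concrete for document (3), here is a minimal sketch in Python, using the standard xml.etree.ElementTree module (the language is chosen only for illustration; the thread itself is about XSLT). It assumes, purely for the sake of discussion, that every attribute name becomes a column label. Nothing in the thread establishes that mapping, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Document (3) from the message, copied verbatim.
doc3 = """<table>
<row a="1" b="2" c="34">998</row>
<row a="2" b="22" c="34">999</row>
<row a="3" b="2" c="3">1000</row>
<row a="4" b="24" c="">1001</row>
<row a="5" x="Viva Villa!" c="34">998</row>
</table>"""

root = ET.fromstring(doc3)

# ASSUMPTION: one column per distinct attribute name.
attr_sets = [set(row.attrib) for row in root]

# The union of attribute names over all rows gives the candidate columns.
columns = sorted(set().union(*attr_sets))

# For each row, the columns it has no attribute for.
missing = [sorted(set(columns) - names) for names in attr_sets]

print(columns)
for row, gaps in zip(root, missing):
    print(row.attrib, "missing:", gaps)
```

Under that assumption the rows do not all carry the same attributes, so a flattening rule has to say what goes in the empty cells. It also has to say whether the empty string in c="" on the fourth row counts as a value or as an absence.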
I have a solution that works ok by traversing the input document in doc order -- but it does not handle the siblings of repeating nodes that are not themselves repeating.
I have thought of doing this the opposite way, get a key of all repeating nodes and process only those at the lowest depth to generate rows. I haven't actually written the logic.
I gather that the tables you want to generate have something to do with multiple occurrences of elements with the same name. Does adjacency matter, or would
<a><b/><b/><b/><c/><c/><c/></a>
be treated differently from
<a><b/><c/><b/><c/><b/><c/></a>
? (Assume if you like, for purposes of discussion, that the b and c and a elements all have interesting attributes.)
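The adjacency question can be made concrete with a small Python sketch (standard library only; the function names are hypothetical and are not taken from any stylesheet in the thread). It compares the two documents first by how often each child name occurs and then by the sequence in which the names occur.

```python
import xml.etree.ElementTree as ET

grouped = ET.fromstring("<a><b/><b/><b/><c/><c/><c/></a>")
interleaved = ET.fromstring("<a><b/><c/><b/><c/><b/><c/></a>")


def child_names(parent):
    """Child element names in document order."""
    return [child.tag for child in parent]


def name_counts(parent):
    """How often each child element name occurs, ignoring order."""
    counts = {}
    for child in parent:
        counts[child.tag] = counts.get(child.tag, 0) + 1
    return counts


# Same names with the same frequencies in both documents?
same_counts = name_counts(grouped) == name_counts(interleaved)

# Same names in the same sequence in both documents?
same_order = child_names(grouped) == child_names(interleaved)

print("same counts:", same_counts)
print("same order:", same_order)
```

A flattening rule that looks only at which element names repeat sees the two inputs as identical, while a rule that respects document order does not. Which of the two is wanted is exactly what the question above asks.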
Any better ideas would be welcome.
Your example reminds me of the contortions I've seen people go to trying to represent structured information in RFC 822 attribute-value pairs. So the best idea I have at the moment is: Save yourself! Don't do it!
But probably you know exactly what you're doing, there is a perfectly reasonable algorithm for what you want, and I just haven't understood.
hth
--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************