[xsl] Reflecting on: csv data to xml

Subject: [xsl] Reflecting on: csv data to xml
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Sun, 30 Jun 2013 09:49:25 +0200

The thread "csv data to xml" was triggered by a relatively simple
problem: converting CSV data to XML. There were one or two voices
advocating the use of Perl (or similar) "for this kind of problem" in
preference to XSLT, and there were claims that it would be a simple
matter to use XSLT's analyze-string... Now I'm not going to vote
either way - I'd just like to post some observations I made while
investigating this. If you are impatient, skip down to "conclusion".

I decided to implement this in Perl and was hoping to be able to
compare this with an equivalent implementation in XSLT, concentrating
on ease of development and maintainability. Ken's implementation
<http://www.CraneSoftwrights.com/resources/#csv> filled the XSLT slot.

I had a quick Perl 5 filter solution up and running in 30 minutes, no
program parameters, hard-coded names for document and row elements,
but using the first CSV line for obtaining the names for the cells.

Ten minutes of that time went into getting a couple of Perl packages
from CPAN, one for parsing CSV and another for writing XML, which
reduced the code I actually had to write to 23 lines.
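
To give a concrete picture of what such a filter does, here is a minimal
sketch. It is written in Python with only the standard library, not the
Perl and CPAN packages used for the timing above, and it is not the code
from this post. The function name csv_to_xml and the element names "doc"
and "row" are hypothetical stand-ins for the hard-coded names.

```python
import csv
import io
from xml.sax.saxutils import escape


def csv_to_xml(text, doc="doc", row="row"):
    """Convert CSV text to an XML string.

    The first CSV line supplies the element names for the cells.
    "doc" and "row" are assumed, hard-coded element names.
    """
    reader = csv.reader(io.StringIO(text))
    names = next(reader)  # header line: one element name per column
    out = [f"<{doc}>"]
    for record in reader:
        # zip() silently drops excess cells; no checking in this version
        cells = "".join(
            f"<{name}>{escape(value)}</{name}>"
            for name, value in zip(names, record)
        )
        out.append(f"  <{row}>{cells}</{row}>")
    out.append(f"</{doc}>")
    return "\n".join(out) + "\n"
```

Used as a filter it would be called along the lines of
sys.stdout.write(csv_to_xml(sys.stdin.read())). Like the quick first
version, it does no checking of names or row lengths.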

Considering this to be too sloppy, I spent some more time adding
a *nix-style CLI (for file names, element names, ...), data checking
(invalid element names, excess cells in a row), default element names
for cells (using "A", "B", ...), CLI documentation, etc.
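
The checks and defaults listed above might look roughly like this. Again
this is a Python sketch under stated assumptions, not the Perl code: all
function names are hypothetical, the name test is a simplified ASCII-only
approximation of the XML Name rule, and continuing the default names past
"Z" with "AA", "AB", ... is my assumption.

```python
import re

# Simplified, ASCII-only approximation of an XML element name (assumption;
# the real XML Name production permits many more characters).
NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def default_names(count):
    """Return spreadsheet-style default names: A, B, ..., Z, AA, AB, ..."""
    names = []
    for i in range(count):
        name = ""
        n = i
        while n >= 0:
            name = chr(ord("A") + n % 26) + name
            n = n // 26 - 1  # bijective base 26: no zero digit
        names.append(name)
    return names


def check_names(names):
    """Raise ValueError for any name that fails the simplified name test."""
    for name in names:
        if not NAME_RE.fullmatch(name):
            raise ValueError(f"invalid element name: {name!r}")


def check_row(names, record, line_no):
    """Raise ValueError if a row has more cells than there are names."""
    if len(record) > len(names):
        raise ValueError(
            f"line {line_no}: {len(record)} cells, "
            f"expected at most {len(names)}"
        )
```

Whether excess cells should be an error, as here, or merely a warning is
a design choice; the sketch takes the stricter option.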

Ken's solution lacks a few of the features I was able to add easily. I
can't say how difficult they would be to add to Ken's existing solution;
for some of those add-ons it might not be a matter of minutes.


Perl's CPAN is a great asset. Certainly, the quality of its offerings varies,
but the packages are tested and users report on their experience. (Why
doesn't XSLT have anything like it?)

Ken used a proprietary (?) solution for embedding documentation that can
be extracted into HTML. Now that's great, but it is a solitary answer to
the problem. Perl's pod is a somewhat clunky solution, but it is supported
by a rich toolset that ships with the Perl distribution. I consider the
existence of a documentation format that is defined along with the
language to be "state of the art" and essential for sustainable software
development.

XSLT is a "special purpose" language for XML handling and consequently
easy to use for that, but it is no better than the average language at
string processing.

