
[xsl] unparsed-text and normalize-space when parsing CSV files

Subject: [xsl] unparsed-text and normalize-space when parsing CSV files
From: "Hank Ratzesberger xml@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 5 Dec 2014 19:36:21 -0000


I ran into a strange issue while running transforms on Windows under
Cygwin. I was trying to parse a CSV file.

The problem was that I was defining a variable for the newline, which
I expected would match the native system:

<xsl:variable name="nl">

and then parsing the file like this:

<xsl:variable name="lines" select="tokenize($csv, $nl)" as="xs:string+" />

but it turns out that this does not really solve the problem of
line endings from mixed sources, since one or the other could have
been edited on a different platform. So I think this is a common
issue when parsing these kinds of files.

I was able to rely on normalize-space() to remove an extra CR, but
that function also collapses internal runs of whitespace, so it could
make unwanted changes to other content.
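
To make the problem concrete, here is a minimal sketch of tokenizing on
either line ending instead of on a single newline variable. It assumes
an XSLT 2.0 processor; the parameter name csv-uri, its default value
data.csv, and the template name main are hypothetical and only there
for illustration. The regex \r?\n matches an LF with an optional CR in
front of it, so CRLF and LF files split the same way and no stray CR is
left at the end of a line. A file that uses a lone CR as its line
ending would not be handled by this pattern.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: URI of the CSV file, resolved
       relative to the stylesheet when it is a relative URI. -->
  <xsl:param name="csv-uri" as="xs:string" select="'data.csv'"/>

  <!-- Read the whole file as a single string. -->
  <xsl:variable name="csv" as="xs:string"
      select="unparsed-text($csv-uri)"/>

  <!-- Split on CRLF or LF. The predicate drops empty strings,
       such as the one produced by a trailing newline. -->
  <xsl:variable name="lines" as="xs:string*"
      select="tokenize($csv, '\r?\n')[. ne '']"/>

  <!-- Hypothetical entry point: emit one element per line. -->
  <xsl:template name="main">
    <lines>
      <xsl:for-each select="$lines">
        <line><xsl:value-of select="."/></line>
      </xsl:for-each>
    </lines>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon this could be run by naming the initial template, for
example with -it:main on the command line. Unlike normalize-space(),
this leaves whitespace inside each line untouched. On a processor that
supports XPath 3.0, unparsed-text-lines() does the line splitting
itself.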

Can anyone recommend a safe way to handle this?

Thank you,

Hank Ratzesberger
