Re: [xsl] Why does the doc() function reject non-ASCII characters in the filename?

Subject: Re: [xsl] Why does the doc() function reject non-ASCII characters in the filename?
From: "G. Ken Holman" <gkholman@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 05 Oct 2012 11:26:10 -0400

At 2012-10-05 15:22 +0000, Costello, Roger L. wrote:
> I was given a large number of XML files to convert to HTML using an
> XSLT program I have written.
>
> I collected all the filenames into a file.
>
> In my XSLT program I have a loop that:
>
>     - reads in a filename
>     - uses the filename in the doc() function to read in the file
>     - processes the file
>
> The problem is that some filenames contain odd characters. Here is
> an example of a filename:
>
>
> Notice the two dots over the 'y' character (diaeresis).
>
> When my XSLT program gets to that filename it throws an error:
>
> Failed to read input file ...

The argument to doc() isn't a filename; it is a URI.

> Why do I get that error?
>
> Why does the doc() function care if the filename contains a diaeresis?

It cares that the URI is properly formed. A character such as 'y' with a diaeresis is not allowed literally in a URI and has to be percent-encoded. Have you tried running encode-for-uri() on the filename so that it follows the RFC 3986 rules?
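
For illustration only, here is a minimal XSLT 2.0 sketch of that suggestion. Everything named in it is an assumption, not something from the original program: the list file 'filenames.txt', the parameter name list-uri, the named template main, and the idea that the list holds one bare filename per line in the same directory as the stylesheet. Note that encode-for-uri() also escapes "/" and ":", so it suits a single path segment; for a full path, iri-to-uri() may be the better fit. Whether the escaped URI then resolves to the file on disk can depend on the processor and platform.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: a plain-text list with one bare filename per line,
       located alongside the XML files and this stylesheet -->
  <xsl:param name="list-uri" select="'filenames.txt'"/>

  <!-- Hypothetical entry point, invoked as the initial named template -->
  <xsl:template name="main">
    <!-- Split the list into lines and skip blank ones -->
    <xsl:for-each select="tokenize(unparsed-text($list-uri), '\r?\n')[normalize-space()]">
      <!-- Percent-encode the name as UTF-8 octets; U+00FF becomes %C3%BF -->
      <xsl:variable name="uri" select="encode-for-uri(.)"/>
      <!-- Hand the document to whatever templates already do the conversion -->
      <xsl:apply-templates select="doc($uri)"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The escaping is applied at the point of use rather than in the list file, so the list can keep the names exactly as they appear on disk.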

> Second question: do you have suggestions on how to locate in my file
> (that contains all the filenames) the offending filenames?

By using doc-available() on each URI first to get a true()/false() response.
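
A rough sketch of that check, again under assumptions of my own: the list file 'filenames.txt', the parameter list-uri, and the named template main are hypothetical, and the list is taken to be plain text with one entry per line. Each entry is tested as it stands, and the ones that cannot be read are reported with xsl:message. One caveat: as I read the XPath 2.0 function library, doc-available() is allowed to raise an error rather than return false() when the string is not a valid URI at all, so behaviour here may vary by processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: a plain-text list with one entry per line -->
  <xsl:param name="list-uri" select="'filenames.txt'"/>

  <!-- Hypothetical entry point, invoked as the initial named template -->
  <xsl:template name="main">
    <xsl:for-each select="tokenize(unparsed-text($list-uri), '\r?\n')[normalize-space()]">
      <xsl:choose>
        <!-- true() only if doc() would return a document node for this entry -->
        <xsl:when test="doc-available(.)">
          <xsl:apply-templates select="doc(.)"/>
        </xsl:when>
        <xsl:otherwise>
          <!-- Report the offending entry and carry on with the rest -->
          <xsl:message>Cannot read: <xsl:value-of select="."/></xsl:message>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the test guards the doc() call, one bad entry no longer stops the whole run, and the messages give you the list of entries to look at.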

I hope this helps.

. . . . . . . . Ken

Contact us for world-wide XML consulting and instructor-led training
Free 5-hour lecture: http://www.CraneSoftwrights.com/links/udemy.htm
Crane Softwrights Ltd.            http://www.CraneSoftwrights.com/s/
G. Ken Holman                   mailto:gkholman@xxxxxxxxxxxxxxxxxxxx
Google+ profile: https://plus.google.com/116832879756988317389/about
Legal business disclaimers:    http://www.CraneSoftwrights.com/legal
