
docbook, xhtml and absolute filenames for images

Posted: Thu May 22, 2014 7:32 pm
by fsteimke
Hi,

I have a DocBook document with an image declared as an unparsed entity:

Code: Select all

<!DOCTYPE article [
<!ENTITY img SYSTEM "image.png" NDATA PNG>
]>
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
<title>Test Document</title>
<para>Contains a <mediaobject><imageobject><imagedata entityref="img"/></imageobject></mediaobject></para>
</article>
Using the DocBook XSL 1.78.1 stylesheets that come bundled with oXygen 16.0, I try to transform this to XHTML. However, the image is not visible. The problem is:
  • The stylesheet uses the unparsed-entity-uri() function. Its value is "file:/C:/temp/image.png" when using the Saxon engine.
  • The process.image template in graphics.xsl has to generate the @src attribute of the html:img element.
  • It tries to determine whether the given filename is an absolute filename. If so, it leaves it untouched.
  • The important condition is "not(contains($output_filename, '://'))". This condition is true for the result of the unparsed-entity-uri() function, because there is only one slash after the colon, not two, so the absolute URL is not recognized as absolute.
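
To make the mismatch concrete, here is a minimal standalone sketch. It is not the code from graphics.xsl; it only applies the condition quoted above to the value I get from Saxon. The messages are made up for illustration.

Code: Select all

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Value reported above for unparsed-entity-uri() under Saxon -->
  <xsl:variable name="output_filename" select="'file:/C:/temp/image.png'"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Same test as quoted from process.image: looks for '://' only -->
      <xsl:when test="not(contains($output_filename, '://'))">
        <xsl:text>treated as relative</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>treated as absolute</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

Run against any input document, this prints "treated as relative", because the string contains ":/" but never "://".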

Obviously, there is a mismatch. The question is whether this is a correct URL for an absolute filename.
If it is not, then there is a problem with the unparsed-entity-uri() function, since it should never return an invalid URL.
If it is, then there is a problem with the process.image template, since it should be able to process URLs with absolute filenames.
I asked this question on the docbook-apps mailing list and got a funny answer: it depends on your definition of correctness. Hmmm. So I had a look at Wikipedia for file URLs and found the general scheme:

Code: Select all

file://host/path
The host can be omitted, resulting in three slashes. But unparsed-entity-uri() still gives only one ...

The advice on how to deal with the situation was: avoid absolute filenames and feel free to post a bug report.

Frankly, I'm lost. Is this a bug? If so, who is wrong? Is the implementation of unparsed-entity-uri() in Saxon incorrect, or is there a bug in the stylesheets, since they can't handle correct URLs?

And, last but not least, do you have better advice than just avoiding absolute filenames?

Sincerely,
Frank

Re: docbook, xhtml and absolute filenames for images

Posted: Fri May 23, 2014 9:02 am
by Radu
Hi,

Thanks for the details.

This is a cross post from the Docbook Apps Users List:

https://lists.oasis-open.org/archives/d ... 00041.html

I will look into this and try to reply to you on the Docbook Apps users list.

Regards,
Radu

Re: docbook, xhtml and absolute filenames for images

Posted: Fri May 23, 2014 10:31 am
by Radu
Hi Frank,

Actually, I will answer some of your questions here:
"Obviously, there is a mismatch. Question is, whether this is a correct URL for an absolute filename."
Yes, file:/path (with only one slash, for example file:/C:/temp/image.png) is a correct URL for a local file on Windows. file:///path (three slashes, empty host) is an equivalent form. The form with exactly two slashes (file://path) is not correct for a local resource, because whatever follows the two slashes is read as a host name, and a local file path has none.
"If it is, then there is a problem with the process.image template, since it should be able to process URLs with absolute filenames."
So indeed in the XSL:

OXYGEN_INSTALL_DIR\frameworks\docbook\xsl\xhtml\graphics.xsl

the template matching d:imagedata calls "process.image", which at some point computes the "output_filename" variable from a variable that was itself computed by calling "mediaobject.filename". The computation of output_filename looks like this in the stylesheet:

Code: Select all

<xsl:variable name="output_filename">
  <xsl:choose>
    <xsl:when test="@entityref">
      <xsl:value-of select="$filename"/>
    </xsl:when>
    <!--
      Moved test for $keep.relative.image.uris to template below:
      <xsl:template match="@fileref">
    -->
    <xsl:otherwise>
      <xsl:value-of select="$filename"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:variable>
Both branches simply copy $filename, so there seems to be no attempt in the stylesheet to make a relative reference out of the absolute one.
In my opinion this looks like a bug and you should report it; the people on the DocBook bugs list are very responsive.
The XSL:

OXYGEN_INSTALL_DIR\frameworks\docbook\xsl\common\common.xsl

contains a template called "relative-uri", from which you could probably borrow some code to make the reference relative to the URL of the XML document currently being processed.
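
As a rough illustration of the idea only (this is not the "relative-uri" template from common.xsl), a much simpler helper could strip the document's directory from the front of the absolute URI. The template name "strip.base.dir" and its "base.dir" parameter are hypothetical, and how the directory of the source document is obtained is left open here:

Code: Select all

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical helper, not part of the DocBook XSL distribution -->
  <xsl:template name="strip.base.dir">
    <xsl:param name="uri"/>
    <xsl:param name="base.dir"/>
    <xsl:choose>
      <!-- The URI lies at or below the base directory: drop the prefix -->
      <xsl:when test="$base.dir != '' and starts-with($uri, $base.dir)">
        <xsl:value-of select="substring-after($uri, $base.dir)"/>
      </xsl:when>
      <!-- Anything else is returned unchanged -->
      <xsl:otherwise>
        <xsl:value-of select="$uri"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demonstration with the values from this thread -->
  <xsl:template match="/">
    <xsl:call-template name="strip.base.dir">
      <xsl:with-param name="uri" select="'file:/C:/temp/image.png'"/>
      <xsl:with-param name="base.dir" select="'file:/C:/temp/'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>

With these values the output is "image.png". Images outside the document's directory would keep their absolute URI, so this covers only the simplest case.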
"And, last but not least, do you have a better advice than just avoid absolute filenames?"
Entity references are probably not used much anymore (DocBook 5 is RELAX NG based, and mixing RELAX NG with DTD subsets is somewhat strange), so the DocBook XSLs have kept this primitive way of handling them.

Hope this helps.
Regards,
Radu

Re: docbook, xhtml and absolute filenames for images

Posted: Sat May 24, 2014 9:47 am
by fsteimke
Thanks Radu.

I have submitted a bug report.

Yes, it's true, entities and DocBook 5 don't go together very well. But after all, this bug does not depend on the use of entities. The stylesheets do not support the syntax with only one slash after the colon for absolute file URLs. So there is a good chance of running into trouble whenever absolute file URLs for images are calculated or constructed in an automated way. Using the unparsed-entity-uri() function is just one example where the value is not written manually.
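
Just to sketch what a more tolerant check might look like (this is my own assumption, not necessarily what a fix in the stylesheets would do), the test could accept a single slash after the colon as well. The template name "is.absolute.uri" is hypothetical:

Code: Select all

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: returns 1 for absolute references, 0 otherwise -->
  <xsl:template name="is.absolute.uri">
    <xsl:param name="uri"/>
    <xsl:choose>
      <!-- ':/' covers file:/C:/..., file:///..., http://... and C:/... -->
      <xsl:when test="starts-with($uri, '/') or contains($uri, ':/')">1</xsl:when>
      <xsl:otherwise>0</xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Demonstration with the value returned by unparsed-entity-uri() -->
  <xsl:template match="/">
    <xsl:call-template name="is.absolute.uri">
      <xsl:with-param name="uri" select="'file:/C:/temp/image.png'"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>

This prints "1" for the single-slash form. A relative path that happens to contain ":/" would also be classified as absolute, so it is a heuristic, not a URI parser.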

Thanks again,
Frank

Re: docbook, xhtml and absolute filenames for images

Posted: Tue Jan 06, 2015 12:11 am
by fsteimke
Hi Folks,

The bug in the DocBook stylesheets has been fixed, see http://sourceforge.net/p/docbook/bugs/1336/

Sincerely,
Frank

Re: docbook, xhtml and absolute filenames for images

Posted: Tue Jan 06, 2015 9:34 am
by Radu
Hi Frank,

Thanks for updating the thread.

When a new stable version of the DocBook XSLs is officially released, we will probably integrate it into oXygen.

It's a pity that SourceForge does not seem to have a way to link commits to issues. You could try asking Robert Stayton on the open issue which XSLs he changed, and then port those changes to your XSLs.

Regards,
Radu