Support for data URIs in XQuery with Saxon in oXygen

Posted: Sun Aug 19, 2018 9:37 pm
by Martin Honnen
To find out whether I can write XQuery that generates XQuery code and then loads and runs it with the W3C standard function load-xquery-module (https://www.w3.org/TR/xpath-functions/# ... ery-module), I tested whether data URIs (https://en.wikipedia.org/wiki/Data_URI_scheme) can be used inside XQuery or XSLT.

To my surprise the following works with an XQuery Saxon 9.8.0.12 (tried EE and HE) transformation scenario in oXygen 20.1:

Code:
doc('data:application/xml,' || encode-for-uri('<root><foo>bar</foo></root>'))//foo
and returns

Code:
<foo>bar</foo>
so it seems Saxon inside of oXygen has support for data URIs.

That is a surprise as running Saxon 9.8 from the command line gives an error "FODC0002: I/O error reported by XML parser processing ... unknown protocol: data".

So my first question is: is that support for data URIs something added on top of Saxon in oXygen or is that a particular Saxon configuration?

For which functions or in which contexts should the use of data URIs work?

I also get it to work for other simple, if rather meaningless, cases like

Code:
unparsed-text('data:text/plain,' || encode-for-uri('This is a test.'))
but I then hoped to use it with the mentioned load-xquery-module function by providing such a data URI as the location hint of the XQuery module to load:

Code:
let $query as xs:string := 'xquery version "3.1"; module namespace foo = "http://example.com/foo"; declare function foo:f1() as xs:string { "test" };',
$module as map(*) := load-xquery-module('http://example.com/foo', map { 'location-hints' : 'data:application/xquery,' || encode-for-uri($query) })
return $module?functions(QName('http://example.com/foo', 'f1'))(0)()
but when I execute that with an XQuery Saxon 9.8 EE transformation scenario I get an error "XQuery module validation/execution is not supported. Please validate/execute the main XQuery file.".

Should the above work? Why does it fail, and why with that error?

Interestingly enough, when I try to reduce the sample to something that does not use data URIs but the same inline code with

Code:
let $query as xs:string := 'xquery version "3.1"; module namespace foo = "http://example.com/foo"; declare function foo:f1() as xs:string { "test" };',
$module as map(*) := map { 'functions' : map { QName('http://example.com/foo', 'f1') : map { 0 : function() { 'test' }}}}
return $module?functions(QName('http://example.com/foo', 'f1'))(0)()
then oXygen gives me the same error "XQuery module validation/execution is not supported. Please validate/execute the main XQuery file." although I can run that code fine with Saxon 9.8 EE from the command line outside of oXygen.

So I am not sure whether the error I get with the previous attempt of "load-xquery-module" is due to problems with the use of a "data" URI in the location-hints or due to the way oXygen calls/uses Saxon. I hope you can clarify that.

As a last test I tried to avoid inlining the XQuery code while still using a data URI with

Code:
let $query as xs:string := unparsed-text('test2018081903.xq'),
$module as map(*) := load-xquery-module('http://example.com/foo', map { 'location-hints' : 'data:text/plain,' || encode-for-uri($query) })
return $module?functions(QName('http://example.com/foo', 'f1'))(0)()
interestingly enough this then gives a different error: "String index out of range: -21" (oXygen XQuery Saxon 9.8.0.12 EE scenario).

Because I can't run such code directly with Saxon, where I get a "java.net.MalformedURLException: unknown protocol: data", I would like to know whether that error comes from Saxon or from oXygen's use of Saxon.

For completeness, the sample test2018081903.xq simply has

Code:
xquery version "3.1"; module namespace foo = "http://example.com/foo"; declare function foo:f1() as xs:string { "test" };

Re: Support for data URIs in XQuery with Saxon in oXygen

Posted: Mon Aug 20, 2018 9:25 am
by Radu
Hi Martin,

Both Oxygen and Saxon are Java applications. There are various URL protocols, such as "file", "http" or, in your case, "data". Some of them are implemented in the standard Java libraries bundled with the Java Virtual Machine, but "data" is not one of them. Oxygen's main Java library adds support for the "data" URL protocol, and Saxon also uses that support when it runs inside Oxygen. When Saxon runs from the command line, it only has the standard Java URL protocols, so it no longer supports "data".
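The behaviour described above can be reproduced outside both products. The following is only a minimal Java sketch, not Oxygen's actual handler: the class `DataUrlDemo`, its nested `DataHandler` and the two helper methods are hypothetical names invented for this illustration. It assumes a stock JDK with no additional protocol handlers registered, and it handles only percent-encoded (non-base64) UTF-8 payloads.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLDecoder;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class DataUrlDemo {

    /** Hypothetical minimal "data" handler; not the one shipped with Oxygen. */
    static final class DataHandler extends URLStreamHandler {
        @Override
        protected URLConnection openConnection(URL u) {
            return new URLConnection(u) {
                @Override
                public void connect() {
                    // Nothing to connect to: the payload is inside the URL itself.
                }

                @Override
                public InputStream getInputStream() throws IOException {
                    String spec = getURL().toExternalForm();
                    int comma = spec.indexOf(',');
                    if (comma < 0) {
                        throw new IOException("data URI without a comma: " + spec);
                    }
                    // URLDecoder also turns '+' into a space; encode-for-uri output
                    // escapes '+' as %2B, so that does not matter for this sketch.
                    String payload = URLDecoder.decode(spec.substring(comma + 1), "UTF-8");
                    return new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8));
                }
            };
        }
    }

    /** True if the JDK's built-in protocol lookup accepts the given URL. */
    static boolean jdkSupports(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false; // e.g. "unknown protocol: data"
        }
    }

    /** Reads a data URI through the hypothetical handler above. */
    static String readWithHandler(String spec) {
        try {
            // Passing a handler explicitly bypasses the JDK's protocol lookup.
            URL url = new URL(null, spec, new DataHandler());
            try (InputStream in = url.openStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String spec = "data:application/xml,%3Croot%3E%3Cfoo%3Ebar%3C%2Ffoo%3E%3C%2Froot%3E";
        System.out.println("JDK accepts data URLs: " + jdkSupports(spec));
        System.out.println("Via custom handler:    " + readWithHandler(spec));
    }
}
```

On a stock JDK the first check should report that "data" is rejected, which corresponds to the "unknown protocol: data" message seen on the command line, while the second call goes through the explicitly supplied handler. An application that registers such a handler for the whole JVM makes the protocol available to every library running inside it, which is presumably why Saxon picks it up inside Oxygen.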
About your other problem:
but when I execute that with an XQuery Saxon 9.8 EE transformation scenario I get an error "XQuery module validation/execution is not supported. Please validate/execute the main XQuery file."
I looked at the sample XQuery file you pasted and, honestly, I'm not sure what it is. You seem to be using the "module namespace" keywords in it. Is it an XQuery module? But if it's a module, why does it have a return statement? Did you actually want to use the "declare namespace" keywords instead?
Because you use the "module namespace" keywords in the XQuery, Oxygen suspects it's an XQuery module and because it does not have support to validate XQuery modules it issues that error message that you posted. Usually if you want to validate XQuery modules in Oxygen you can try to create a custom validation scenario and validate the main XQuery file instead.

Regards,
Radu

Re: Support for data URIs in XQuery with Saxon in oXygen

Posted: Mon Aug 20, 2018 11:24 am
by Martin Honnen
Thanks for your answer and the explanations so far.

The XQuery code

Code:
let $query as xs:string := 'xquery version "3.1"; module namespace foo = "http://example.com/foo"; declare function foo:f1() as xs:string { "test" };',
$module as map(*) := load-xquery-module('http://example.com/foo', map { 'location-hints' : 'data:application/xquery,' || encode-for-uri($query) })
return $module?functions(QName('http://example.com/foo', 'f1'))(0)()
is a https://www.w3.org/TR/xquery-31/#dt-main-module that happens to contain the code for an XQuery library module as a string. It tries to feed that string, using a URI with the "data" protocol, to the "load-xquery-module" function and then it tries to call a function from that loaded XQuery module.

So when oXygen gives the "XQuery module validation/execution is not supported. Please validate/execute the main XQuery file" error, does it actually try to run that code, with Saxon failing to run it? Or is the error caused by some non-XQuery evaluation (code sniffing or a parsing attempt) that in this case decides over-eagerly that it has found a module, even though those keywords are inside a string literal and not part of the code?

I hope you can also look into the error caused by

Code:
let $query as xs:string := unparsed-text('test2018081903.xq'),
$module as map(*) := load-xquery-module('http://example.com/foo', map { 'location-hints' : 'data:text/plain,' || encode-for-uri($query) })
return $module?functions(QName('http://example.com/foo', 'f1'))(0)()
is that Saxon code or oXygen code that gives the "String index out of range: -21"?

The source code of the used test2018081903.xq is at the end of the original question, so I hope I don't have to repeat it.

Re: Support for data URIs in XQuery with Saxon in oXygen

Posted: Mon Aug 20, 2018 12:20 pm
by Radu
Hi Martin,

So:
So when oXygen gives the "XQuery module validation/execution is not supported. Please validate/execute the main XQuery file" error, does it actually try to run that code, with Saxon failing to run it? Or is the error caused by some non-XQuery evaluation (code sniffing or a parsing attempt) that in this case decides over-eagerly that it has found a module, even though those keywords are inside a string literal and not part of the code?
As you say, Oxygen sniffs the XQuery content to see whether it is a module. It finds the keywords "module namespace", decides that the file is a module, and no longer validates it. We'll work to improve our module detection; for now, if you add an extra space before "module" and "namespace", you should trick our current primitive module detection.
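The false positive described above is easy to illustrate. Oxygen's real detection code is not shown in this thread, so the Java sketch below is purely hypothetical: `ModuleSniffDemo` and both methods are invented names, and the "naive" variant simply assumes a plain substring search, which is consistent with the extra-space workaround but remains a guess. The stricter variant blanks out XQuery string literals before searching; it ignores comments and direct XML constructors, so it is not a complete solution either.

```java
import java.util.regex.Pattern;

public class ModuleSniffDemo {

    /** Hypothetical naive check: plain substring search over the raw text. */
    static boolean naiveLooksLikeModule(String xquery) {
        return xquery.contains("module namespace");
    }

    // XQuery string literals use "..." or '...' and escape the delimiter by doubling it.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:[^\"]|\"\")*\"|'(?:[^']|'')*'");
    private static final Pattern MODULE_DECL =
            Pattern.compile("\\bmodule\\s+namespace\\b");

    /** Hypothetical stricter check: blank out string literals before searching. */
    static boolean looksLikeModule(String xquery) {
        String withoutLiterals = STRING_LITERAL.matcher(xquery).replaceAll("''");
        return MODULE_DECL.matcher(withoutLiterals).find();
    }

    public static void main(String[] args) {
        String library = "xquery version \"3.1\"; module namespace foo = \"http://example.com/foo\"; "
                + "declare function foo:f1() as xs:string { \"test\" };";
        // A shortened main module that merely carries the library text in a string literal.
        String mainModule = "let $query as xs:string := '" + library + "' return $query";
        // The same main module with an extra space between the two keywords.
        String spaced = mainModule.replace("module namespace", "module  namespace");

        System.out.println("naive(main)     = " + naiveLooksLikeModule(mainModule));
        System.out.println("naive(spaced)   = " + naiveLooksLikeModule(spaced));
        System.out.println("strict(main)    = " + looksLikeModule(mainModule));
        System.out.println("strict(library) = " + looksLikeModule(library));
    }
}
```

Under these assumptions the naive check flags the main module because of the text inside the string literal and stops doing so once the spacing changes, whereas the stricter check flags only the real library module.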
I hope you can also look into the error caused by
............
is that Saxon code or oXygen code that gives the "String index out of range: -21"?
As a general rule, whenever a transformation leaves errors in the results view, you can right-click them, choose "Show Message", and then see whether the dialog that pops up has a "More details" link. If it does, that link gives additional information about the error stack trace.
In this particular case, our "data" URL handler breaks when retrieving content from your "data" protocol. And it's a bug, we'll fix it on our side.
In the meantime you can do something like:

Code:
'location-hints' : 'data:text/plain;charset=utf-8,'
adding that extra charset there should bypass this problem in our code.

Regards,
Radu

Re: Support for data URIs in XQuery with Saxon in oXygen

Posted: Fri Feb 22, 2019 4:39 pm
by Radu
Hi,

Just to update this thread: we released Oxygen 21, which can validate XQuery modules, and we also fixed that unhandled error with the data URIs.

Regards,
Radu