
Re: [xsl] XSLT 3.0 JSON processing -- a few comments from a friend


Subject: Re: [xsl] XSLT 3.0 JSON processing -- a few comments from a friend
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 7 Jan 2015 21:56:22 -0000

Thanks for the comments.

On 7 Jan 2015, at 19:50, Dimitre Novatchev dnovatchev@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> I recently had a chat with an old friend and among other things he
> expressed some thoughts on the current capabilities of XSLT 3.0 to
> process JSON.
>
> With his kind agreement, I am publishing these thoughts intact.
>
> Would appreciate feedback from anybody:
>
> 1. Which of the points below do you agree with, and why?
>
> 2. Which of the points below do you disagree with, and why?
>
> 3. Can you propose a better solution?
>
>> In case you're interested, here's my honest opinion about XSLT 3.0's JSON
>> implementation...
>> This is from my perspective - working with systems that heavily utilise
>> JSON both for external facing APIs and internally within our systems (both
>> for storage and data interchange within a micro-services architecture).
>>
>> The only way to read JSON is to convert it to XML - using the json-to-xml()
>> function.

This will change in XPath 3.1, where arrays become available in the data
model, allowing an additional option of converting JSON to a structure using
maps and arrays.

>> There are several issues with this...
>>
>> ( a ) the JSON must be passed as a whole string to that function (no means
>> to stream it in)

I don't think it's at all difficult for an implementation to implement
xs:string as a lazy data structure, or to optimize the functional composition
of json-to-xml() with unparsed-text().
>>
>> ( b ) the XML produced is rather ugly (see
>> http://www.w3.org/TR/xslt-30/#json-to-xml-mapping) - if anyone working in
>> the XML domain saw that representation of data they'd probably laugh...
>> where the type of data becomes the element name and the name of the data
>> becomes an attribute value... yack!

The keys in a JSON map/object are arbitrary strings. Not every string is a
valid NCName. Therefore JSON keys cannot be mapped to XML names; they have to
be mapped to attribute values.

I've come across mappings where the JSON keys are used as element names where
possible, and are converted to attribute values otherwise. That seems much
harder for programmers to deal with in the general case, even though it might
produce prettier XML for a subset of JSON documents.
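
To make the shape of the mapping concrete, here is a rough Python sketch of
the idea. It is an illustration only, not how any XSLT processor implements
json-to-xml(). The element name carries the JSON type and the key travels in
a "key" attribute. The function names and the ASCII-only NCName test are
assumptions made for the sketch, and the namespace used by the real output
is omitted for brevity.

```python
import json
import re
import xml.etree.ElementTree as ET

# Rough, ASCII-only approximation of an NCName (assumption for this sketch;
# the real XML rule also admits many non-ASCII characters).
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*\Z")


def to_element(value, key=None):
    """Map a parsed JSON value to an element named after its JSON type."""
    if isinstance(value, dict):
        elem = ET.Element("map")
        for k, v in value.items():
            elem.append(to_element(v, key=k))
    elif isinstance(value, list):
        elem = ET.Element("array")
        for item in value:
            elem.append(to_element(item))
    elif isinstance(value, bool):
        # Checked before numbers because bool is a subclass of int in Python.
        elem = ET.Element("boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem = ET.Element("null")
    elif isinstance(value, (int, float)):
        elem = ET.Element("number")
        elem.text = json.dumps(value)
    else:
        elem = ET.Element("string")
        elem.text = value
    if key is not None:
        # Any string is acceptable as an attribute value, unlike a name.
        elem.set("key", key)
    return elem


def json_to_xml_sketch(text):
    """Parse JSON text and serialize the type-named element tree."""
    return ET.tostring(to_element(json.loads(text)), encoding="unicode")
```

For the input {"first name": "Ann"} this yields a map element containing a
string element whose key attribute is "first name", a key that could never
have been used as an element name.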
>>
>> ( c ) for some odd reason the design of this decided to use the word 'map'
>> to describe 'object'.  Whilst they are the same thing, the vast majority
>> of people using JSON would refer to it as an 'object' (see
>> http://json.org/)

The problem here is that JavaScript/JSON use the term "object" in a way that
is quite out of line with the rest of the industry. There are many names in
use for the concept in different programming languages: map, associative
array, and dictionary are some of the most common. Only JS calls it an object,
and it has none of the qualities that the OO community associates with
objects, such as encapsulation.
>>
>> ( d ) JSON can use any Unicode character in strings, e.g. U+0000 and U+000C
>> (form-feed) are legitimate characters in a string.  I don't see how XSLT
>> can possibly accommodate this as it follows the allowable characters in
>> XML - which exclude such codepoints.

Well, we've done our best, and that's part of the ugliness. Essentially we
provide you the option to retain the character in escaped form, as "\u0001".
It's not nice, but it means that every legal JSON text is accepted and
produces XML that is distinct from that produced by every other JSON text,
which meets the requirement for no loss of information.
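
As a rough illustration of why the escaped form loses no information, here is
a small Python sketch. It is not the json-to-xml() algorithm itself; the
function names are hypothetical, and doubling the backslash is simply the
choice this sketch makes so that a literal backslash cannot be confused with
the start of an escape. Characters outside the XML 1.0 Char production are
written as \uXXXX and everything else is passed through.

```python
import json


def is_xml10_char(cp):
    """True if the code point is allowed by the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def escape_for_xml(s):
    """Return s with XML-illegal characters kept in \\uXXXX escaped form."""
    out = []
    for ch in s:
        cp = ord(ch)
        if ch == "\\":
            # Double the backslash so literal text stays distinct from escapes.
            out.append("\\\\")
        elif is_xml10_char(cp):
            out.append(ch)
        else:
            # Every XML-illegal code point is below U+10000, so four hex
            # digits are always enough.
            out.append(f"\\u{cp:04X}")
    return "".join(out)


# Two different JSON texts: the control character U+0001 versus the six
# literal characters backslash, u, 0, 0, 0, 1.
control = escape_for_xml(json.loads('"\\u0001"'))
literal = escape_for_xml(json.loads('"\\\\u0001"'))
```

Here control and literal come out as different strings, and both consist only
of characters that XML allows, which is the property that matters.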
>>
>> The other issue that I have with the incorporation of JSON into XSLT 3.0
>> is that I think it has been done without consideration to the ecosystems
>> in which JSON tends to live.

I think support for HTTP is quite orthogonal to support for JSON. Yes, people
often need both, but HTTP is needed for XML access just as much as for JSON.
In practice people use a variety of solutions: the document() function is
often enough; if not, it can usually be customized using a user-written
URIResolver or similar; or there are third-party libraries that add HTTP
support to XSLT.

Yes, it would have been nice to do HTTP as well, but resources are finite.

Michael Kay
Saxonica

