[XSL-LIST Mailing List Archive Home] [By Thread] [By Date]

Re: mapping (Was: Re: [xsl] Re: . in for)


Subject: Re: mapping (Was: Re: [xsl] Re: . in for)
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Fri, 11 Jan 2002 14:53:54 +0100

Jeni Tennison <jeni@xxxxxxxxxxxxxxxx> wrote:
> Yes. If I understand lambda expressions correctly, there are even some
> in XPath 1.0 (i.e. predicates, steps in path expressions).

Indeed you are right, the expressions within predicates are
lambda expressions (not sure about steps, i'm too lazy to
look up the definition). They are, however, somewhat special
as they can have only one parameter, the ".".
With the "for" operator and its companions, lambda expressions
with more than one parameter are introduced:
  for $i in $seq1, $j in $seq2  return ($i, $j)
You probably got the right gut feeling when you first proposed
your map operator, reusing the existing predicate-form of
lambda expressions
  $seq -> (. + 50)

I want to note at this point that it may seem to be a neat idea
to avoid explicit names for the lambda parameters by using sort
of positional parameters
  for $seq1, $seq2 return ($1, $2)
Unfortunately, this wont work for nested "for" operators because you
can't access the parameter iterating over the sequence of the outer "for"
  for $seq1 return for $seq2 return ($???, $1)
Actually, is
  for $i in $seq1 return for $j in $seq2 return ($i, $j)
legal and does it work as it ought to be?

When discussing operators which take multiple sequences as operands,
one has to decide whether the lambda iterates over the sequences
in parallel (as the various map* functions in LISP do) or whether
it iterates over the cartesian product (as DB table joins to).
Having similar syntax for both can be confusing.
BTW expressing a join in LISP with established functionality
would probably look like
  (defun map-join (expr seq1 seq2)
    (mapcar '(lambda (arg1)
               (mapcar '(lambda (arg2) (funcall expr arg1 arg2))
                  seq2)
       seq1))
Can one of the experts confirm this? It's been a while since i
was active...

> No, but I think it is helpful for commonly used full data types to
> have a consistent lexical representation.

That's correct. But we don't need to have lambda expressions as
XPath data types as long as we don't want to pass them around
in variables and parameters. Note that it's already to late to
have a consistent lexical representation for lambda expressions
unless you want to restrict them to have a "." parameter only, as
this would influence expressions in predicates.

> In other words, we should be careful, since lambda expressions are
> being introduced to XPath now, to consider the fact that they might
> become full data types in the future. We should take care that what we
> do now does not make their adoption as a full data type in the future
> impossible/difficult/messy.

Good point. Who will take responsibility to hammer this home in
the  WG? :-)

> > I mean, the "for" operator is cool but it doesn't help
> > us in situations which could be easily solved with lamda
> > expressions, i.e. defining the equivalent of mapconcat and such.
> Perhaps you could expand on this a bit. It would be a good use case
> for the note Dimitre's hopefully going to put together.

Suppose we have XPath as it is in the 2.0 WD with "for", "some"
and similar operators (sorry, i did'n bother to read it in depth,
so i may have gotten sometihing wrong). Suppose furhter the XSLT
community demands a functionality which applies an expression
to the items in a sequence and concatenates the results as strings
with a delimiter between them (probably what
 <xsl:value-of select="for $i in $stuff return floor($i)" delimiter=":"/>
does)
Obviously it would be nice to have a similar syntax as "for", but
we have to stuff the delimiter somewhere. Perhaps it ought to be
  concat-for $i in $stuff delimit-by ':' return floor($i)
I suppose that the delimiter operand is not yet catered for in
one of the grammar production rules, which means the grammar has
to be extended. This in turn is usually a reason to have this
operator in the language core, whether XQuery likes it or not
(modularization of grammars is not easy).

With a functional approach and lambda expressions, even if they
have to be literals (no full data types), XSLT could easily
define a XSLT specific function
  concat-for(expression,delimiter,sequence) => string
to be used like
  concat-for(expression(floor($x),x),':',$stuff)
There would be no need to change the grammar and thereby forcing
this functionality into the language core. Furthermore this avoids
the exponential growing expense for checking and avoiding ambiguities
in the grammar (that's the main reason i don't like operators,
especially operators which are words like "for" and "return").
There is only one extension of the grammar needed to get lambda
expressions, everything else is covered by the function invocation
rule.

The lambda expression syntax might be somewhat tricky to get
right, "expression(expr,arg,...)" may be too much like a function
invocation. Perhaps #(arg1,arg2,...,expr) ? Example:
  concat-for(#(x,floor(x)),':',$stuff)
Note: It's a convention for functions to have optional parameters
at the end of the parameter list, therefore expression() has
the parameter names after the expression, while #() is different.

It may also be an idea to allow the predicate expression special
form elsewhere in case the lambda has only one argument which
isn't refered to by embedded lambdas.
  concat-for(floor(.),':',$stuff)
Can one of the implementors comment on potential parsing problems
and perhaps write up some grammar rules?

Next point: having lambda expressions even as data types is not the
same as having higher order functions. There is a hierarchy of
complexity:
1. Literal lambda expressions do not add complexity to the current
 draft, as we can claim for all practical purposes that we have them
 already. Especially they do not threaten existing optimizations or
 the possibility to compile XSLT programs into other programming
 languages.
2. Lambda expressions as data types. I don't think this is a big step
 from level 1 as implementors are likely to invent a framework for
 implementing the various "for" operators which involves a sort of
 function objects. If we have them we can as well pass them around
 elsewhere. However, it will inhibit writing compilers from XSLT to
 languages which have only statically bound functions like FORTRAN77,
 you'd need function pointers or OO-objects with virtual methods.
3. Higher order functions. I'm walking on thin ice here but i
 believe this requires the possibility to compose lambda expressions
 from run-time supplied data. Toghether with a funcall() function, this
 should be equivalent to the evaluate() function proposed elsewhere.
 (funcall(expr,arg1,arg2,...) evaluates the lambda expression with the
  expression's arguments bound to arg1, arg2 etc.)
 That's a step not taken lightly, as history has shown this introduces
 performance and security problems more often than not. Especially
 this will bother people who want to write compilers which produce
 executables with a small footprint. (yes, there are LISP compilers
 which produce performant and compact code, but last time i checked
 they all barfed on encountering eval() and such).

I'm not sure whether "higher order functions" also requires binding
lambdas to names dynamically or whether this is already covered
by the possibility of assigning lambdas to variables.
 (In LISP you can write
   (setf 'my-functionname '(lambda (x) (+ x 1))
   (my-functionname 1)
     evaluates to 2 here
   (setf 'my-functionname '(lambda (x) (- x 1))
   (my-functionname 1)
     evaluates to 0 here)
I'm not convinced that such a feature is needed or even desirable.

Next: use cases
As for lambda expressions as literals, there is no need to present
any. What's we should do is convince the WG that it is better for
the modularity and extensibility of XPath and related standards
to have lambda expression literals and functions which can take
them as arguments rather than to invent new "for"-like operators
every time someone feels the need to do something which involves
iterating a sequence. The problem here is that lambda expressions
can only be used for XPath functions and extensions defined by
standards like XSLT but they are not available for user defined
functions. This means, if there is an urgent need to have the
concat-for() functionality above, we'll have to wait for the next
standard revision and can't plug this hole by developing an user
defined function.
For lambda expressions as data types, well, the most obvious use
case is presented just above. It would also allow Dimitre to
reformulate most of his "generic template" stuff in a more compact
and usable form.
As for true higher order functions, i still can't come up with
convincing use cases apart from evaluate(). Until now, only
Dimitre expresses some urgent need, and even he hasn't yet presented
a convincing use case. After all, we don't want to implement expert
systems and programs for researching machine lerning in XSLT, do we?
Note that C/C++ or VB are widely used but don't have higher order
functions, even though C++ has some sort of lambda expressions
aquired with the STL (and even there i suspect that "we'll prove
the naysayers wrong" was one of the major driving forces).

> > - Hiding lambda-like semantics in other definitions is not what i
> >   would like.
> What constitutes 'hiding'? Are you referring to my suggestion of
> 'pretending' that we're not introducing lambda expressions when we are
> (in order to keep the XQuery people quiet) or the fact that various
> expressions in the XPath 2.0 WD introduce lambda expressions without
> being explicit about it?
Well, both, but especially the latter for reasons stated above.

Regards
J.Pietschmann

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



Current Thread
Keywords