[xsl] how to efficiently extract unique list of URI's for writing files

From: Robby Pelssers <Robby.Pelssers@xxxxxxx>
Date: Mon, 16 Jul 2012 12:55:37 +0200

Hi all,

The data is again completely made up, but it makes it easy to explain what I'm
trying to accomplish.  In the input below I have a sequence of objects. Each
needs to be written to some URI.


The problem is that the input may contain duplicate entries (though not
deep-equal ones); for a user, the <id> is the unique identifier.  If I use
simple pattern matching, I run into an error, because I can't write to the
same URI twice.

I came across this post explaining how to solve this issue:
http://www.stylusstudio.com/xsllist/200705/post10050.html
but it is still not clear to me what the best approach would be.

Should I, for example, write functions for every possible type to extract the
distinct URIs, and then do a second pass that removes each URI from the
sequence once it has been written, so that any object whose URI is no longer
present is skipped instead of being processed twice?

<xsl:function name="pelssers:getURI">
  <xsl:param name="crontask" as="element(crontask)"/>
  ...
</xsl:function>

<xsl:function name="pelssers:getURI">
  <xsl:param name="user" as="element(user)"/>
  ...
</xsl:function>

"Here's a solution that I normally use. Take all URIs that you want to write
to, pack them in a sequence and deduplicate them (use the function
distinct-values or similar) and go from there (if possible) or, if you can't,
you can use a micro-pipeline. I.e., the first transforms and changes the input
and adds _1, _2 etc to the names, to ensure uniqueness, the second is the
transformation where you create the actual result document."
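
A minimal sketch of the "deduplicate first" idea quoted above, under several assumptions. It uses xsl:for-each-group keyed on the target URI rather than distinct-values, so each URI is visited once and the first object for that URI is still at hand. The namespace URI bound to the pelssers prefix, the URI layout (users/{id}.xml, crontasks/{name}.xml), and the assumption that the objects are direct children of the document element are all made up for illustration. Note that XSLT 2.0 does not overload functions on parameter type: two functions with the same name and arity at the same import precedence are a static error (XTSE0770), so the per-type lookups are folded into a single function here.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:pelssers="urn:example:pelssers"
    exclude-result-prefixes="xs pelssers">

  <!-- Single URI function for every object type.
       The namespace URI and the URI layout below are hypothetical. -->
  <xsl:function name="pelssers:getURI" as="xs:string">
    <xsl:param name="object" as="element()"/>
    <xsl:choose>
      <!-- crontask: <name> is the identifier -->
      <xsl:when test="$object/self::crontask">
        <xsl:sequence
            select="concat('crontasks/', encode-for-uri($object/name), '.xml')"/>
      </xsl:when>
      <!-- user: <id> is the identifier -->
      <xsl:when test="$object/self::user">
        <xsl:sequence
            select="concat('users/', encode-for-uri($object/id), '.xml')"/>
      </xsl:when>
      <!-- fail loudly on a type that has no URI rule -->
      <xsl:otherwise>
        <xsl:sequence
            select="error((), concat('No URI rule for ', name($object)))"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Assumes the objects are direct children of the document element. -->
  <xsl:template match="/*">
    <!-- Group by target URI: the body runs once per distinct URI,
         so no URI is written twice. -->
    <xsl:for-each-group select="crontask | user"
                        group-by="pelssers:getURI(.)">
      <xsl:result-document href="{current-grouping-key()}">
        <!-- The context item is the first object of the group in
             document order; later duplicates are ignored. -->
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

If the duplicates need to be merged or reported rather than dropped, current-group() inside the loop returns every object that maps to the same URI.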

<?xml version="1.0" encoding="UTF-8"?>
    <name>SyncDB</name> <!-- name is identifier -->
    <definition>Syncs filesystem with database</definition>
    <id>12345</id>  <!-- id is identifier -->
    <name>Robby Pelssers</name>
    <!-- duplicate entry for same user needs to be skipped although name is different -->
    <name>Robby PelssersXX</name>
