RE: [xsl] Re: xsl-list Digest 11 Nov 2006 06:10:00 -0000 Issue 955


Subject: RE: [xsl] Re: xsl-list Digest 11 Nov 2006 06:10:00 -0000 Issue 955
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Sat, 11 Nov 2006 09:22:31 -0000

> Thanks for your response. I mean perfectly equivalent 
> XML-wise. 

Sorry, but that's not a good definition. Try looking at the definition of
deep-equal() in the XPath 2.0 functions specification (available in XSLT 2.0)

http://www.w3.org/TR/xpath-functions/#func-deep-equal

and set out how your definition differs from it. There's only one difference
obvious from your example, which is that you seem to be treating the order
of sibling elements as insignificant. But since you've only tried to define
"perfectly equivalent" by example, and by means of a single example to boot,
we've no way of knowing what other quirks your concept might have. For
example, you've said nothing about your handling of whitespace or comments.
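
To make the sibling-order point concrete, here is a minimal sketch of
deep-equal() applied to two fragments that differ only in the order of their
children. The fragments are cut-down, hypothetical versions of the posted
data, not the real files. Applied to any source document, the stylesheet
outputs "false", because deep-equal() compares child nodes pairwise in
document order.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Two temporary trees with the same children in a different order. -->
  <xsl:variable name="a">
    <el><attribute name="AA">1</attribute><attribute name="BB">2</attribute></el>
  </xsl:variable>
  <xsl:variable name="b">
    <el><attribute name="BB">2</attribute><attribute name="AA">1</attribute></el>
  </xsl:variable>

  <xsl:template match="/">
    <!-- Children are compared position by position, so order matters. -->
    <xsl:value-of select="deep-equal($a/el, $b/el)"/>
  </xsl:template>

</xsl:stylesheet>
```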

Generally your approach to the solution should be "normalize then group".
That is, define some transformation on the input that computes a grouping
key with the property that two elements which you consider to be "perfectly
equivalent" have the same grouping key, and then use standard grouping
techniques to eliminate duplicates.
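
A minimal XSLT 2.0 sketch of "normalize then group" might look like the
following. It rests on assumptions the original question does not confirm:
that all whitespace is insignificant (even inside values, so "- 0.78691"
matches "-0.78691"), that the order of sibling elements is insignificant,
and that comments can be ignored. The file names file1.xml and file2.xml,
the f: namespace and the function f:key are made up for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:normalize"
    exclude-result-prefixes="xs f">

  <xsl:output indent="yes"/>

  <!-- Hypothetical input files, resolved relative to the stylesheet. -->
  <xsl:variable name="docs" select="doc('file1.xml'), doc('file2.xml')"/>

  <!-- Grouping key: a canonical string for an element. Attributes and
       child keys are sorted, and all whitespace is removed from values. -->
  <xsl:function name="f:key" as="xs:string">
    <xsl:param name="e" as="element()"/>
    <xsl:variable name="atts" as="xs:string*">
      <xsl:for-each select="$e/@*">
        <xsl:sort select="name()"/>
        <xsl:sequence select="concat(name(), '=', replace(., '\s+', ''))"/>
      </xsl:for-each>
    </xsl:variable>
    <xsl:variable name="kids" as="xs:string*">
      <xsl:perform-sort select="for $c in $e/* return f:key($c)">
        <xsl:sort select="."/>
      </xsl:perform-sort>
    </xsl:variable>
    <xsl:variable name="text"
        select="replace(string-join($e/text(), ''), '\s+', '')"/>
    <xsl:sequence select="concat(name($e), '[', string-join($atts, ','), '](',
                                 $text, '|', string-join($kids, ''), ')')"/>
  </xsl:function>

  <!-- Keep the first member of each group of equivalent elements. -->
  <xsl:template match="/" name="main">
    <annotation>
      <head>
        <xsl:for-each-group select="$docs/annotation/head/*"
                            group-by="f:key(.)">
          <xsl:copy-of select="current-group()[1]"/>
        </xsl:for-each-group>
      </head>
      <body>
        <xsl:for-each-group select="$docs/annotation/body/track"
                            group-by="f:key(.)">
          <xsl:copy-of select="current-group()[1]"/>
        </xsl:for-each-group>
      </body>
    </annotation>
  </xsl:template>

</xsl:stylesheet>
```

The sketch does not carry over attributes of the root annotation element,
and the key format is only one possible choice. If your notion of
equivalence differs (say, whitespace inside values matters), the place to
change is f:key; the grouping step stays the same.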

Michael Kay
http://www.saxonica.com/





> For example, these two expressions are equivalent:
> 
> EXP1:
> <specification src="fileName">
> </specification>
> 
> EXP2:
> <specification src="fileName"/>
> 
> but are terminated differently. And these branches are also 
> equivalent:
> 
> BRANCH1:
> <track name="trackName" type="primary">
>    <el index="0" start="0.04" end="6">
>      <attribute name="AA">-0.78691</attribute>
>      <attribute name="BB">activ</attribute>
>      <attribute name="CC">0.5374</attribute>
>      <attribute name="DD">0.7794</attribute>
>      <attribute name="EE">0.08231</attribute>
>      <attribute name="FF">0.9171</attribute>
>    </el>
> </track>
> 
> BRANCH2:
> <track name="trackName" type="primary">
>    <el index="0" start="0.04" end="6">
>      <attribute name="EE">0.08231</attribute>
>      <attribute name="AA">- 0.78691</attribute>
>      <attribute name="FF">0.9171</attribute>
>      <attribute name="CC">0.5374</attribute>
>      <attribute name="DD">0.7794 </attribute>
>      <attribute name="BB">activ</attribute>
>    </el>
> </track>
> 
> But the internal elements are ordered differently.
> 
> So what I want in the merged output is just one version of 
> either, plus anything that is truly different (new). In the 
> example I provided, that would mean the following section (in 
> the body of FILE2, right before the branch quoted before):
> 
> <track name="anotherTrackName" type="primary">
>    <el index="0" start=" 7.24" end="8.52">
>      <attribute name="type">preparation</attribute>
>    </el>
> </track>
> 
> Thanks a lot for your time!
> 
> Rodrigo
> 
> 
> > Date: Fri, 10 Nov 2006 08:29:50 -0000
> > To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> > From: "Michael Kay" <mike@xxxxxxxxxxxx>
> > Subject: RE: [xsl] Fwd: Merging XML files
> > Message-ID: <005d01c704a2$63ca8be0$6401a8c0@turtle>
> >
> > > However, just looking for diffs between the two files, perfectly 
> > > equivalent nodes in both files which are spelled or ordered 
> > > differently end up being duplicated in the output.
> >
> > Looking for diffs is quite a challenging task. The first job is to 
> > specify it clearly. For example, the concept of "perfectly equivalent 
> > nodes which are spelled differently" seems an odd one. You need to 
> > define your notion of "perfect equivalence" rather precisely.
> >
> > Michael Kay
> > http://www.saxonica.com/
> >
> > > I am new into xsl programming and unfortunately, the examples I 
> > > found on the web would not work for me. How can I accomplish this 
> > > task from the command line using, say, Saxon?
> > >
> > > Thanks a lot!
> > >
> > > Rodrigo
> > >
> > > ----------------------FILE 1-----------------------
> > >
> > > <?xml version="1.0" encoding="ISO-8859-1" ?>
> > > <annotation xml_tb_version="3.1">
> > >   <head>
> > >     <specification src="fileName">
> > >     </specification>
> > >   </head>
> > >   <body>
> > >     <track name="trackName" type="primary">
> > >       <el index="0" start="0.04" end="6">
> > >         <attribute name="AA">-0.78691</attribute>
> > >         <attribute name="BB">activ</attribute>
> > >         <attribute name="CC">0.5374</attribute>
> > >         <attribute name="DD">0.7794</attribute>
> > >         <attribute name="EE">0.08231</attribute>
> > >         <attribute name="FF">0.9171</attribute>
> > >       </el>
> > >     </track>
> > >   </body>
> > > </annotation>
> > >
> > > ----------------------FILE 2-----------------------
> > >
> > > <?xml version="1.0" encoding="ISO-8859-1"?>
> > > <annotation>
> > >   <head>
> > >     <specification src=" fileName" />
> > >   </head>
> > >   <body>
> > >     <track name="anotherTrackName" type="primary">
> > >       <el index="0" start=" 7.24" end="8.52">
> > >         <attribute name="type">preparation</attribute>
> > >       </el>
> > >     </track>
> > >     <track name="trackName" type="primary">
> > >       <el index="0" start="0.04" end="6">
> > >         <attribute name="EE">0.08231</attribute>
> > >         <attribute name="AA">- 0.78691</attribute>
> > >         <attribute name="FF">0.9171</attribute>
> > >         <attribute name="CC">0.5374</attribute>
> > >         <attribute name="DD">0.7794 </attribute>
> > >         <attribute name="BB">activ</attribute>
> > >       </el>
> > >     </track>
> > >   </body>
> > > </annotation>

