Selecting elements from multiple levels of a hierarchy

Posted: Fri Jul 06, 2012 10:40 pm
by Tracy J.
Hello all,

I am very new to XSLT and am trying to learn it from a book while playing with oXygen. I have a number of XML files that are generated by a software program called the Archivists' Toolkit. From one of these files, I am trying to pull out three particular types of elements, which appear at multiple levels of the hierarchy. I only want them when they appear in certain places, but I can't seem to get only the elements I want returned, and I cannot for the life of me figure out what I'm doing wrong.

This is a sample of the XML:

Code:

<ead>
  <archdesc>
    <dsc>
      <c01>
        <did>
          <unittitle>Something</unittitle>
          <unitdate normal="1940/1998" type="inclusive">1940-1998</unitdate>
        </did>
        <c02>
          <did>
            <unittitle>A picture</unittitle>
            <physdesc>
              <extent>1.0 black-and-white negatives</extent>
              <extent>00185993</extent>
            </physdesc>
          </did>
        </c02>
        <c02>
          <did>
            <unittitle>Another picture</unittitle>
            <unitid>00182531</unitid>
            <physdesc>
              <extent>1.0 black-and-white negatives</extent>
            </physdesc>
          </did>
        </c02>
        <c02>
          <did>
            <unittitle>A group of pictures of something</unittitle>
            <container type="Box">2</container>
            <physdesc>
              <extent>6.0 color negatives</extent>
              <extent>00181291-00181296</extent>
            </physdesc>
          </did>
          <c03>
            <did>
              <unittitle>A picture of that thing</unittitle>
              <unitid>00181293</unitid>
              <container type="Box">3</container>
              <physdesc>
                <extent>1.0 black-and-white negatives</extent>
              </physdesc>
              <unitdate>Undated </unitdate>
            </did>
          </c03>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>

I want to pull out only the <unittitle>, <unitid>, and <extent> elements for any <c02> and <c03> (they also appear in other places, such as at the <c01> and <archdesc> levels), and I really want to have them separated by tabs so that I can copy this data into a spreadsheet and manipulate it there. The ID number is entered as the <unitid> for some items and as an <extent> for many others, and the file is too huge for me to fix all that at the source at this stage. So I want the output to be something like:

Code:

A picture	1.0 black-and-white negatives	00185993
Another picture	00182531	1.0 black-and-white negatives
A group of pictures of something	6.0 color negatives	00181291-00181296
A picture of that thing	00181293	1.0 black-and-white negatives

Obviously, when you get right down to it, I'd rather have the like types of data all lined up and pretty, but I can't even seem to pull out only the elements I want. It seems like it should be as simple as

Code:

<xsl:value-of select="dsc/c01/c02/did/unittitle" />&#x9;
<xsl:value-of select="dsc/c01/c02/did/unitid" />&#x9;
<xsl:value-of select="dsc/c01/c02/did/physdesc/extent" />

but that hasn't worked. I've tried variations with for-each and copy, and I just feel like an idiot for not being able to figure out something that seems so simple.

When I do the transformation I get either nothing at all or I get the text value of every single element in the entire XML file with tons of whitespace in between. Once I managed to get all the unittitles in one line with no space in between them, but when I tried to fix that everything got messed up and now I don't even remember how I did that.
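
For reference, here is a minimal sketch of a complete stylesheet that would produce rows like the ones listed above from the sample XML. It rests on a few assumptions that are not confirmed anywhere in this post: that the elements are in no namespace (as in the sample; if the real files declare a default namespace on <ead>, the unprefixed paths would match nothing and would need a prefix bound to that namespace), that XSLT 1.0 is in use, and that one row per <c02> or <c03> is wanted. It addresses the two symptoms described: XSLT's built-in template rules copy out the text of every element unless a template takes over at the root, and a bare &#x9; sitting between instructions is whitespace-only text, which is stripped from the stylesheet unless wrapped in xsl:text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output, so tabs and newlines are written literally. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Take over at the document root and visit only c02 and c03 components.
       Without this, the built-in rules would emit all text in the file. -->
  <xsl:template match="/">
    <xsl:apply-templates select="//c02 | //c03"/>
  </xsl:template>

  <!-- One row per component, reading only that component's own <did>.
       Assumption: elements are in no namespace, as in the sample. -->
  <xsl:template match="c02 | c03">
    <xsl:value-of select="normalize-space(did/unittitle)"/>
    <!-- unitid and extent values follow in document order, each preceded by a tab. -->
    <xsl:for-each select="did/unitid | did/physdesc/extent">
      <xsl:text>&#x9;</xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The paths inside the second template are relative to the matched <c02> or <c03>, so a nested <c03> gets its own row and its values do not leak into the row of its parent <c02>. Lining up like types of data in fixed columns would need extra logic to tell an ID-style <extent> from a descriptive one, which this sketch does not attempt.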

I only recently got oXygen, and while I used it at my former job, I'm not entirely sure everything is set up correctly.

Please help!

Thanks in advance,
Tracy J.