
[xsl] word list and count from the text in an xml document


Subject: [xsl] word list and count from the text in an xml document
From: "Mark" <mark@xxxxxxxxxxxx>
Date: Sat, 12 Jun 2010 13:51:10 -0700

Hi,
I have been floundering around in the xsl-list archives for a while looking for a way to get a listing and count of all the words in every instance of a specific element. Thousands of hits, but I think I am not using the correct search terms. I know what I want must be in the archive, but I just can't seem to narrow my search enough to find it.


Given a fragment like the following (concentrating for the moment on the lang="en" elements):
<Description>
<Data lang="cz">bílá skvrnka na spodní části písmene L ve SLOV</Data>
<Data lang="en">white splotch on the lower bar on the L in SLOV</Data>
</Description>
<Description>
<Data lang="cz">barevný bod pod dolním rámem vlevo od VHB</Data>
<Data lang="en">dot on the lower frame to the left of VHB</Data>
</Description>


I would like to create a list like the one below (it would also be nice to be able to use a "stop word" list so as not to count words like "on", "the", etc.):
bar 1
dot 1
frame 1
in 1
L 1
left 1
lower 2
of 1
on 3
SLOV 1
splotch 1
the 4
to 1
VHB 1
white 1
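
To pin down the intended result, here is a rough sketch of the counting in Python rather than XSLT. It only illustrates the desired behaviour and is not the XSLT solution being asked for. The wrapper element Descriptions, the function name count_words, and the stop_words parameter are assumptions made for the illustration; words are split on whitespace, counted case-sensitively, and sorted without regard to case.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The fragment from the post, wrapped in a hypothetical root element
# (Descriptions) so that it parses as a well-formed document.
SAMPLE = """<Descriptions>
<Description>
<Data lang="cz">bílá skvrnka na spodní části písmene L ve SLOV</Data>
<Data lang="en">white splotch on the lower bar on the L in SLOV</Data>
</Description>
<Description>
<Data lang="cz">barevný bod pod dolním rámem vlevo od VHB</Data>
<Data lang="en">dot on the lower frame to the left of VHB</Data>
</Description>
</Descriptions>"""


def count_words(xml_text, lang="en", stop_words=()):
    """Return (word, count) pairs for every Data element in the given language."""
    root = ET.fromstring(xml_text)
    stop = {w.lower() for w in stop_words}  # stop words match case-insensitively
    counts = Counter()
    for data in root.iter("Data"):
        if data.get("lang") != lang:
            continue
        # Split on runs of whitespace; punctuation handling is left out.
        for word in (data.text or "").split():
            if word.lower() not in stop:
                counts[word] += 1
    # Sort alphabetically, ignoring case, as in the list above.
    return sorted(counts.items(), key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for word, n in count_words(SAMPLE, stop_words=("on", "the")):
        print(word, n)
```

Called without stop words, the sketch counts every word in the lang="en" elements; passing stop_words=("on", "the") drops those two entries from the list.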


Mark

