Search and copy XML files into a new directory

Posted: Thu Jan 21, 2010 10:33 pm
by crult
Hello,

I'm using version 10.3 of Oxygen on Windows XP. I have the following problem: I have some folders with many XML files inside, and I want to find only the files that contain a specific annotation (for example <Secteur>SCI</Secteur>). The files are newspaper articles in XML format. The content SCI indicates a science article; other articles have, for example, <Secteur>SPO</Secteur> for sports. I want to find only the science articles and copy them into a new directory automatically. Is there any solution? I used the Find option and got some results, but I can't copy each file manually (there are 600 results). Thanks for your response.

Re: Search and copy XML files into a new directory

Posted: Fri Jan 22, 2010 12:09 pm
by adrian
Hi,

The 'Find/Replace in Files' action in Oxygen, as the name implies, only does find and replace. The only copying it does is to back up the old files (with a custom file extension) during a replace, but I don't think that would be very useful.
So the answer is no, there isn't a way to do this automatically.

Regards,
Adrian

Re: Search and copy XML files into a new directory

Posted: Fri Jan 22, 2010 1:57 pm
by crult
What if I make a minor change to the target XML files, have them backed up with an .xml extension, and then find the backups? What is the backup directory? If that isn't a solution, is this possible with other software or another tool (or XSLT)?

Thank you very much.

Re: Search and copy XML files into a new directory

Posted: Fri Jan 22, 2010 3:57 pm
by adrian
They are not backed up to a different directory; they are placed in the same directory as the modified file, with the custom extension appended (the default is bak).

You could probably do this in Ant, but you would have to do a little research.
Here's a starting point:
http://mail-archives.apache.org/mod_mbo ... hoo.com%3E
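
As a rough illustration of the Ant idea, here is a minimal sketch of a build file. It assumes Ant is installed separately, and the folder names "articles" and "science" are hypothetical placeholders, not names from this thread. It relies on Ant's contains selector, which does a plain, case-sensitive text match, so an annotation written with extra whitespace would not be found.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: save as build.xml in the directory that holds the source folder -->
<project name="copy-sci-articles" default="copy-sci" basedir=".">

  <!-- Hypothetical folder names; change them to match your own layout -->
  <property name="src.dir" value="articles"/>
  <property name="dest.dir" value="science"/>

  <target name="copy-sci">
    <!-- Create the destination folder if it does not exist yet -->
    <mkdir dir="${dest.dir}"/>
    <!-- Copy every XML file whose text contains the annotation -->
    <copy todir="${dest.dir}">
      <fileset dir="${src.dir}" includes="**/*.xml">
        <contains text="&lt;Secteur&gt;SCI&lt;/Secteur&gt;"/>
      </fileset>
    </copy>
  </target>

</project>
```

Running "ant" in that directory should execute the default target. Files found in subfolders keep their relative paths under the destination unless flatten="true" is added to the copy task.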

Regards,
Adrian

Re: Search and copy XML files into a new directory

Posted: Fri Jan 22, 2010 9:10 pm
by crult
Thank you very much for the instructions, I will try this!

Re: Search and copy XML files into a new directory

Posted: Thu Jan 28, 2010 9:28 pm
by crult
Hi,

Someone suggested something like this to me:

"If you have XSLT 2.0, you can store the locations of all files in an extra file, then for each location in that file, check this category, and then produce a new document that is the copy-of the document. AFAIK, Oxygen's XSLT processor is XSLT 1.0. Does the editor have SAXON? SAXON is an XSLT 2.0 processor, so it could do the job (assuming that using the editor every time you want to trigger SAXON is OK for you)."

Can you give me some information about this? Thank you very much!

Re: Search and copy XML files into a new directory

Posted: Fri Jan 29, 2010 12:07 pm
by adrian
Oxygen does use Saxon 6.5.5 for XSLT 1.0 and Saxon 9.2 (in Oxygen 11.1) for XSLT 2.0. However, you do need a bit of XSLT knowledge to write a stylesheet that does this.
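
To give an idea of what the suggested "extra file" approach could look like, here is a minimal XSLT 2.0 sketch. The list format (a files element with file children), the example paths, and the output folder "science" are all hypothetical. The stylesheet is applied to the list file, and the relative output paths resolve against wherever the transformation's main output is set to go.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumed input: a hand-made list such as
     <files><file>articles/a1.xml</file><file>articles/a2.xml</file></files>
     where each path is relative to the location of the list file. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:template match="/files">
    <xsl:for-each select="file">
      <!-- Load the listed article, resolving its path against the list file -->
      <xsl:variable name="article"
                    select="doc(resolve-uri(normalize-space(.), base-uri(.)))"/>
      <!-- Keep only articles that have a Secteur element with the value SCI -->
      <xsl:if test="$article//Secteur = 'SCI'">
        <!-- Write the copy under the hypothetical folder "science", same file name -->
        <xsl:result-document href="science/{tokenize(normalize-space(.), '/')[last()]}">
          <xsl:copy-of select="$article"/>
        </xsl:result-document>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that the copies are re-serialized by the XSLT processor, so they contain the same XML content but are not guaranteed to be byte-for-byte identical to the originals.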

Regards,
Adrian

Re: Search and copy XML files into a new directory

Posted: Sun Jan 31, 2010 6:31 pm
by crult
I think that's what I need! Thank you very much for the support. If I have any questions, I'll post them. Thanks!

Re: Search and copy XML files into a new directory

Posted: Sun Jan 31, 2010 7:39 pm
by crult
Let me describe the situation exactly.

All of these folders are in the same directory (the folder "XSL"):

1) The folder containing the XML files is named "01".
2) The folder into which I want to copy only the files containing <Secteur>SCI</Secteur> is named "SCI".
The XPath is /Document/Article[1]/Secteur[1]. The XML documents look like this:

<?xml version='1.0' encoding='ISO-8859-15'?>
<Document xyurl='xyl://20040101N0001.xml'>
<DocId>20040101N0001</DocId>
<Article>
<Page Lien='repository/2004/01/01/pages/04010120.pdf'>20</Page>
<Date Annee='2004' Mois='01' Jour='01'/>
<Publication>LeMonde</Publication>
<Secteur>SCI</Secteur> <!-- Here is the description of the category -->
<Taille>34</Taille>
<Corps>
<Titraille>
<Tetiere>AUJOURD&apos;HUI VOYAGES</Tetiere>
<Titres>
<Surtitre>« Hermione », la frégate de Rochefort</Surtitre>
<Titre>
<P>A bord, la vie était rude</P>
</Titre>
<SousTitre/>
</Titres>
</Titraille>
<Chapo/>
<Origine/>
<Texte>
<P>Sur l&apos; Hermione, les affûts de canon étaient peints en rouge pour faciliter le nettoyage du sang des hommes après la bataille. La « frégate de douze » était armée de 26 canons de douze (les boulets pèsent 6 kg) et 6 canons de six (boulets de 3 kg). Elle était beaucoup plus légère, rapide et maniable qu&apos;un vaisseau taillé pour le combat avec 118 canons. A bord, l&apos;eau est rationnée à trois pintes par homme et par jour. Les vers et les charançons infestent les biscuits de mer. L&apos;absence de fruits et légumes frais rend le scorbut ravageur. La fièvre typhoïde, la petite vérole et la gangrène sont des maladies fréquentes. L&apos;hygiène est absente, le sommeil mauvais. Deux matelots alternent dans un hamac, souvent trempé, à l&apos;entrepont, espace confiné où vivent aussi les moutons embarqués vivants. Le capitaine prend soin de sa chair à canon comme d&apos;un cheptel : il lui faut assez d&apos;hommes vivants pour livrer combat. A cette époque, le service dans la marine est obligatoire - un an sur trois - dans les provinces maritimes du royaume. </P>
<P/>
</Texte>
<SignaturePubliee/>
<Note/>
<Images/>
</Corps>
</Article>
<Indexation>
<TagAdmin1/>
<TagAdmin2/>
<TagAdmin3/>
<TitreComplementaire>2 articles - description de la vie des matelots à bord de l&apos;"Hermione"</TitreComplementaire>
<Commentaire>Q0101/675650;</Commentaire>
<Categories>
<Categorie>DESCRIPTION</Categorie>
<Categorie>ENCADRE</Categorie>
<Categorie>ENSEMBLE</Categorie>
</Categories>
<Lien/>
<Oeuvre>
<TitresOeuvre/>
<GenresOeuvre/>
<AuteursOeuvre/>
</Oeuvre>
<SignaturesIndexees>
<SignatureIndexee/>
</SignaturesIndexees>
</Indexation>
<Etat Statut='EXPORTE'>
<Documentaliste>DAR</Documentaliste>
<MisesAJour>
31-12-2003
</MisesAJour>
</Etat>
<Historique>
<France/>
<Etranger/>
<Personnes/>
</Historique>
</Document>





3) I created a new XSLT stylesheet, something like this:


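<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

  <!-- Keep the encoding declared by the source files -->
  <xsl:output method="xml" encoding="ISO-8859-15"/>

  <!-- SCI is the text value of Secteur, not a child element, so test it in a predicate -->
  <xsl:template match="/Document[Article[1]/Secteur[1] = 'SCI']">
    <!-- Write the copy into the folder SCI (relative to the main output location),
         reusing the source file name -->
    <xsl:result-document href="SCI/{tokenize(document-uri(/), '/')[last()]}">
      <!-- The context node is already the loaded document, so copy it directly -->
      <xsl:copy-of select="."/>
    </xsl:result-document>
  </xsl:template>

  <!-- Any other document produces no output -->
  <xsl:template match="/Document" priority="-1"/>

</xsl:stylesheet>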



Now I want to tell XSLT to search the folder "01", find only the files containing <Secteur>SCI</Secteur>, and copy them (without any changes) to the folder "SCI". Can you help me with the XSLT stylesheet?
Another question: how can I apply my scenario to the whole folder "01"? I can't run it on each XML file individually.


Thank you very much!
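
One possible way to cover the whole folder in a single run is sketched below. It assumes Saxon 9.x as the XSLT 2.0 processor, because the "?select=*.xml" syntax in collection() is specific to Saxon, and it assumes the folder "XSL" sits at C:/XSL/, which is a guess to be adjusted. The stylesheet has no source document; it starts at a named template.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Assumption: URI of the directory that holds the folders 01 and SCI -->
  <xsl:param name="base" select="'file:///C:/XSL/'"/>

  <!-- Keep the encoding declared by the source files -->
  <xsl:output method="xml" encoding="ISO-8859-15"/>

  <!-- Entry point: start the processor at this named template -->
  <xsl:template name="main">
    <!-- Load every .xml file in folder 01 and keep only the science articles -->
    <xsl:for-each select="collection(concat($base, '01?select=*.xml'))
                          [Document/Article[1]/Secteur[1] = 'SCI']">
      <!-- Write each match into folder SCI under its original file name -->
      <xsl:result-document href="{$base}SCI/{tokenize(document-uri(.), '/')[last()]}">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

From the command line this would be started with something like "java -cp saxon9he.jar net.sf.saxon.Transform -it:main -xsl:copy-sci.xsl", where the jar and stylesheet file names are assumptions. The copies are re-serialized, so they hold the same XML content but may differ from the originals in details such as quoting or whitespace outside the root element.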