Merge Multiple XML Files into One XML File

ry_fisher
Posts: 1
Joined: Tue Mar 24, 2015 11:32 pm

Merge Multiple XML Files into One XML File

Post by ry_fisher »

Hello, I have a zip folder that has multiple different XML files that I want to merge into one XML file. All files are formatted the same. Is there an easy way to merge the directory into one file?
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,

There is, with the help of XSLT.
So, since you mentioned all files are formatted the same I will assume they all have the same root element name and namespace.
This stylesheet (based on Oxygen/samples/xhtml/copy.xsl) copies the root element of the file you apply the transformation on and the children from all the XML files located in the same folder and deeper.

Code: Select all

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
    <xsl:output method="xml"/>
    <xsl:template match="/">
        <xsl:copy>
            <xsl:apply-templates mode="rootcopy"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="node()" mode="rootcopy">
        <xsl:copy>
            <xsl:variable name="folderURI" select="resolve-uri('.',base-uri())"/>
            <xsl:for-each select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))/*/node()">
                <xsl:apply-templates mode="copy" select="."/>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>

    <!-- Deep copy template -->
    <xsl:template match="node()|@*" mode="copy">
        <xsl:copy>
            <xsl:apply-templates mode="copy" select="@*"/>
            <xsl:apply-templates mode="copy"/>
        </xsl:copy>
    </xsl:template>

    <!-- Handle default matching -->
    <xsl:template match="*"/>
</xsl:stylesheet>
To use this, create a new XSLT file (File > New > XSLT Stylesheet) and paste the stylesheet above into it. Save the file as "merge.xsl".

You should also add the files (or folder) to an Oxygen project (Project view) and create a scenario of the "XML transformation with XSLT" type for one XML file.

To set up the scenario you can either:
- open one XML file and look in the "Transformation Scenarios" view (Window > Show View > Transformation Scenarios)
- open one XML file and from the main menu invoke Document -> Transformation -> "Configure Transformation Scenario(s)" (there's a corresponding button in the toolbar)
- right click on one of the XML files from the Project view and from the contextual menu choose Transform -> "Configure Transformation Scenario(s)".

In the "Configure Transformation Scenario(s)" dialog press "New" and select "XML transformation with XSLT" to create a new scenario:
1. Give it an appropriate name (e.g. "merge all files from folder").
2. Leave the "XML URL" field at its default (${currentFileURL}).
3. In the "XSL URL" field browse for your stylesheet, "merge.xsl".
4. This is an XSLT 2.0 stylesheet so you have to choose from the Transformer combo Saxon-PE or Saxon-EE.
5. You can further tune the "Output". Please note that the "Save as" field must refer to a single file, NOT an output directory. Use the editor variables to compose a generic name instead of a fixed one.
e.g. in the "Save as" field you can specify: ${cfd}/${cfn}-out.xml which translates into <current-file-directory>/<current-filename>-out.xml
6. Press OK in the editing dialog and "Save and close".

To apply this scenario, in the Project view right click on one of the XML files from the folder you want merged and from the contextual menu pick Transform -> "Transform with...", then select your scenario from the list and press "Apply selected scenarios".
Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Taniamt
Posts: 1
Joined: Thu Nov 26, 2015 8:45 am

Re: Merge Multiple XML Files into One XML File

Post by Taniamt »

Hi, I am new to Oxygen and am finding it excellent. I used these instructions to merge 3 XML files into one very successfully. However, I want to be able to re-run this merge with new data sets, but every time I run it, it adds to the last merge. How do I make it forget the last merge and just merge the 3 files? Thanks!
Radu
Posts: 9018
Joined: Fri Jul 09, 2004 5:18 pm

Re: Merge Multiple XML Files into One XML File

Post by Radu »

Hi,

The XSLT stylesheet Adrian gave as an example uses the XSLT 2.0 collection() function to gather all *.xml resources from the current folder (and its subfolders), so a previous merge result saved in that folder is picked up as well. You can either change the XSLT to parse a fixed set of XML files or make sure the current folder contains only the XML files that need to be merged.
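
Both options can be sketched in XSLT. The variant below is only a sketch under a few assumptions: earlier results were saved with the "-out.xml" suffix from the "Save as" example, the processor exposes a document URI for each member of the collection, and the parameter name outputSuffix and the file names in the comment are made up for illustration. It also uses xsl:copy-of in place of the "copy" mode templates, which gives the same plain deep copy.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>

    <!-- Assumption: earlier merge results end in "-out.xml". -->
    <xsl:param name="outputSuffix" select="'-out.xml'"/>

    <xsl:template match="/">
        <xsl:apply-templates select="*" mode="rootcopy"/>
    </xsl:template>

    <xsl:template match="*" mode="rootcopy">
        <xsl:copy>
            <xsl:variable name="folderURI" select="resolve-uri('.', base-uri())"/>
            <!-- Keep only documents whose URI does not end in the output suffix. -->
            <xsl:variable name="docs"
                select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))
                        [not(ends-with(string(document-uri(.)), $outputSuffix))]"/>
            <!-- Alternative with a fixed set of files (hypothetical names):
                 select="for $name in ('a.xml', 'b.xml', 'c.xml')
                         return doc(resolve-uri($name, $folderURI))" -->
            <xsl:copy-of select="$docs/*/node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Saving the merged file outside the source folder (or with an extension other than .xml) avoids the problem without touching the stylesheet.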

Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
ckarpinski
Posts: 1
Joined: Thu Jan 21, 2016 2:32 am

Re: Merge Multiple XML Files into One XML File

Post by ckarpinski »

I used this style sheet on a folder of xml files that are all formatted the same, but it did not include the root element for each record. The original xml files have <?xml version="1.0" encoding="UTF-8"?> then the root element is <metadata> with a series of elements inside. This merged all the files into one but does not include the <metadata> element which I need.

How do I get this to copy also?

Thank you!
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,
For what you want (merge all XMLs, each with its own root) there's a slight variation: you need to start with a common root (there must be a single root element) and place everything within it. It's actually a simpler stylesheet.
Replace the first two templates (match="/" and match="node()" mode="rootcopy") with this one:

Code: Select all

    <xsl:template match="/">
        <root>
            <xsl:variable name="folderURI" select="resolve-uri('.',base-uri())"/>
            <xsl:for-each select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))">
                <xsl:apply-templates mode="copy" select="."/>
            </xsl:for-each>
        </root>
    </xsl:template>
The common root where all files are merged is named "root"; you can change it as you need.
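
Put together, the stylesheet after this replacement would look roughly as follows. This is just the two pieces from the thread assembled into one file, with the unused namespace declarations and the default-matching template left out; treat it as a sketch, not as a tested deliverable.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>

    <!-- Wrap every document found in the folder in one common root element. -->
    <xsl:template match="/">
        <root>
            <xsl:variable name="folderURI" select="resolve-uri('.', base-uri())"/>
            <xsl:for-each select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))">
                <!-- The context item here is a document node. No "copy" template
                     matches it, so the built-in rule hands its children
                     (including the file's root element) to the deep copy template. -->
                <xsl:apply-templates mode="copy" select="."/>
            </xsl:for-each>
        </root>
    </xsl:template>

    <!-- Deep copy template -->
    <xsl:template match="node()|@*" mode="copy">
        <xsl:copy>
            <xsl:apply-templates mode="copy" select="@*"/>
            <xsl:apply-templates mode="copy"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```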

Regards,
Adrian
murilozilli
Posts: 1
Joined: Sat Aug 20, 2016 1:33 am

Re: Merge Multiple XML Files into One XML File

Post by murilozilli »

The application exceeded the available memory: 1333MB. I get this error when trying to merge my files. Can you give me a hint on how to increase the memory available to the Java process that oXygen uses?
Radu
Posts: 9018
Joined: Fri Jul 09, 2004 5:18 pm

Re: Merge Multiple XML Files into One XML File

Post by Radu »

Hi,

You can modify some settings in the Oxygen installation folder in order to increase the memory allocated to the Oxygen executable:

https://www.oxygenxml.com/doc/versions/ ... -launchers

Regards,
Radu
mhGLEIF
Posts: 43
Joined: Tue Jul 26, 2016 6:31 pm

Re: Merge Multiple XML Files into One XML File

Post by mhGLEIF »

Hello Adrian,

firstly many thanks for this demonstration. It definitely works!

I wonder if you can suggest some simple modifications to make it work for this similar use case?

I've got a directory of input XML files just as described in the initial post, and I want to merge the content.

However, additionally the input files have the following structure and I'd like the output file to be merged in conformance with that, i.e.:

INPUT (illustrative):

Code: Select all

<Root>

<Header>
<!-- ...file info... -->
<RecordCount>n</RecordCount>
<!-- ...file info... -->
</Header>

<Records>
<Record></Record>
...etc..
</Records>

</Root>
OUTPUT (even more illustrative):

Code: Select all

<Root>

<Header>
<!-- ...more file info... -->
<RecordCount>N</RecordCount>
<!-- ...more file info... -->
</Header>

<Records>
<Record>
<!-- Copied input record node -->
</Record>

<!-- more concatenated input record nodes -->
</Records>

</Root>
Where N is the sum of n over all input files.

Would I be right in thinking there is a nice way to do this using 3 templates, something like:

Code: Select all


<xsl:template name="main"/>
<xsl:template name="header"/>
<xsl:template name="record"/>
...so that then, it would be possible to add further transformation detail for each of the file, header and record levels?
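
This question is not answered directly later in the thread. One possible sketch, built on the same collection() approach, uses one template per level (file, header, record) with modes instead of named templates. It assumes the element names from the illustrative input above, no namespace on those elements, and a numeric RecordCount in every file; the header of the file the transformation is applied on serves as the model for the merged header.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- All XML files in the folder of the file being transformed. -->
    <xsl:variable name="docs"
        select="collection(concat(resolve-uri('.', base-uri(/)), '?select=*.xml;recurse=yes'))"/>

    <!-- File level: build a single Root. -->
    <xsl:template match="/">
        <Root>
            <xsl:apply-templates select="Root/Header" mode="header"/>
            <Records>
                <xsl:apply-templates select="$docs/Root/Records/Record" mode="record"/>
            </Records>
        </Root>
    </xsl:template>

    <!-- Header level: copy the header, processing its children one by one. -->
    <xsl:template match="Header" mode="header">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:apply-templates mode="header"/>
        </xsl:copy>
    </xsl:template>

    <!-- Replace the per-file count with the sum over all input files. -->
    <xsl:template match="RecordCount" mode="header">
        <RecordCount>
            <xsl:value-of select="sum($docs/Root/Header/RecordCount)"/>
        </RecordCount>
    </xsl:template>

    <!-- Anything else inside the header is copied unchanged. -->
    <xsl:template match="node()" mode="header" priority="-1">
        <xsl:copy-of select="."/>
    </xsl:template>

    <!-- Record level: a plain copy, to be refined as needed. -->
    <xsl:template match="Record" mode="record">
        <xsl:copy-of select="."/>
    </xsl:template>
</xsl:stylesheet>
```

Further transformation detail can then go into the matching mode. As with the original stylesheet, the merged output should be saved outside the input folder, otherwise a later run would count and copy its records again.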
lithicas
Posts: 2
Joined: Wed Oct 17, 2018 2:54 pm

Re: Merge Multiple XML Files into One XML File

Post by lithicas »

adrian wrote: For what you want (merge all XMLs, each with its own root) [...] you need to start with a common root (there must be a single root element) and place everything within that. [...]
The common root where all files are merged is named "root", you can change it as you need.
Is there any way of doing this without including the XML files' roots, and instead adding a common root for the whole merged file?

Say all XML files have the root <OriginalRoot>, if you have 5 files, <OriginalRoot> will be included 5 times in the merged file. How would I make it so that all the content in the merged file is under the root <Root>?
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Try:

Code: Select all

    <xsl:template match="/">
        <Root>
            <xsl:variable name="folderURI" select="resolve-uri('.',base-uri())"/>
            <xsl:for-each select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))/*/node()">
                <xsl:apply-templates mode="copy" select="."/>
            </xsl:for-each>
        </Root>
    </xsl:template>
Regards,
Adrian
lithicas
Posts: 2
Joined: Wed Oct 17, 2018 2:54 pm

Re: Merge Multiple XML Files into One XML File

Post by lithicas »

adrian wrote: This stylesheet (based on Oxygen/samples/xhtml/copy.xsl) copies the root element of the file you apply the transformation on and the children from all the XML files located in the same folder and deeper. [...]
Using this approach, is there any way to add an XSD schema to the root tag?
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Using this approach, is there any way to add an XSD schema to the root tag?
I'm not sure whether this question refers to my last answer (with the "Root" element) or to the quoted answer (root copied from file). For the copied root you need one more XSL template that handles the root element.

Code: Select all

<xsl:template match="myRoot" mode="rootcopy">
For the "Root" element you can simply add the declaration within this element in the stylesheet.
e.g.

Code: Select all

<Root xmlns="http://myns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://myns schemaURI.xsd">
or, if it has no namespace:

Code: Select all

<Root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="schemaURI.xsd">
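
For the copied-root case, a fuller sketch of that extra template might look like the following. "myRoot" and "schemaURI.xsd" are the placeholder names used above, the root is assumed to be in no namespace, and xsl:copy-of stands in for the "copy" mode templates. The xsi prefix has to be bound to the XML Schema instance namespace, which is done here on the xsl:stylesheet element.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <xsl:output method="xml"/>

    <xsl:template match="/">
        <xsl:apply-templates mode="rootcopy"/>
    </xsl:template>

    <!-- "myRoot" is a placeholder for the real root element name. -->
    <xsl:template match="myRoot" mode="rootcopy">
        <xsl:variable name="folderURI" select="resolve-uri('.', base-uri())"/>
        <xsl:copy>
            <!-- Attributes must be written before any child content. -->
            <xsl:attribute name="xsi:noNamespaceSchemaLocation">schemaURI.xsd</xsl:attribute>
            <xsl:copy-of select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))/*/node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

For a root element in a namespace, the attribute would be xsi:schemaLocation with the namespace URI and the schema location as its value, and the match pattern would need the corresponding prefix.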
Regards,
Adrian
pponos
Posts: 2
Joined: Wed Dec 12, 2018 3:44 pm

Re: Merge Multiple XML Files into One XML File

Post by pponos »

Hello all,

and thank you very much Adrian for the quick guide.

I've tried to merge multiple .dita files but I received the following error message:
Engine name: Saxon6.5.5
Severity: error
Description: Error in expression resolve-uri('.',base-uri()): Unknown system function: resolve-uri
Start location: 8:0
Could you please let me know what I should do?

Regards
Pavlos
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,

You seem to have missed a step in the scenario configuration:
4. This is an XSLT 2.0 stylesheet so you have to choose from the Transformer combo Saxon-PE or Saxon-EE.
Regards,
Adrian
pponos
Posts: 2
Joined: Wed Dec 12, 2018 3:44 pm

Re: Merge Multiple XML Files into One XML File

Post by pponos »

Hello Adrian,

thank you very much for the reply, indeed it works w/o issues!

Quick (and maybe stupid) question: I've merged 5 .dita files w/o errors. Could you please let me know where the merged file is stored, because I cannot find it?

Regards
Pavlos
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,

Assuming you have followed the step-by-step guide, the output file location is set at step 5:
5. You can further tune the "Output". Please note that the "Save as" field must refer to a single file, NOT an output directory. Use the editor variables to compose a generic name instead of a fixed one.
e.g. in the "Save as" field you can specify: ${cfd}/${cfn}-out.xml which translates into <current-file-directory>/<current-filename>-out.xml
If you left the "Save as" field empty, then there is no file being saved. If you just want a fixed file, use the Browse for local file button on the right side of the field and pick a location and a file name.

Regards,
Adrian
jackwilliams4837
Posts: 1
Joined: Thu Jun 27, 2019 10:36 am

Re: Merge Multiple XML Files into One XML File

Post by jackwilliams4837 »

Hi. I got the following error while merging XML files. I don't know anything about coding. I just followed the steps and got the error. Please explain the cause of the problem and the solution.

I/O error reported by XML parser processing file:/C:/Users/2019CBHC/Desktop/xml%20files/PMC6536601.xml: C:\Users\2019CBHC\Desktop\xml files\JATS-archivearticle1.dtd (The system cannot find the file specified)
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,

I should mention that the merging described here is mostly for XMLs holding data records. It is not really suited for XMLs with complex structures, unless you're willing to manually trim the merged output to make it valid.

Your XML refers (via DOCTYPE) to a DTD that you don't seem to have. If you just want to merge the XML files and don't care about the DTD, I would advise creating a dummy (empty) DTD of the same name (JATS-archivearticle1.dtd) next to your files and then doing the merge. If the merged XML should have the same type as the source documents, you might want to also copy the DOCTYPE to the merged file.
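
XSLT does not copy a DOCTYPE from the input, but the serializer can be asked to write one through the doctype-system (and, if needed, doctype-public) attribute of xsl:output. The sketch below assumes the system identifier from the error message and leaves out any public identifier, since the thread does not show the one used by the source files. It uses xsl:copy-of instead of the "copy" mode templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- doctype-system asks the serializer to write a DOCTYPE declaration
         that points at this system identifier. -->
    <xsl:output method="xml" doctype-system="JATS-archivearticle1.dtd"/>

    <xsl:template match="/">
        <xsl:apply-templates select="*" mode="rootcopy"/>
    </xsl:template>

    <!-- Copy the root of the current file and fill it from every file in the folder. -->
    <xsl:template match="*" mode="rootcopy">
        <xsl:copy>
            <xsl:variable name="folderURI" select="resolve-uri('.', base-uri())"/>
            <xsl:copy-of select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))/*/node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

The declaration only names the DTD; whether the merged document is actually valid against it depends on the structure, as noted above.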

Regards,
Adrian
xylo223
Posts: 1
Joined: Fri Jul 19, 2019 3:18 pm

Re: Merge Multiple XML Files into One XML File

Post by xylo223 »

adrian wrote: Thu Mar 26, 2015 7:48 pm
This stylesheet (based on Oxygen/samples/xhtml/copy.xsl) copies the root element of the file you apply the transformation on and the children from all the XML files located in the same folder and deeper. [...]
Thank you Adrian, you saved me time. I followed these steps and created a dummy DTD for the merge, just as you said, and did not get any errors. All my XML files merged smoothly into one.
dbokser
Posts: 1
Joined: Fri Dec 27, 2019 5:47 pm

Re: Merge Multiple XML Files into One XML File

Post by dbokser »

The "Merge" XSLT is great, but instead of getting ALL records, I need to select for only those where, say, the publisherName field has "Acme Publishing Company". Seems so simple but how can I do it? I still want to output all fields as original XML.
adrian
Posts: 2850
Joined: Tue May 17, 2005 4:01 pm

Re: Merge Multiple XML Files into One XML File

Post by adrian »

Hi,

You need to add another template rule that matches your record element name and copies the contents only if your condition is satisfied.

Code: Select all

  <xsl:template match="*:recordElementName" mode="copy">
    <xsl:if test="publisherName='Acme Publishing Company'">
      <xsl:copy>
        <xsl:apply-templates mode="copy" select="@*"/>
        <xsl:apply-templates mode="copy"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>
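
The same filter can also be written the other way round, as an empty template that drops every record that does not match. The complete sketch below makes a few assumptions: "recordElementName" is still a placeholder, publisherName is a direct child of the record, and the parameter name publisher is made up for illustration. It relies on XSLT 2.0 allowing global parameters in match patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml"/>

    <!-- Value to keep; can be overridden as a transformation parameter. -->
    <xsl:param name="publisher" select="'Acme Publishing Company'"/>

    <xsl:template match="/">
        <xsl:apply-templates select="*" mode="rootcopy"/>
    </xsl:template>

    <xsl:template match="*" mode="rootcopy">
        <xsl:copy>
            <xsl:variable name="folderURI" select="resolve-uri('.', base-uri())"/>
            <xsl:apply-templates mode="copy"
                select="collection(concat($folderURI, '?select=*.xml;recurse=yes'))/*/node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Deep copy template -->
    <xsl:template match="node()|@*" mode="copy">
        <xsl:copy>
            <xsl:apply-templates mode="copy" select="@*"/>
            <xsl:apply-templates mode="copy"/>
        </xsl:copy>
    </xsl:template>

    <!-- Records without a matching publisherName produce no output.
         The predicate gives this rule a higher default priority
         than the deep copy template. -->
    <xsl:template match="*:recordElementName[not(*:publisherName = $publisher)]" mode="copy"/>
</xsl:stylesheet>
```

The *: wildcard on publisherName keeps the test working when the records are in a namespace; with no namespace involved it behaves like the plain name.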
Regards,
Adrian