Easiest way to extract same variables from many XML files?
Hi everyone,
I have over 30,000 XML files in a local folder. They all share the same format but have different names. I want to extract the same two values from each file and put them into an Excel spreadsheet.
I feel like this shouldn't be a hard task, but I'm struggling to figure out how to do it. I haven't worked with XML or databases/queries/SQL before, but I know a bit of programming.
I tried to import the XML files into Excel using the Developer tools, but it gave me the error "switch from current encoding to specified encoding not supported".
As far as I can understand, this is the encoding:
<?xml version="1.0" encoding="utf-16"?>
The first variable I want to collect is email:
Code:
<applicant>
  <person/>
  <contact>
    <email>
      <type>1</type>
      <address>Example@Email.com</address>
    </email>
  </contact>
The second variable is the test score from the score element with class="5":
Code:
<testscores>
  <testscore>
    <score class="3">70</score>
  </testscore>
  <testscore>
    <score class="5">92</score>
  </testscore>
</testscores>
Is this something that can be done relatively easily and, if so, any advice on how to do it for a beginner?
Thank you in advance
Re: Easiest way to extract same variables from many XML files?
Hi,
About this error:
"switch from current encoding to specified encoding not supported"
I do not know anything about how the Excel import works. If you want to change the encoding of all the files from UTF-16 to UTF-8, for example, you can add a reference to the folder containing the files in the Oxygen Project view, then right-click the folder, choose "Find/Replace in Files", and replace "<?xml version="1.0" encoding="utf-16"?>" with "<?xml version="1.0" encoding="utf-8"?>". I do not know whether the import will work on UTF-8 files, but it could be tested.
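For someone who prefers a script to a find/replace, here is a minimal Python sketch of the same idea. The helper name `read_xml_text` is hypothetical, and the fallback to UTF-8 when no UTF-16 byte order mark is found is an assumption about the files, not something confirmed in this thread.

```python
import codecs
import re


def read_xml_text(raw: bytes) -> str:
    """Decode raw XML bytes and drop encoding="..." from the declaration."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # A UTF-16 byte order mark is present, so the bytes really are UTF-16.
        text = raw.decode("utf-16")
    else:
        # Assumption: anything else is UTF-8 (with or without its own BOM).
        text = raw.decode("utf-8-sig")
    # Remove the encoding pseudo-attribute so the declaration can no longer
    # disagree with how the text is stored or re-saved afterwards.
    return re.sub(r'^(<\?xml[^>]*?)\s+encoding=(["\']).*?\2', r"\1", text, count=1)
```

One possible cause of that message, not confirmed for these files, is that the bytes on disk are not actually UTF-16 even though the declaration says so. In that case, editing the declaration alone only helps if the new value matches the real encoding.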
If you want an XML-specific technology to solve this, Oxygen lets you create an XSLT stylesheet and apply it to a sequence of documents in a folder. But you would need to learn XSLT: https://blog.oxygenxml.com/xslt_training.html
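Whatever tool is used, the core of the task is two path expressions. As a rough illustration in Python rather than XSLT, the sketch below selects both values with the standard library's ElementTree. The `SAMPLE` document is assembled from the two fragments in the question; the closing `</applicant>` tag and the placement of `<testscores>` inside `<applicant>` are assumptions, and `extract` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed document shape: the question only shows two separate fragments.
SAMPLE = """<applicant>
  <person/>
  <contact>
    <email>
      <type>1</type>
      <address>Example@Email.com</address>
    </email>
  </contact>
  <testscores>
    <testscore><score class="3">70</score></testscore>
    <testscore><score class="5">92</score></testscore>
  </testscores>
</applicant>"""


def extract(root):
    """Return (email, class-5 score); a value is None when it is missing."""
    email = root.findtext(".//email/address")
    score = root.findtext(".//score[@class='5']")
    return email, score


print(extract(ET.fromstring(SAMPLE)))  # ('Example@Email.com', '92')
```

The same two expressions (`//email/address` and `//score[@class='5']`) would carry over to an XSLT stylesheet almost unchanged.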
The XSLT could produce an HTML document; if you open that HTML document in a web browser, you can copy its contents and paste them into Excel to populate the table.
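A different route to the same table, sketched here in Python instead of XSLT, is to write a CSV file that Excel can open directly. This is only a sketch under assumptions: the function name `xml_folder_to_csv`, the column names, and the `*.xml` file pattern are all hypothetical, and the path expressions assume the element names shown in the question.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def xml_folder_to_csv(folder, out_csv):
    """Write one row (file, email, score) per *.xml file; return the row count."""
    rows = 0
    # "utf-8-sig" adds a byte order mark, which helps Excel detect UTF-8.
    with open(out_csv, "w", newline="", encoding="utf-8-sig") as fh:
        writer = csv.writer(fh)
        writer.writerow(["file", "email", "score_class_5"])
        for path in sorted(Path(folder).glob("*.xml")):
            try:
                # Parsing from bytes lets the parser use the BOM and declaration.
                root = ET.fromstring(path.read_bytes())
            except ET.ParseError as err:
                # Keep going and record the problem file instead of stopping.
                writer.writerow([path.name, "", f"parse error: {err}"])
            else:
                writer.writerow([
                    path.name,
                    root.findtext(".//email/address", default=""),
                    root.findtext(".//score[@class='5']", default=""),
                ])
            rows += 1
    return rows
```

Calling `xml_folder_to_csv("path/to/folder", "results.csv")` with a real folder path would produce one line per file. Files that fail to parse, for example because of an encoding mismatch, show up as rows with a "parse error" note so they can be inspected separately.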
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com