Easiest way to extract same variables from many XML files?
Questions about XML that are not covered by the other forums should go here.
-
- Posts: 2
- Joined: Wed May 11, 2022 10:41 am
Easiest way to extract same variables from many XML files?
Hi everyone,
I have over 30,000 XML files that are in the same format in a local folder with different names. I want to obtain the same 2 variables from each one and place into an excel spreadsheet.
I feel like this shouldn't be a hard task but I'm struggling to figure out how to do it. I haven't worked with XML or databases/queries/sql before but know a bit of programming.
I tried to import the XML files into excel using developer tools but it gave me a "switch from current encoding to specified encoding not supported".
As far as I can understand, this is the encoding:
<?xml version="1.0" encoding="utf-16"?>
The first variable I want to collect is email:
The second variable is test score from score class = '5'.
Is this something that can be done relatively easily and if so, any advice on how to do it for a beginner?
Thank you in advance
I have over 30,000 XML files that are in the same format in a local folder with different names. I want to obtain the same 2 variables from each one and place into an excel spreadsheet.
I feel like this shouldn't be a hard task but I'm struggling to figure out how to do it. I haven't worked with XML or databases/queries/sql before but know a bit of programming.
I tried to import the XML files into excel using developer tools but it gave me a "switch from current encoding to specified encoding not supported".
As far as I can understand, this is the encoding:
<?xml version="1.0" encoding="utf-16"?>
The first variable I want to collect is email:
Code: Select all
<applicant>
<person/>
<contact>
<email>
<type>1</type>
<address>Example@Email.com</address>
</email>
</contact>
Code: Select all
<testscores>
<testscore>
<score class="3">70</score>
</testscore>
<testscore>
<score class="5">92</score>
</testscore>
</testscores>
Thank you in advance
-
- Posts: 9431
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Easiest way to extract same variables from many XML files?
Hi,
About this error:
{quote}switch from current encoding to specified encoding not supported{quote}
I do not know anything about how the Excel import works, if you want to change the encoding for all files from UTF-16 to UTF-8 for example, you can add in the Oxygen Project view a reference to the folder containing the files, then right click the folder "Find/Replace in Files" and replace "<?xml version="1.0" encoding="utf-16"?>" with "<?xml version="1.0" encoding="utf-8"?>". I do not know if the import will work on UTF-8 files but it could be tested..
If you want some XML specific technology to solver this, Oxygen has the possibility to create an XSLT stylesheet and apply it over a sequence of documents in a folder. But you would need to learn XSLT: https://blog.oxygenxml.com/xslt_training.html
The XSLT could produce an HTML document and if you open the HTML document in a web browser, you could copy its contents and paste them in Excel to populate the table.
Regards,
Radu
About this error:
{quote}switch from current encoding to specified encoding not supported{quote}
I do not know anything about how the Excel import works, if you want to change the encoding for all files from UTF-16 to UTF-8 for example, you can add in the Oxygen Project view a reference to the folder containing the files, then right click the folder "Find/Replace in Files" and replace "<?xml version="1.0" encoding="utf-16"?>" with "<?xml version="1.0" encoding="utf-8"?>". I do not know if the import will work on UTF-8 files but it could be tested..
If you want some XML specific technology to solver this, Oxygen has the possibility to create an XSLT stylesheet and apply it over a sequence of documents in a folder. But you would need to learn XSLT: https://blog.oxygenxml.com/xslt_training.html
The XSLT could produce an HTML document and if you open the HTML document in a web browser, you could copy its contents and paste them in Excel to populate the table.
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
<oXygen/> XML Editor
http://www.oxygenxml.com
Return to “General XML Questions”
Jump to
- Oxygen XML Editor/Author/Developer
- ↳ Feature Request
- ↳ Common Problems
- ↳ DITA (Editing and Publishing DITA Content)
- ↳ SDK-API, Frameworks - Document Types
- ↳ DocBook
- ↳ TEI
- ↳ XHTML
- ↳ Other Issues
- Oxygen XML Web Author
- ↳ Feature Request
- ↳ Common Problems
- Oxygen Content Fusion
- ↳ Feature Request
- ↳ Common Problems
- Oxygen JSON Editor
- ↳ Feature Request
- ↳ Common Problems
- Oxygen PDF Chemistry
- ↳ Feature Request
- ↳ Common Problems
- Oxygen Feedback
- ↳ Feature Request
- ↳ Common Problems
- Oxygen XML WebHelp
- ↳ Feature Request
- ↳ Common Problems
- XML
- ↳ General XML Questions
- ↳ XSLT and FOP
- ↳ XML Schemas
- ↳ XQuery
- NVDL
- ↳ General NVDL Issues
- ↳ oNVDL Related Issues
- XML Services Market
- ↳ Offer a Service