Can Oxygen Convert Tab Delimited To XML?

Postby Steve Wilkison » Sun May 08, 2005 1:26 pm

I'm relatively new to XML and Oxygen. I've looked through the documentation and can't find anything on this, so I thought I'd ask here. Can Oxygen take a tab delimited file and convert it into basic XML? If not, is there a simple program that can do this (for the Mac)? Thanks for any help, insight or pointers.
Steve Wilkison
 
Posts: 6
Joined: Wed May 19, 2004 6:29 pm

Postby george » Sun May 08, 2005 9:11 pm

Hi Steve,

Oxygen 5.1 does not have this out of the box but you can get that using the XSLT 2.0 support. For instance, the following XSLT 2.0 stylesheet (you need to set Saxon8 as the transformer):

Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- The tab delimited document, relative to the stylesheet location or an absolute location -->
  <xsl:param name="doc" select="'sample.txt'"/>
  <!-- The encoding for the tab delimited document -->
  <xsl:param name="enc" select="'UTF-8'"/>
  <!-- The result XML root element name -->
  <xsl:param name="root" select="'file'"/>
  <!-- The result XML element name that will mark the values from a line -->
  <xsl:param name="line" select="'line'"/>
  <!-- The result XML element name that will mark each value from the input document -->
  <xsl:param name="entry" select="'entry'"/>
 
  <!--
    main template
  -->
  <xsl:template match="/">
    <xsl:element name="{$root}">
      <xsl:call-template name="tLines">
        <xsl:with-param name="value" select="unparsed-text($doc, $enc)"/>
      </xsl:call-template>
    </xsl:element>
  </xsl:template>
  <!--
    tokenize lines
  -->
  <xsl:template name="tLines">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex="\n|\r">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:element name="{$line}">
          <xsl:call-template name="tValues">
            <xsl:with-param name="value" select="."/>
          </xsl:call-template>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
  <!--
    tokenize values
  -->
  <xsl:template name="tValues">
    <xsl:param name="value" select="''"/>
    <xsl:analyze-string select="$value" regex="\t">
      <xsl:matching-substring/>
      <xsl:non-matching-substring>
        <xsl:element name="{$entry}">
          <xsl:value-of select="."/>
        </xsl:element>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>


applied to a dummy input XML document, with the tab-delimited file saved as sample.txt in the same folder as the stylesheet:
Code:
a1   a2   a3   a4
v1   v2   v3   V4
X1   X2   X3   X4

will give as output:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<file>
   <line>
      <entry>a1</entry>
      <entry>a2</entry>
      <entry>a3</entry>
      <entry>a4</entry>
   </line>
   <line>
      <entry>v1</entry>
      <entry>v2</entry>
      <entry>v3</entry>
      <entry>V4</entry>
   </line>
   <line>
      <entry>X1</entry>
      <entry>X2</entry>
      <entry>X3</entry>
      <entry>X4</entry>
   </line>
</file>


The tag names, the name of the input file and its encoding can be specified as parameters. Also, if you change the regex in
<xsl:analyze-string select="$value" regex="\t">
then you can handle different delimiters such as comma, semicolon, etc.
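
As a minimal sketch of that idea (not part of the stylesheet above), the delimiter can itself be made a parameter, because the regex attribute of xsl:analyze-string is an attribute value template. The parameter name delim is an assumption introduced here for illustration, and the result element names are hard-coded to keep the sketch short. The value is a regular expression, so characters that are special in regular expressions, such as the pipe, have to be escaped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- Same parameters as in the stylesheet above -->
  <xsl:param name="doc" select="'sample.txt'"/>
  <xsl:param name="enc" select="'UTF-8'"/>
  <!-- Hypothetical extra parameter: the delimiter, written as a regular expression.
       Examples: '\t' for tab, ',' for comma, ';' for semicolon, '\|' for pipe -->
  <xsl:param name="delim" select="'\t'"/>

  <xsl:template match="/">
    <file>
      <!-- Split the text into lines; empty lines produce no output -->
      <xsl:analyze-string select="unparsed-text($doc, $enc)" regex="\n|\r">
        <xsl:non-matching-substring>
          <line>
            <!-- regex is an attribute value template, so {$delim} is evaluated at run time -->
            <xsl:analyze-string select="." regex="{$delim}">
              <xsl:non-matching-substring>
                <entry><xsl:value-of select="."/></entry>
              </xsl:non-matching-substring>
            </xsl:analyze-string>
          </line>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </file>
  </xsl:template>
</xsl:stylesheet>
```

With this variant, switching from tab to comma only means passing delim=',' to the transformation instead of editing the stylesheet.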

Direct support in oXygen for this type of conversion will be available in the next release.

Best Regards,
George
george
Site Admin
 
Posts: 2076
Joined: Thu Jan 09, 2003 2:58 pm

Postby stefan » Wed May 18, 2005 2:47 pm

Conversion from text files (CSV, tab-delimited) to XML is now available:
http://www.oxygenxml.com/database_import.html
stefan
 
Posts: 40
Joined: Wed Jan 08, 2003 9:43 am

Thank you

Postby Steve Wilkison » Wed May 18, 2005 3:31 pm

Thank you, George, for your initial reply; it was very helpful. Thank you, Stefan, for the info on the new version.
Steve Wilkison
 
Posts: 6
Joined: Wed May 19, 2004 6:29 pm

Heh..almost a year later...

Postby diblassio4 » Wed Mar 29, 2006 10:21 pm

Hi George,

I was trying to use the nice example you posted, but using <xsl:analyze-string select="$value" regex="\|"> for pipe-delimited data instead.
The problem comes when there are empty fields, like so:
field|secondfield|||fifthField (fields 3 and 4 are empty)
field||thirdfield||fifthField (fields 2 and 4 are empty)

Then the field positions in the results get out of sync...

Any ideas?

Thanks in advance!
diblassio4
 
Posts: 2
Joined: Wed Mar 29, 2006 10:11 pm
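
A possible workaround for the empty-field problem described above, as a sketch only: xsl:analyze-string reports only the non-empty substrings between matches, so two adjacent delimiters produce no entry element at all. The XPath 2.0 tokenize() function instead returns a zero-length string between adjacent separators, so iterating over its result keeps the field positions stable. The file name sample.txt and the element names are carried over from George's example, and the pipe delimiter is the one from the question. Everything else is an assumption made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:param name="doc" select="'sample.txt'"/>
  <xsl:param name="enc" select="'UTF-8'"/>

  <xsl:template match="/">
    <file>
      <!-- Split into lines on CR LF, LF or CR; the predicate drops empty lines,
           for example the one after a trailing newline -->
      <xsl:for-each select="tokenize(unparsed-text($doc, $enc), '\r?\n|\r')[. ne '']">
        <line>
          <!-- tokenize() returns a zero-length string for each empty field,
               so a|||b yields four entry elements, two of them empty -->
          <xsl:for-each select="tokenize(., '\|')">
            <entry><xsl:value-of select="."/></entry>
          </xsl:for-each>
        </line>
      </xsl:for-each>
    </file>
  </xsl:template>
</xsl:stylesheet>
```

Like the original stylesheet, this needs an XSLT 2.0 transformer and is applied to a dummy input XML document.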

Postby Radu » Mon Apr 03, 2006 8:49 am

Radu
 
Posts: 2841
Joined: Fri Jul 09, 2004 5:18 pm

Postby Eiríkr » Mon Nov 19, 2007 7:03 pm

george wrote:Hi Steve,

Oxygen 5.1 does not have this out of the box but you can get that using the XSLT 2.0 support. For instance, the following XSLT 2.0 stylesheet (you need to set Saxon8 as the transformer): <snip>


The tag names, the name of the input file and its encoding can be specified as parameters. Also, if you change the regex in
<xsl:analyze-string select="$value" regex="\t">
then you can handle different delimiters such as comma, semicolon, etc.

Direct support in oXygen for this type of conversion will be available in the next release.

Best Regards,
George


Hello George --

I've now got oXygen 9, and I'm trying this solution out on a .csv file of my own. It seems to work fine in the debugger when the .csv file is explicitly named in the .xsl, and when the .csv file is not the file chosen as the source. But what if I need a more general solution? Even if I leave the filename out by omitting the following:
Code:
<xsl:param name="doc" select="'sample.txt'"/>

... so long as I specify the .csv file as the source in the dropdowns at the top of the window, I cannot use the basic transformation capabilities of oXygen's debug view, as I get the following error:
Code:
Content is not allowed in prolog.

Clicking the error takes me to a webpage showing a list of Saxon error codes, suggesting that the error was generated by the Saxon parser. Is there any way I'm missing of telling Saxon not to parse the source file as XML, but rather as straight-up text, but *without* having to specify the source file in a parameter?

Cheers,

Eiríkr
Eiríkr
 
Posts: 7
Joined: Mon Nov 19, 2007 6:53 pm
Location: Puget Sound

Postby sorin » Tue Nov 20, 2007 4:08 pm

Hello,

Eiríkr wrote:
george wrote:Oxygen 5.1 does not have this out of the box but you can get that using the XSLT 2.0 support.
...
Direct support in oXygen for this type of conversion will be available in the next release.


Hello George --

I've now got oXygen 9, and I'm trying this solution out on a .csv file of my own.


The support for conversion from CSV files to XML that George mentioned is the import feature available from File -> Import -> Text file (with the comma delimiter), as you can see above in Stefan's post. You can use this feature to convert your CSV files to XML. If you need a different XML format, apply an XSLT stylesheet to the result of the import operation to convert it to your format.

Eiríkr wrote:Is there any way I'm missing of telling Saxon not to parse the source file as XML, but rather as straight-up text, but *without* having to specify the source file in a parameter?


Any XSLT transformer requires a well-formed XML document and a valid XSLT stylesheet as inputs. You have to pass the name of the CSV file to the XSLT stylesheet as a parameter, as you already did with the xsl:param element.


Regards,
Sorin
sorin
 
Posts: 3967
Joined: Fri Mar 28, 2003 2:12 pm

Postby Eiríkr » Tue Nov 20, 2007 6:24 pm

sorin wrote:Any XSLT transformer requires a well-formed XML document and a valid XSLT stylesheet as inputs. You have to pass the name of the CSV file to the XSLT stylesheet as a parameter, as you already did with the xsl:param element.


Thank you, Sorin. I was wondering whether there might be some sort of processing instruction to tell the parsing engine to handle the source document differently. Your comment, and thinking it through further, tell me that any such processing instruction would probably have to be in the source document itself anyway. That clearly won't work in my case, where the source .csv file is not known before runtime and needs to be read-only to boot. So instead, I'll have to come up with some way of programmatically changing the specific xsl:param attribute value just before running the transformation, and simply handle that side of things via some different engine.

Cheers,

Eiríkr
Eiríkr
 
Posts: 7
Joined: Mon Nov 19, 2007 6:53 pm
Location: Puget Sound

Re: Can Oxygen Convert Tab Delimited To XML?

Postby KermitTensmeyer » Sat Mar 08, 2008 12:38 am

The simple solution is a two-step process.

1) Convert the CSV file to an Excel spreadsheet (open an empty spreadsheet, import the CSV file).

2) In oXygen, use [File]/[Import] to import the spreadsheet, which brings up a nice interface.
(It is most useful if the first row of the spreadsheet contains column labels, which can then become the element tags in the converted XML file.)

Change the root and row labels as needed, and convert!
KermitTensmeyer
 
Posts: 7
Joined: Sat Mar 08, 2008 12:03 am

Re: Can Oxygen Convert Tab Delimited To XML?

Postby george » Sat Mar 08, 2008 9:08 am

Eiríkr wrote:[...] So instead, I'll have to come up with some way of programmatically changing the specific xsl:param attribute value just before running the transformation, and simply handle that side of things via some different engine.


The filename is a parameter, which means you can specify its value before transforming. The value specified in the code is a default, which is used when you do not specify the parameter.
In oXygen you can specify the parameter from the transformation scenario dialog; see the Parameters button on the XSLT tab. If you have problems configuring that, let us know.
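
As a small illustration of the difference, assuming an XSLT 2.0 processor such as Saxon: a stylesheet parameter declared with a select attribute has a default, while one declared with required="yes" has none, so the transformation stops with an error if the caller does not supply a value. The second form is an optional variation shown here as a sketch. The stylesheet posted earlier in this thread does not use it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <!-- No default: the caller must supply a value for doc -->
  <xsl:param name="doc" required="yes"/>
  <!-- Default value: used only when the caller does not supply enc -->
  <xsl:param name="enc" select="'UTF-8'"/>

  <xsl:template match="/">
    <file>
      <!-- Copies the raw text of the file named by doc into the result,
           just to show that the parameter value is what gets read -->
      <xsl:value-of select="unparsed-text($doc, $enc)"/>
    </file>
  </xsl:template>
</xsl:stylesheet>
```

Making the filename required avoids silently reading a leftover default file when the parameter was meant to be set for each run.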

Best Regards,
George
George Cristian Bina
george
Site Admin
 
Posts: 2076
Joined: Thu Jan 09, 2003 2:58 pm

