Importing text files & empty tags
Posted: Thu Dec 14, 2006 9:11 pm
All -
Does anyone have a recommendation for importing text files (tab-delimited exports from Excel sheets) while avoiding empty tags? I've tried changing the setting under Preferences > Database > Import, "Create empty elements for empty values", but the empty tags still show up.
Here are some samples:
Text file =
fileName dateScanned bw/color initials caption description genre firstName lastName unknown dateCreated boxNo folderNo topic1 topic2 topic3 topic4 geo1 geo2 geo3 nameSubject1 nameSubject2 nameSubject3 nameSubject4 nameSubject5 nameSubject6 nameSubject7 nameSubject8 nameSubject9 nameSubject10 temporal
aai0039 10-19-09 bw gc Students - 1921 "Group of 5 boys, 1 girl in the background" photograph Unknown 1921 61 20 Students Boys Girls "Gatlinburg, Tennessee" "Founding of Pi Beta Phi Settlement School, Gatlinburg, Tennessee, 1909-1927"
What I'd like to get:
XML =
<row>
<fileName>aai0039</fileName>
<dateScanned>10-19-09</dateScanned>
<bw_color>bw</bw_color>
<initials>gc</initials>
<caption>Students - 1921</caption>
<description>"Group of 5 boys, 1 girl in the background"</description>
<genre>photograph</genre>
<unknown>Unknown</unknown>
<dateCreated>1921</dateCreated>
<boxNo>61</boxNo>
<folderNo>20</folderNo>
<topic1>Students</topic1>
<topic2>Boys</topic2>
<topic3>Girls</topic3>
<geo1>"Gatlinburg, Tennessee"</geo1>
<temporal>"Founding of Pi Beta Phi Settlement School, Gatlinburg, Tennessee, 1909-1927"</temporal>
</row>
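To make the target concrete, here is a minimal sketch of the mapping described above, done outside the import dialog with Python's standard library. It is only an illustration under stated assumptions: the function name `tsv_to_rows` and the `<root>` wrapper element are hypothetical, the file is assumed to have a single header line, and characters that are not valid in element names are replaced with `_` (which is how `bw/color` becomes `bw_color`).

```python
import csv
import io
import re
import xml.etree.ElementTree as ET


def tsv_to_rows(tsv_text):
    """Return a <root> element with one <row> per data line, skipping empty cells.

    Hypothetical helper: the name and the <root> wrapper are assumptions.
    """
    # QUOTE_NONE keeps the literal double quotes, as in the samples above.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    header = next(reader)
    # Replace characters that are not allowed in element names.
    names = [re.sub(r"[^A-Za-z0-9_.-]", "_", h.strip()) for h in header]

    root = ET.Element("root")
    for cells in reader:
        if not cells:  # skip blank lines
            continue
        row = ET.SubElement(root, "row")
        for name, value in zip(names, cells):
            if value.strip():  # only emit an element when the cell has content
                ET.SubElement(row, name).text = value
    return root
```

Calling `ET.tostring(tsv_to_rows(text), encoding="unicode")` on the file contents would then give rows that contain only the populated columns.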
And this is what I'm actually getting when I import the file:
Reality =
<row>
<fileName>aai0039</fileName>
<dateScanned>10-19-09</dateScanned>
<bw_color>bw</bw_color>
<initials>gc</initials>
<caption>Students - 1921</caption>
<description>"Group of 5 boys, 1 girl in the background"</description>
<genre>photograph</genre>
<firstName></firstName>
<lastName></lastName>
<unknown>Unknown</unknown>
<dateCreated>1921</dateCreated>
<boxNo>61</boxNo>
<folderNo>20</folderNo>
<topic1>Students</topic1>
<topic2>Boys</topic2>
<topic3>Girls</topic3>
<topic4></topic4>
<geo1>"Gatlinburg, Tennessee"</geo1>
<geo2></geo2>
<geo3></geo3>
<nameSubject1></nameSubject1>
<nameSubject2></nameSubject2>
<nameSubject3></nameSubject3>
<nameSubject4></nameSubject4>
<nameSubject5></nameSubject5>
<nameSubject6></nameSubject6>
<nameSubject7></nameSubject7>
<nameSubject8></nameSubject8>
<nameSubject9></nameSubject9>
<nameSubject10></nameSubject10>
<temporal>"Founding of Pi Beta Phi Settlement School, Gatlinburg, Tennessee, 1909-1927"</temporal>
</row>
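If the import setting cannot be made to suppress these, one possible fallback is to strip the empty elements after the import. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `drop_empty_elements` is hypothetical, and it assumes that an element counts as empty when it has no non-whitespace text, no attributes, and no children.

```python
import xml.etree.ElementTree as ET


def drop_empty_elements(parent):
    """Remove, in place, descendants with no text, attributes, or children."""
    # Iterate over a copy, because children are removed while looping.
    for child in list(parent):
        drop_empty_elements(child)  # clean nested elements first
        has_text = bool(child.text and child.text.strip())
        if not has_text and len(child) == 0 and not child.attrib:
            parent.remove(child)
    return parent


# Usage with a shortened version of the imported row shown above.
imported = (
    "<row><fileName>aai0039</fileName><firstName></firstName>"
    "<topic4></topic4><geo1>x</geo1></row>"
)
cleaned = drop_empty_elements(ET.fromstring(imported))
print(ET.tostring(cleaned, encoding="unicode"))
```

The same function can be run on the root of a whole imported document, since it recurses into every `<row>`.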
Also, I get the same results when I import the Excel file directly instead of going through the tab-delimited file. Can anyone lend me a hand with this or offer some suggestions?
Thanks so much,
B Dyson-Smith