[oXygen-user] In-element whitespace and Author Mode

Andreas Wagner
Thu Sep 17 04:08:28 CDT 2015


Dear list,

I am sorry for bringing up yet another whitespace question. I almost
believe that in XML the devil lives in the whitespaces. Also, it is
quite possible that our problem is specific to our particular situation
and of no interest to other projects. But then again, maybe someone is
able to help nonetheless.

We work with TEI XML documents that record line breaks which, in many
cases, are not meant to represent a word boundary:

<lb n="016_011"/>que con el pre<lb break="no" rendition="#noHyphen"
    n="016_012"/>sidente o juez que reside en la prouincia: puede
<lb n="016_013"/>hazer thesoreros y receptores en su prouincia:

In order to improve readability of the XML source, all our lines begin
at the leftmost position of the line, no matter what nesting level the
current paragraph is at. The exception is lines that begin with four
spaces: these align the @n attribute with other lines while still
letting the lb element begin, without intervening whitespace, at the end
of the preceding line/word fragment.

But when I edit the document in author mode, it removes the linebreak
within the element, so that the first of the following is a very long
line and the snippet is only two lines long:

<lb_n="016_011"/>que_con_el_pre<lb_break="no"_rendition="#noHyphen"_n="016_012"/>sidente_o_juez_que_reside_en_la_prouincia:_puede
<lb n="016_013"/>hazer thesoreros y receptores en su prouincia:

(Whitespace and indenting preferences are mentioned below.)

Now our workflow relies on an external file providing links to certain
places in the TEI file:

<a href="W0004.xml#line=449;column=1">016_013</a>

Therefore it is somewhat annoying that editing the TEI leads to ("hard")
lines being drawn together and the external file increasingly pointing
to wrong places.

I understand that author mode parses the XML into a DOM tree and
re-serializes it on save, so I don't know if this behaviour can be
changed at all.
If it cannot, how would you suggest we approach this problem?
(Can we point to the relevant place based on the @n attribute of the lb
element? If we had to provide all the lbs with @xml:ids, I think it would
thwart our attempts to make the XML sources more readable. And all of
this would help us with linking the two files, but the XML file would
still end up with bad readability.)
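
To illustrate the @n-based idea, here is a minimal sketch that would
regenerate the line/column links from the TEI source instead of
maintaining them by hand. It is only an assumption about how such a
step could look, not part of our current workflow or of oXygen: the
function names (lb_positions, link_for) are made up, and it scans the
raw text with a regular expression, so it assumes that no lb tag sits
inside a comment or CDATA section and that @n values contain no ">".

```python
import re

# Matches an <lb .../> tag carrying an @n attribute, even when the tag's
# attributes continue on the next line ([^>] also matches newlines).
LB_WITH_N = re.compile(r'<lb\b[^>]*?\sn="([^"]*)"[^>]*?/>')


def lb_positions(tei_text):
    """Map each lb/@n value to the 1-based (line, column) where its tag starts."""
    positions = {}
    for match in LB_WITH_N.finditer(tei_text):
        start = match.start()
        line = tei_text.count("\n", 0, start) + 1
        # rfind returns -1 when the tag is on the first line, so the
        # subtraction yields a 1-based column in every case.
        column = start - tei_text.rfind("\n", 0, start)
        positions[match.group(1)] = (line, column)
    return positions


def link_for(filename, n_value, positions):
    """Build a link in the same form as our external file uses."""
    line, column = positions[n_value]
    return f'<a href="{filename}#line={line};column={column}">{n_value}</a>'


sample = (
    '<lb n="016_011"/>que con el pre<lb break="no" rendition="#noHyphen"\n'
    '    n="016_012"/>sidente o juez que reside en la prouincia: puede\n'
    '<lb n="016_013"/>hazer thesoreros y receptores en su prouincia:\n'
)
positions = lb_positions(sample)
print(link_for("W0004.xml", "016_013", positions))
```

Run after each editing session, something like this would keep the
external file in step with the TEI file even when lines are drawn
together, although it would not restore the source layout itself.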


Thank you for any idea,

Andreas


P.S. I have selected the options "Preserve empty lines", "Preserve text
as it is" and "Preserve line breaks in attributes" in Options |
Preferences | Editor/Format/XML and added "*" to the "Preserve Space"
Elements. I also think I have deactivated pretty printing everywhere I
could. In Editor | Edit Modes | Author | Format and indent, I have chosen
"only the modified content".


-- 
Dr. Andreas Wagner
Project "The School of Salamanca"
Academy of Sciences and Literature, Mainz
and Institute of Philosophy
Goethe University Frankfurt
http://salamanca.adwmainz.de

IGF HP 25 / R 2.455
Norbert-Wollheim-Platz 1
60629 Frankfurt am Main
Tel. +49 (0)69/798-32774
Fax  +49 (0)69/798-32794

