Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 1:53 am
by david_himself
Hi

I wrote an XSLT to edit a folder of 400+ XML documents. It was meant to adjust many elements inside teiHeader/fileDesc without changing anything substantive in the rest of the teiHeader or in the text portion. When I added indent="yes" to the XSLT, it pretty-printed the whole header of each XML file, which was fine, but it also reformatted the text portion, which was a disaster. Not only was the text now very hard to read and work with, but I discovered much too late -- after a lot of further work -- that the addition or removal of whitespace between tags meant that the processed text had occasional missing interword spaces and, conversely, unwanted intraword spaces. I assume I'll have to revert and start again. If so, I need to be sure of not corrupting my texts.

Is it possible to use indent="yes" in an XSLT file but confine its operation to the teiHeader only, or even to just those parts of the file affected by specific templates? Alternatively, could I first prettify a folder of files with Format and Indent, again formatting ONLY their teiHeaders, and then use an XSLT with indent="no" that copies unchanged elements completely faithfully -- in this case, much of the header and the whole text portion? Thanks for any pointers.

D
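
One way to be sure the texts are not corrupted is to compare the raw text portion of each file before and after processing. The following is only a sketch in Python, not something proposed in the thread. The function names are made up for illustration, and it assumes each file has a single outermost, unprefixed <text> element and is read in as a string.

```python
import re
from pathlib import Path

# Matches from the first <text ...> start tag to the last </text> end tag.
# The lookahead stops header elements such as <textClass> from matching.
TEXT_SPAN = re.compile(r"<text(?=[\s>]).*</text>", re.DOTALL)


def text_portion(xml_source: str) -> str:
    """Return the raw <text>...</text> span of a TEI document."""
    match = TEXT_SPAN.search(xml_source)
    if match is None:
        raise ValueError("no <text> element found")
    return match.group(0)


def text_unchanged(before: str, after: str) -> bool:
    """True if the <text> portion is character-for-character identical."""
    return text_portion(before) == text_portion(after)


def changed_files(before_dir: str, after_dir: str) -> list[str]:
    """List the file names whose <text> portion differs between two folders.

    Assumes (hypothetically) that both folders hold UTF-8 files with the
    same names, e.g. a backup folder and the processed folder.
    """
    changed = []
    for before_path in sorted(Path(before_dir).glob("*.xml")):
        after_path = Path(after_dir) / before_path.name
        before = before_path.read_text(encoding="utf-8")
        after = after_path.read_text(encoding="utf-8")
        if not text_unchanged(before, after):
            changed.append(before_path.name)
    return changed
```

An empty list from changed_files means the text portions survived untouched. Because read_text normalises line endings, a change from CRLF to LF alone would not be reported.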

Re: Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 9:22 am
by Radu
Hi David,

I do not know a way to do this using XSLT. Maybe you can use some kind of scripting language to extract the TEI header separately, indent it using XSLT and then push it back in the XML file.

Regards,
Radu
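
The scripting route suggested above could look roughly like the sketch below. It uses Python in place of XSLT for the indenting step, which is a substitution rather than what was proposed. It cuts the teiHeader out of the file as a string, indents only that fragment, and splices it back, so every character outside the header is left exactly as it was. The function name is hypothetical. The sketch assumes Python 3.9 or later, a single unprefixed teiHeader per file, and no namespace declarations or prefixed element names inside the header.

```python
import re
import xml.etree.ElementTree as ET

# Matches the first <teiHeader ...>...</teiHeader> span in the file.
HEADER_SPAN = re.compile(r"<teiHeader(?=[\s>]).*?</teiHeader>", re.DOTALL)


def indent_header_only(xml_source: str, space: str = "  ") -> str:
    """Pretty-print the teiHeader and leave the rest of the file untouched."""
    match = HEADER_SPAN.search(xml_source)
    if match is None:
        return xml_source  # nothing to do

    # Keep comments and processing instructions, which the default
    # ElementTree parser would silently drop.
    parser = ET.XMLParser(
        target=ET.TreeBuilder(insert_comments=True, insert_pis=True)
    )
    header = ET.fromstring(match.group(0), parser=parser)

    # ET.indent only rewrites whitespace-only gaps between elements,
    # so text content inside header elements is kept as it is.
    ET.indent(header, space=space)
    pretty = ET.tostring(header, encoding="unicode")

    # Splice the indented header back between the untouched parts.
    return xml_source[:match.start()] + pretty + xml_source[match.end():]
```

Two caveats: the header is re-serialized, so details such as attribute quoting and empty-element syntax may be normalised there, and a header that uses named entities would need extra handling. Running the result through a byte-for-byte comparison of the text portion on a few sample files first would be prudent.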

Re: Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 9:48 am
by david_himself
Thanks, Radu. I've found this useful discussion: http://tei-l.970651.n3.nabble.com/oXyge ... 46233.html

I'm currently thinking of doing the following.

(1) Reverting my XMLs to the latest version before the XSLT edit which included indent="yes".
(2) Removing all tabs and blank lines from the TEI headers.
(3) Adding the element text to the Preserve space list in Options > Preferences > Editor > Format > XML.
(4) Running Document > Source > Format and Indent element. That seems to pretty-print the header but leave the text portion untouched, as far as I can see at a quick look.
(5) Running the XSLT without indent="yes".

Uh-oh: no macros in Oxygen. How would I do step (2) and especially step (4) on a whole bunch of files?

best
David

Re: Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 10:41 am
by david_himself
PS. Or globally edit the text element in each XML to add the attribute xml:space="preserve" (and adjust schema accordingly), then allow the original XSLT to run with indent="yes". That looks a (deceptively?) simple fix. Is there a catch?

David
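
For what it is worth, the XSLT 2.0/3.0 serialization rules say that indentation must not insert whitespace in any part of the result where xml:space="preserve" is in scope. XSLT 1.0 gives no such guarantee, so it is worth testing on one file first. A minimal Python sketch of the global edit follows. The function names are hypothetical, and it assumes UTF-8 files, an unprefixed <text> element, and no ">" character inside that element's attribute values.

```python
import re
from pathlib import Path

# First <text ...> start tag; the lookahead skips <textClass>, <textDesc>, etc.
TEXT_START_TAG = re.compile(r"<text(?=[\s>])([^>]*)>")


def preserve_text_space(xml_source: str) -> str:
    """Add xml:space="preserve" to the first <text> start tag, if missing."""
    match = TEXT_START_TAG.search(xml_source)
    if match is None:
        return xml_source
    attrs = match.group(1)
    # Leave the file alone if the attribute is already there, or if the
    # element is self-closing and so has no content to protect.
    if "xml:space" in attrs or attrs.endswith("/"):
        return xml_source
    new_tag = f'<text{attrs.rstrip()} xml:space="preserve">'
    return xml_source[:match.start()] + new_tag + xml_source[match.end():]


def preserve_text_space_in_folder(folder: str) -> int:
    """Apply the edit to every .xml file in a folder; return the count changed."""
    changed = 0
    for path in sorted(Path(folder).glob("*.xml")):
        # newline="" keeps each file's own line endings untouched.
        with open(path, encoding="utf-8", newline="") as handle:
            source = handle.read()
        updated = preserve_text_space(source)
        if updated != source:
            with open(path, "w", encoding="utf-8", newline="") as handle:
                handle.write(updated)
            changed += 1
    return changed
```

Only the outermost <text> needs the attribute, since xml:space applies to all descendants unless overridden. Working on a copy of the folder is the safer choice, because the files are rewritten in place.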

Re: Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 12:22 pm
by Radu
Hi David,

You can format and indent a set of files using Oxygen:

https://www.oxygenxml.com/doc/versions/ ... odes2.html

Your current Format and Indent settings should be applied to each of them.
If you want to add a certain attribute to a certain element for lots of files, you can go to the Oxygen main menu Tools->XML Refactoring->"Add/change attribute".

Regards,
Radu

Re: Confine pretty-printing to teiHeader

Posted: Wed Apr 24, 2019 1:03 pm
by david_himself
Thank you for quick responses. Should be OK now, touch wood.

D