Customize formatting behavior
Posted: Fri Feb 06, 2015 2:19 pm
by misty
We recently moved our XML source to Git and had the "smart" idea to run an XML formatting script across it. It looked fine in spot checks, so we imported the result into Git. Afterward we noticed some whitespace quirks. Some of these are real, and others seem to come from the way that oXygen displays the XML in Author mode. I'm hoping there may be some way for me to use oXygen to fix the real problems and maybe configure it differently to work around the false ones. Let me explain.
First, the "fake" problem. If you have code like this:
Code:
<p>
The paragraph text here.
</p>
oXygen shows the carriage return after the <p> tag as an extra space, and shows an extra space before the </p> tag, in Author mode. This is disconcerting but has no actual effect in the transform. It seems that it might be a bug in the way that oXygen is displaying the WYSIWYG output. Maybe we can tweak this behavior somehow.
Now, the real problem that I'm finding it hard to fix is this type of thing:
Code:
<li id="li_nw1_nw1_fn">When you are satisfied with the assignments, click
<b>Continue</b>
. The Database Setup page displays.</li>
We would really like this to show up as:
Code:
<li id="li_nw1_nw1_fn">When you are satisfied with the assignments, click
<b>Continue</b>. The Database Setup page displays.</li>
This happens in other places too, where our formatter evidently didn't understand the difference between block and inline elements. Is there any way I can configure oXygen to fix these types of things, preferably in bulk? Or do you know of any other tool?
Re: Customize formatting behavior
Posted: Mon Feb 09, 2015 1:58 pm
by adrian
Hello,
1. The "fake" problem is rather real, but it may be insignificant (or not obvious) for the published result.
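If your initial paragraph had its text directly after the opening <p> tag and directly before the closing </p> tag, all on one line, it is not equivalent (for Oxygen and for publishing) to what you have now, with the text on its own line between the two tags.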
For elements that are not "space preserve", Oxygen normalizes whitespace when switching to Author mode (this is not a bug). Normalizing means that each sequence of whitespace characters (spaces, tabs and line breaks) is replaced with a single space character. This is the reason you now see an extra space before and after the mentioned text.
2. Regarding the "real" problem, you could try to use the Oxygen formatting (Document > Source > Format and Indent) to fix that. For me it looks like this after formatting:
Code:
<li id="li_nw1_nw1_fn">When you are satisfied with the assignments, click <b>Continue</b> .
The Database Setup page displays.</li>
<b>Continue</b> is on the same row due to the longer line width, 100 (default), allowed by the Oxygen formatting options.
You can adjust the formatting options in Options > Preferences, on the Editor > Format page and its XML subpage.
Regards,
Adrian
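For the bulk fix asked about above, a small script outside Oxygen can pull stray punctuation back against the closing inline tag. The following is only a minimal Python sketch, not an Oxygen feature: the function name and the list of inline elements (b, i, q, term) are assumptions to adjust for your content, and it works on the raw text with a regular expression rather than a real XML parser.

```python
import re

# Assumed set of inline elements; extend it to match your DTD or schema.
INLINE_TAGS = ("b", "i", "q", "term")

# A closing inline tag, then whitespace (including line breaks), then punctuation.
_STRAY_PUNCT = re.compile(
    r"(</(?:%s)>)\s+([.,;:!?])" % "|".join(INLINE_TAGS)
)


def attach_punctuation(xml_text):
    """Remove whitespace between a closing inline tag and the punctuation after it."""
    return _STRAY_PUNCT.sub(r"\1\2", xml_text)


# The list item from the first post, as the formatter left it.
sample = (
    '<li id="li_nw1_nw1_fn">When you are satisfied with the assignments, click\n'
    "<b>Continue</b>\n"
    ". The Database Setup page displays.</li>"
)
fixed = attach_punctuation(sample)
print(fixed)
```

Because this removes a space that is really present in the source, it changes the content and not just the layout, so it is worth reviewing the resulting diff before committing it.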
Re: Customize formatting behavior
Posted: Tue Feb 10, 2015 12:18 am
by misty
Hi Adrian, thanks for answering my questions. I'd like to explore the question of the <p> tags on their own line a bit further. Even OASIS has the opening <p> on its own line in their example in the reference for the <p> element itself:
Code:
<p>
It is probable that <q>temporary</q> or <q>new</q> stars, as these
wonderful apparitions are called, really are <term>conflagrations</term>;
not in the sense of a bonfire or a burning house or city, but in that of
a sudden eruption of <i>inconceivable</i> heat and light, such as would
result from the stripping off the shell of an encrusted sun or the crashing
together of two mighty orbs flying through space with a hundred times
the velocity of the swiftest cannon-shot.</p>
(from OASIS)
I really don't think that they intended there to be an extra space after the opening <p> tag in this example. There is no requirement that <p> be on the same line as the sentence starting the paragraph. In fact, I don't think space (beyond a single space between words) is supposed to be significant inside the <p> tag. Am I mistaken?
Re: Customize formatting behavior
Posted: Wed Feb 11, 2015 5:10 pm
by adrian
Hi,
I can't comment on the OASIS choice of example, but it is just that, an example.
It's true that for DITA, the publishing seems to trim the leading and trailing whitespace of the text, so it is considered insignificant.
However, even though the space may not be intended to be shown for the DITA p element, since Oxygen is WYSIWYM (What you see is what you mean) and not WYSIWYG, it doesn't have to perfectly mimic the intended published output, just preserve its meaning.
Oxygen keeps a 1:1 mapping between the contents of the XML source (Text mode) and the visual editing mode (Author). That's the reason it will still show you the space: it is there in Text mode (the XML source). Removing it in Author mode means you're also removing it from the XML source (and vice versa).
Let's agree to call this a limitation that Oxygen enforces in order to keep the XML source document as close as it can to the original source after editing it in the visual mode.
Regards
Adrian
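The difference between the two behaviours discussed in this thread can be illustrated with a few lines of Python. This is only a model of the descriptions given here, not Oxygen's or the DITA publishing code: `normalize_whitespace` is a hypothetical helper that follows the "sequence of whitespace becomes a single space" rule, and the `strip()` call is an assumed stand-in for the trimming that publishing seems to do.

```python
import re


def normalize_whitespace(text):
    """Collapse each run of spaces, tabs and line breaks into one space."""
    return re.sub(r"[ \t\r\n]+", " ", text)


# Text content of the <p> element from the first post:
# a line break after <p> and another one before </p>.
source_text = "\nThe paragraph text here.\n"

# What the normalization rule yields: the line breaks become single spaces.
author_view = normalize_whitespace(source_text)

# Assumed model of publishing: leading and trailing whitespace is trimmed.
published_view = author_view.strip()

print(repr(author_view))
print(repr(published_view))
```

The first value keeps one space at each end, which matches what Author mode displays; the second has none, which matches the published result.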
Re: Customize formatting behavior
Posted: Sun Mar 15, 2015 10:01 am
by jrussell
It sounds like a good opportunity for a pre-commit hook, for organizations using Git for source control. If it were an author choice or company policy not to include spacing in their source immediately next to paragraph tags, such as:
<p> Some text.
More text. </p>
then those spaces could be removed automatically prior to commit.
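As a rough idea of what such a hook could run, here is a minimal Python sketch. The function names are hypothetical, it assumes UTF-8 files, and it uses regular expressions rather than an XML parser so that the rest of the source is left byte-for-byte alone; it would need extra care if <p> can appear inside space-preserving elements or if attribute values contain ">".

```python
import re
from pathlib import Path

# Whitespace right after an opening <p ...> tag (not <p/>, <pre>, <ph>, ...).
_AFTER_OPEN = re.compile(r"(<p(?:\s[^<>]*)?(?<!/)>)\s+")
# Whitespace right before a closing </p> tag.
_BEFORE_CLOSE = re.compile(r"\s+(</p>)")


def trim_paragraph_edges(xml_text):
    """Drop whitespace directly inside the <p> start and end tags."""
    xml_text = _AFTER_OPEN.sub(r"\1", xml_text)
    return _BEFORE_CLOSE.sub(r"\1", xml_text)


def fix_file(path):
    """Rewrite one file in place; return True if its content changed."""
    file_path = Path(path)
    original = file_path.read_text(encoding="utf-8")  # assumes UTF-8 sources
    cleaned = trim_paragraph_edges(original)
    if cleaned != original:
        file_path.write_text(cleaned, encoding="utf-8")
        return True
    return False


# The example from this post.
example = "<p> Some text.\nMore text. </p>"
print(trim_paragraph_edges(example))
```

A script in .git/hooks/pre-commit could then call `fix_file` on each staged XML file (for example the paths listed by `git diff --cached --name-only`) and re-stage any file that changed with `git add`.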