Can rejectElements specifications be interpreted as true schema disallowance?

Posted: Fri Jan 21, 2022 8:30 pm
by chrispitude
For content structure and formatting reasons, we specialize <ol> into <procedure> (and <li> into <step>):

 <elementdomain filename="snpsDomain.rng" domain="snps-d">
  <specialize elements="procedure" from="ol"/>
  <specialize elements="step" from="li"/>
 </elementdomain>
Then, in cc_config_ext.xml, we try to disallow list-item elements in procedure elements and vice versa:

<!-- configure some content model restrictions -->
<elementProposals path="ol" rejectElements="step"/>
<elementProposals path="ul" rejectElements="step"/>
<elementProposals path="procedure" rejectElements="li"/>
However, the rejectElements specification only seems to affect element insertion; it is not treated as a true schema disallowance. For example, if I attempt to insert a <step> into an <ol>, the insertion is refused in that context, but then the search for the next-nearest valid context creates another <ol> and puts the <step> there:

insert_step_in_ol.gif

Note that there is no red violation highlighting for <step> in <ol>. It seems inconsistent to me to enforce the restriction in some cases (direct element insertion) but not in others (the next-nearest-valid context search, and existing elements once inserted).

I'm sure Oxygen is operating as designed here, but could this consistency be improved by treating rejectElements specifications as true schema disallowances? Or are there technical reasons why this cannot be done, like being unable to modify the internal schema representation in memory?
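To illustrate what a "true disallowance" check could report outside the editor, here is a minimal Python sketch. It is not how Oxygen works internally. It assumes that each path value is a plain element name, that rejectElements is a space-separated list of element names, and that only the direct parent matters; the <config> wrapper element and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for cc_config_ext.xml. The real file may declare a
# namespace on its root element, so tags are compared by local name below.
CONFIG = """<config>
  <elementProposals path="ol" rejectElements="step"/>
  <elementProposals path="ul" rejectElements="step"/>
  <elementProposals path="procedure" rejectElements="li"/>
</config>"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def load_rules(config_xml):
    """Map each parent element name to the set of child names it rejects.

    Assumption: 'path' is a plain element name and 'rejectElements' is a
    space-separated list of names.
    """
    rules = {}
    for node in ET.fromstring(config_xml).iter():
        if local_name(node.tag) != "elementProposals":
            continue
        path = node.get("path")
        rejected = node.get("rejectElements")
        if path and rejected:
            rules.setdefault(path, set()).update(rejected.split())
    return rules


def find_violations(doc_xml, rules):
    """Return (parent, child) pairs where a rejected child sits in its parent."""
    violations = []
    for parent in ET.fromstring(doc_xml).iter():
        rejected = rules.get(local_name(parent.tag), ())
        for child in parent:
            if local_name(child.tag) in rejected:
                violations.append((local_name(parent.tag), local_name(child.tag)))
    return violations
```

Calling find_violations() on a topic that contains a <step> directly inside an <ol> would flag that pair, which is the case the editor currently leaves unhighlighted.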

Here is a testcase, although it requires a DITA-OT plugin (included) to be installed:

oxygen_rejectElements_disallowance.zip

Thanks!

Re: Can rejectElements specifications be interpreted as true schema disallowance?

Posted: Tue Jan 25, 2022 5:06 pm
by chrispitude
Some follow-up thoughts...

Schema disallowances (disallowing an element in a containing element) are time-consuming and error-prone to implement in DTD and RelaxNG. If rejectElements could be treated as a true schema disallowance (in the editor, book validation, etc.), it would provide the following advantages:
  • Disallowances could be implemented using rejectElements instead of custom schemas and document-type shells.
  • Even if schema modification is ultimately desired, rejectElements could be used to prototype and test disallowances on existing content, allowing them to be fine-tuned before spending the effort to implement them.
If rejectElements was never intended to be a true disallowance, then perhaps a new disallowElements attribute could be introduced that more clearly describes itself as a true schema disallowance.

However, maybe this is technically impractical to implement. If Oxygen's schema validation comes from a validation engine that directly consumes the DTD or RelaxNG files on disk, it would be difficult to intercept and modify that processing. But if it is possible, I see a lot of value in it.

Re: Can rejectElements specifications be interpreted as true schema disallowance?

Posted: Wed Jan 26, 2022 12:24 pm
by Radu
Hi Chris,
I added an internal issue to look into this:

EXM-49784 Content completion configuration reject elements not used when deciding possible ancestors in applied strategies

At a first glance through our code, when trying to create valid content, Oxygen will sometimes attempt to detect which parent or ancestor elements need to be created for a certain inserted XML fragment. From what I see in our code, the content completion configuration filter is not used at all when making these decisions, and I think it could be used.

Regards,
Radu

Re: Can rejectElements specifications be interpreted as true schema disallowance?

Posted: Wed Jan 26, 2022 3:33 pm
by chrispitude
Hi Radu,

Thank you for filing the issue!

I should explain the reason for my current attention to rejectElements. I am seeing a growing number of cases where writers insert elements in unusual and unintended places. My guess is that many novice writers keep tags turned off in the editor, which leads to unusual insertions such as <fig> in <p> elements or bare text intermixed with block elements in <li> elements. The result looks fine in the editor with tags off, but the structure is too uncontrolled.

My RelaxNG schema creator at

https://github.com/chrispy-snps/DITA-plugin-utilities/

has a <disallow> specification for disallowing elements in other elements. It attempts to do this subtraction in the RelaxNG domain by recursing into references and uniquifying them back up as needed, but the support is fragile and needs some work. Until I can improve the <disallow> support, I am hoping to use rejectElements to bring some order to the authoring chaos.
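The "recurse into references and uniquify" idea can be sketched on a toy grammar. This is an illustration only, not the implementation in the repository above: the grammar is modeled as a hypothetical dict from define names to lists of ("element", name) or ("ref", name) items, and the function names and the suffix convention are invented for the example.

```python
def reaches(grammar, name, element, seen=None):
    """True if the define 'name' allows 'element' directly or through refs."""
    seen = set() if seen is None else seen
    if name in seen:  # already visited; also guards against reference cycles
        return False
    seen.add(name)
    for kind, value in grammar[name]:
        if kind == "element" and value == element:
            return True
        if kind == "ref" and reaches(grammar, value, element, seen):
            return True
    return False


def disallow(grammar, define, element, suffix):
    """Return (new_grammar, new_define) with 'element' removed from 'define'.

    Defines that can reach 'element' are cloned under a uniquified name so
    that other users of the original defines are unaffected.
    """
    new_grammar = dict(grammar)
    renamed = {}

    def rewrite(name):
        if name in renamed:
            return renamed[name]
        if not reaches(grammar, name, element):
            renamed[name] = name  # unaffected define: reuse it as-is
            return name
        new_name = f"{name}.{suffix}"
        renamed[name] = new_name  # record before recursing to handle cycles
        items = []
        for kind, value in grammar[name]:
            if kind == "element":
                if value != element:
                    items.append((kind, value))
            else:
                items.append(("ref", rewrite(value)))
        new_grammar[new_name] = items
        return new_name

    return new_grammar, rewrite(define)
```

In this toy model, disallowing step in a define that references a shared list of items clones both the define and the shared list, leaving the originals in place for any other content model that still needs them. A real RelaxNG grammar has many more pattern types, which is where the fragility comes from.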

Another advantage of rejectElements is that my senior writers could take ownership of updating the cc_config_ext.xml file to add disallowances, as modifications to that file are lightweight and easy to roll out (no schema updates or DITA-OT re-integrations needed, everyone inherits them via Git immediately). Once we have a set of disallowances that we feel confident in, they could be converted to <disallow> in my RelaxNG schema creator.

And if rejectElements is treated as a true schema disallowance in Validation and Check for Completeness, then I could use Oxygen Scripting to validate *all* our books using (1) rejectElements and (2) the resulting RelaxNG schema, and compare the two sets of results to confirm that they are identical.
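The comparison step could look something like the following Python sketch. The report format is an assumption made for illustration: each validation run is reduced to a dict mapping a file path to its set of (parent, child) violations, and the function and key names are hypothetical.

```python
def compare_reports(reject_report, schema_report):
    """Return the files whose violations differ between the two runs.

    Each report maps a file path to an iterable of (parent, child) pairs.
    A file missing from one report is treated as having no violations there.
    """
    differences = {}
    for path in sorted(set(reject_report) | set(schema_report)):
        from_reject = set(reject_report.get(path, ()))
        from_schema = set(schema_report.get(path, ()))
        if from_reject != from_schema:
            differences[path] = {
                "only_rejectElements": sorted(from_reject - from_schema),
                "only_schema": sorted(from_schema - from_reject),
            }
    return differences
```

An empty result would mean the rejectElements rules and the generated schema agree on every file.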

Radu, in the details for EXM-49784, could you also mention EXM-49670 and EXM-49754 from this discussion as related issues?

post64831.html
EXM-49670 - support text() in rejectElements specifications
EXM-49754 - consider rejectElements for "insert list item" (and similar) toolbar actions

These are all related to keeping rejectElements specifications consistent across operations.

Thanks!