Useful Schematron checks for DITA authoring

Posted: Wed Mar 18, 2020 10:11 pm
by chrispitude
Hi all,

I wanted to start a discussion thread where we can share useful Schematron checks for DITA authoring.

Here's a check that reports text elements that begin or end with whitespace:

Code: Select all

  <rule context="p|ph|codeph|filename|indexterm|xref|user-defined|user-input" role="warning">
    <let name="firstNodeIsElement" value="node()[1] instance of element()"/>
    <let name="lastNodeIsElement" value="node()[last()] instance of element()"/>
    <report test="(not($firstNodeIsElement) and matches(.,'^\s',';j')) or (not($lastNodeIsElement) and matches(.,'\s$',';j'))">Textual elements should not begin or end with whitespace.</report>
  </rule>
You can set the context to the list of elements to check. This check has some extra machinery to avoid false reporting of elements like this:

Code: Select all

<p><xref ...> provides more information on this.</p>
where the <xref> is empty in the source, so the string value of the <p> begins with the space that follows the <xref>. Without the element tests, this would be reported falsely.
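
To try the rule on its own, it has to sit inside a schema with an XPath 2.0 query binding, because "instance of" and matches() are not available in XPath 1.0. Below is a minimal sketch of such a wrapper. The pattern id is a made-up name, and the ';j' flag is, to the best of my knowledge, a Saxon extension that selects the Java regex engine, so other processors may reject it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the pattern id "whitespace_check" is hypothetical. -->
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <pattern id="whitespace_check">
    <rule context="p|ph|codeph|filename|indexterm|xref|user-defined|user-input" role="warning">
      <!-- true when the first/last child node is an element rather than text -->
      <let name="firstNodeIsElement" value="node()[1] instance of element()"/>
      <let name="lastNodeIsElement" value="node()[last()] instance of element()"/>
      <report test="(not($firstNodeIsElement) and matches(.,'^\s',';j')) or (not($lastNodeIsElement) and matches(.,'\s$',';j'))">Textual elements should not begin or end with whitespace.</report>
    </rule>
  </pattern>
  <!-- Reported:     <p> leading space</p>   and   <p>trailing space </p>
       Not reported: <p><xref keyref="k"/> provides more information.</p> -->
</schema>
```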

Re: Useful Schematron checks for DITA authoring

Posted: Thu Mar 19, 2020 9:33 am
by Radu
Hi Chris,

Adding a reference to the Schematron rules we use in the Oxygen User's Guide:

https://github.com/oxygenxml/userguide/ ... DITA/rules

Most of our Schematron rules are here:

https://github.com/oxygenxml/userguide/ ... vanced.sch

Besides the Schematron rules, we also have a style guide: a separate DITA map whose topics are written by our technical writers and describe the rules for writing the documentation.
An older blog post describes some of the rules:

https://blog.oxygenxml.com/topics/SchematronBCs.html

Also George Bina created this project:

https://github.com/oxygenxml/dim

which has some sample rules contributed by Comtech and attempts to generate the Schematron rules, using XSLT, from the DITA topics that describe them.

Regards,
Radu

Re: Useful Schematron checks for DITA authoring

Posted: Tue Mar 31, 2020 4:09 pm
by chrispitude
Here is a check that reminds writers to populate cross-book links with reference text:

Code: Select all

<pattern id="refs">
  <rule context="link|xref" role="error">
    <report test="not(node()) and contains(@keyref, '.')">Empty cross-book reference; please add the target text.</report>
  </rule>
</pattern>

Re: Useful Schematron checks for DITA authoring

Posted: Wed Apr 01, 2020 8:31 am
by Radu
Hi,

It's a good rule if you use key scopes only for cross-publication references. If you also use them for internal links, it would report errors even when the processor is able to come up with titles for the links by itself.

Regards,
Radu
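
One possible way to handle the internal-link case is to report only when the key scope is one that is known to point at another publication. The following is a sketch under that assumption. The scope names and the pattern id are placeholders, and the sequence literal requires an XPath 2.0 query binding.

```xml
<pattern id="refs_scoped">
  <!-- Hypothetical scope names: list the key scopes that refer to other books. -->
  <let name="cross_book_scopes" value="('another_book', 'third_book')"/>
  <rule context="link|xref" role="error">
    <!-- substring-before() yields the first scope segment of the keyref, or '' if there is no '.' -->
    <report test="not(node()) and substring-before(@keyref, '.') = $cross_book_scopes">Empty cross-book reference; please add the target text.</report>
  </rule>
</pattern>
```

With this variant, an empty <xref keyref="another_book.some_key"/> is reported, while an empty <xref keyref="local_scope.some_key"/> is left for the processor to resolve.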

Re: Useful Schematron checks for DITA authoring

Posted: Wed Feb 17, 2021 1:49 am
by chrispitude
If you keep all your DITA content inside a dita/ directory, here is a check that reports any @href reference that points to a file above the dita/ directory level:

Code: Select all

<!-- compute how many directory levels exist past '/dita/' (or -1 if it doesn't exist) -->
<let name="this_file_depth" value="if (contains(base-uri(), '/dita/')) then (count(tokenize(substring-after(base-uri(), '/dita/'), '/'))-1) else (-1)"/>

<!-- make sure there aren't more '../' than our directory levels past '/dita/' -->
<pattern id="check_depth">
  <rule context="*[contains(@href, '../') and not(@scope = 'external')]" role="error">
    <report test="($this_file_depth >= 0) and ((count(tokenize(@href, '\.\./'))-1) > $this_file_depth)">@href refers to a file outside the current dita/ directory.</report>
  </rule>
</pattern>
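
As a worked example of the depth arithmetic, here are a few references as they might appear in one topic. The file path and the target file names are invented for illustration.

```xml
<!-- Assumed topic location: file:/home/me/project/dita/topics/install/setup.dita
     substring-after(base-uri(), '/dita/') is 'topics/install/setup.dita',
     tokenize(..., '/') returns 3 tokens, so $this_file_depth is 2. -->

<!-- Two '../' steps: tokenize(@href, '\.\./') returns 3 tokens, 3 - 1 = 2,
     which is not greater than 2, so this is not reported. -->
<xref href="../../shared/legal.dita"/>

<!-- Three '../' steps: 4 tokens, 4 - 1 = 3, which is greater than 2,
     so this is reported. -->
<image href="../../../assets/logo.png"/>

<!-- scope="external": the rule context excludes it, so it is never tested. -->
<xref href="../../../elsewhere/page.html" scope="external" format="html"/>
```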

Re: Useful Schematron checks for DITA authoring

Posted: Wed Feb 17, 2021 9:23 am
by Radu
Hi Chris,

Thanks for posting your custom Schematron check; maybe others will find it useful.

Regards,
Radu

Re: Useful Schematron checks for DITA authoring

Posted: Tue Jun 15, 2021 10:34 pm
by chrispitude
Some of our writers kept their Oxygen project directory in their Microsoft OneDrive folder (Windows only). Unfortunately, OneDrive's aggressive filesystem locking prevents Oxygen's Git plugin from working correctly.

To resolve this, I added the following check to our map-level and topic-level Schematron files:

Code: Select all

<pattern id="onedrive_check">
 <rule context="/" role="error">
  <report test="contains(base-uri(), 'OneDrive')">Oxygen projects using Git should not be placed in OneDrive folders.</report>
 </rule>
</pattern>

Re: Useful Schematron checks for DITA authoring

Posted: Wed Jun 16, 2021 9:48 pm
by chrispitude
One of our writers got confused in the <mapref> configuration window and created a <mapref> that does not reference anything (no @keyref or @href):

Code: Select all

<mapref format="ditamap" keyscope="another_book" navtitle="another_book"
      processing-role="resource-only" scope="peer"/>
I added the following check to catch this:

Code: Select all

<rule context="mapref" role="error">
  <report test="not(@href or @keyref) and (not(@navtitle) or @processing-role='resource-only')">'mapref' does not reference any content or provide a @navtitle.</report>
</rule>
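
To show how the test behaves, here are four sample <mapref> elements with the expected outcome for each. The href value and the "Appendices" title are invented for illustration.

```xml
<!-- Reported: nothing is referenced and the element is resource-only. -->
<mapref format="ditamap" keyscope="another_book" navtitle="another_book"
      processing-role="resource-only" scope="peer"/>

<!-- Not reported: @href is present (hypothetical file name). -->
<mapref href="another_book.ditamap" format="ditamap" keyscope="another_book"
      processing-role="resource-only" scope="peer"/>

<!-- Reported: nothing is referenced and there is no @navtitle. -->
<mapref format="ditamap"/>

<!-- Not reported: no target, but a @navtitle with the default processing role. -->
<mapref navtitle="Appendices"/>
```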

Re: Useful Schematron checks for DITA authoring

Posted: Thu Jun 09, 2022 8:31 pm
by chrispitude
We found that when the following CSS property is applied to make long words breakable in table cells:

Code: Select all

.table { word-break: break-word; }
and a table has a mix of fixed-value and proportional-allocation width values:

Code: Select all

<tgroup cols="3">
  <colspec colname="c1" colnum="1" colwidth="2in"/>
  <colspec colname="c2" colnum="2" colwidth="1*"/>
  <colspec colname="c3" colnum="3" colwidth="2*"/>
then the proportional allocations greedily consume the width, causing the fixed-width column to be compressed in the HTML5 and WebHelp outputs:

[Image: image.png]

(Ignoring a fixed column width seems like a bug to me, but Firefox, Chrome, and Edge all do it.)

To help writers avoid this condition, we added the following Schematron check:

Code: Select all

<pattern id="tables_error">
  <rule context="tgroup" role="error">
    <report test="colspec/@colwidth[contains(., '*')] and colspec/@colwidth[matches(., '\d') and not(contains(., '*'))]">Do not mix proportional ("*") widths with fixed widths; use blank values for automatically allocated widths.</report>
  </rule>
</pattern>
Here is a test case for the browser behavior (it does not include the Schematron check), if anyone wants to experiment with it:

Attachment: html5_table_column_widths.zip (3.42 KiB)
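
For reference, here are two column layouts that the check above does not report. This is only a sketch of what passes the Schematron test; the widths are arbitrary, the table body is omitted, and how each layout renders in a browser is not shown here.

```xml
<!-- All proportional: no fixed width is present, so nothing is reported. -->
<tgroup cols="3">
  <colspec colname="c1" colnum="1" colwidth="2*"/>
  <colspec colname="c2" colnum="2" colwidth="1*"/>
  <colspec colname="c3" colnum="3" colwidth="2*"/>
  <!-- thead/tbody omitted -->
</tgroup>

<!-- One fixed width, the others left blank (automatic): no '*' is present,
     so nothing is reported. -->
<tgroup cols="3">
  <colspec colname="c1" colnum="1" colwidth="2in"/>
  <colspec colname="c2" colnum="2"/>
  <colspec colname="c3" colnum="3"/>
  <!-- thead/tbody omitted -->
</tgroup>
```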

Re: Useful Schematron checks for DITA authoring

Posted: Wed Jul 06, 2022 9:06 pm
by chrispitude
I noticed today that my writers sometimes made typographical errors in the units of their @colwidth (table column width) values, such as "i" instead of "in". Here is a Schematron check to report invalid units:

Code: Select all

<pattern id="tables_error">
  <rule context="tgroup" role="error">
    <report test="colspec/@colwidth[matches(., '\d(?!\d|\.|cm|em|in|mm|pi|pt|px|[*%\s]|$)', ';j')]">Valid column width units are blank (automatic), * (proportional), in, pt, px, em, cm, mm, and pi.</report>
  </rule>
</pattern>
Although the CALS table specification lists only certain units as supported, PDF Chemistry and WebHelp (browsers) support some additional CSS-based units that I permitted as valid in the check.
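
To illustrate what the lookahead accepts and rejects, here are some sample @colwidth values, each considered on its own against the regular expression above. The values are invented. The last one shows a limitation: the pattern checks only that a valid unit starts right after the digit, not that the unit ends there.

```xml
<colspec colname="c1" colwidth="2in"/>    <!-- not reported: digit followed by 'in' -->
<colspec colname="c2" colwidth="1.5cm"/>  <!-- not reported: '1' is followed by '.', '5' by 'cm' -->
<colspec colname="c3" colwidth="1*"/>     <!-- not reported: proportional width -->
<colspec colname="c4" colwidth="10"/>     <!-- not reported: the last digit is at the end of the value -->
<colspec colname="c5" colwidth="2i"/>     <!-- reported: 'i' alone is not a listed unit -->
<colspec colname="c6" colwidth="3pc"/>    <!-- reported: 'pc' is not in the list -->
<colspec colname="c7" colwidth="2inch"/>  <!-- not reported: the lookahead sees 'in' and stops checking -->
```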