Schematron Rule for Trailing Spaces in Elements
Posted: Wed Jan 05, 2022 6:40 pm
I created a Schematron rule to report any element whose content begins or ends with a space character. I first limited the context to text(), as follows, but that rule also reported elements whose first or last child is an element, not just elements that start or end with a space character.
<sch:pattern id="LeadingOrTrailingSpace">
<sch:rule context="text()" role="error">
<sch:report test="starts-with(., ' ')">Elements should not begin with a leading space.</sch:report>
<sch:report test="ends-with(., ' ')">Elements should not end with a trailing space.</sch:report>
</sch:rule>
</sch:pattern>
I then changed the rule context to match all elements, and leading-space reporting now works correctly.
<sch:pattern id="LeadingOrTrailingSpace">
<sch:rule context="*[(@class, ' . ')]" role="error">
<sch:report test="(not(node()[1] instance of element())) and starts-with(., ' ')">Elements should not begin with a leading space.</sch:report>
<sch:report test="(not(descendant::node()[last()] instance of element()) and ends-with(., ' '))">Elements should not end with a trailing space.</sch:report>
</sch:rule>
</sch:pattern>
However, trailing-space reporting is still problematic. Elements such as <conbody> and <ul> are reported as having a trailing space, even though they contain no text of their own. It appears that Schematron is picking up the whitespace between child elements, such as the whitespace around the <p> elements in <conbody> and around the <li> elements in <ul>. Here's the sample topic I'm using to test:
<concept id="leadingTrailingSpace">
<title>Leading Space Trailing Space</title>
<shortdesc>Test of schematron rules for leading and trailing space identification.</shortdesc>
<conbody>
<p> This paragraph starts with a leading space and is handled correctly.</p>
<p>This paragraph ends with a trailing space and is handled correctly. </p>
<p><ph>This paragraph starts with a phrase element</ph> and continues with plain text and is handled correctly.</p>
<p>This paragraph starts with plain text and <ph>ends with a phrase element and is handled correctly.</ph></p>
<p>This paragraph does not begin or end with a space and is handled correctly.</p>
<ul>
<li>List Item #1</li>
<li>List Item #2</li>
<li>List Item #3</li>
</ul>
</conbody>
</concept>
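If the whitespace-only text nodes really are the cause, one direction I have not verified is to restrict the rule to elements that have at least one text child with non-whitespace content, and to test only the first and last child nodes. The following is a minimal sketch under that assumption. The pattern id is a placeholder, and the sketch assumes the schema uses queryBinding="xslt2", because ends-with() is an XPath 2.0 function.

```xml
<!-- Sketch only: assumes the whitespace-only text nodes between block
     elements cause the false reports. The id is a placeholder. -->
<sch:pattern id="LeadingOrTrailingSpaceSketch">
  <!-- Match only elements with at least one text child that is not
       whitespace-only, so containers such as conbody and ul are skipped. -->
  <sch:rule context="*[text()[normalize-space()]]" role="error">
    <!-- Report only when the first child node is a text node that starts with a space. -->
    <sch:report test="node()[1][self::text()][starts-with(., ' ')]">Elements should not begin with a leading space.</sch:report>
    <!-- Report only when the last child node is a text node that ends with a space. -->
    <sch:report test="node()[last()][self::text()][ends-with(., ' ')]">Elements should not end with a trailing space.</sch:report>
  </sch:rule>
</sch:pattern>
```

With the sample topic above, this should skip <conbody> and <ul>, since their only text children are whitespace. It would also skip a <p> whose first or last child is a <ph>, leaving the <ph> to be checked on its own.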
Does anyone have any ideas on how I can stop trailing-space reporting on non-textual parent elements?
Thanks so much!
Zak Binder
zak.binder@ukg.com