Schematron: start element with a capital

Post by pieterjan_vdw » Tue Apr 14, 2020 12:16 pm

Hi,

I am writing a rule to check capitalization at the beginning of a sentence.
To do this, I used some of the code found here: post57735.html?hilit=schematron#p57587

Code:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema queryBinding="xslt2" xmlns:sch="http://purl.oclc.org/dsdl/schematron" xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
	<sch:pattern abstract="true" id="starts-with-capital">
		<sch:rule context="$element" role="information">
			<sch:let name="firstNodeIsElement" value="node()[1] instance of element()"/>
			<sch:report test="(not($firstNodeIsElement) and (not(matches(text(), '^[A-Z|0-9]'))))">Start the element &lt;$element&gt; with a capital.</sch:report>
		</sch:rule>
	</sch:pattern>
	<sch:pattern is-a="starts-with-capital">
		<sch:param name="element" value="title"/>
	</sch:pattern>
	<sch:pattern is-a="starts-with-capital">
		<sch:param name="element" value="li"/>
	</sch:pattern>
</sch:schema>

This rule already works fine in most cases.
I only get an error when an element mixes text and child elements, as in the third <li> below:

Code:

		<ul id="ul_zrm_wc1_jlb">
			<li>This is the first item in the list.</li>
			<li>Second item in the list.</li>
			<li>3<sup>rd</sup> item in the list.</li>
		</ul>

Then I get the following error message: A sequence of more than one item is not allowed as the first argument of fn:matches() ("3", " item in the list.")

How should I change my rule to avoid this?

Thank you.

Post by tavy » Tue Apr 14, 2020 2:41 pm

Hello,

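The problem is that the first argument of the matches function must be a single item, not a sequence. If you specify "text()" as the first argument, it selects all the text node children of the current element, and an element with mixed content (like your third <li>) has more than one.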
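To make the error concrete, here is how the child nodes of the third <li> from the question break down. The comment is an annotation for illustration, not output from any tool:

```xml
<li>3<sup>rd</sup> item in the list.</li>
<!-- Children of this <li>, in document order:
       node()[1]  text node     "3"
       node()[2]  element node  <sup>rd</sup>
       node()[3]  text node     " item in the list."
     text() therefore selects a sequence of two items,
     ("3", " item in the list."), which is exactly the sequence
     quoted in the fn:matches() error message. -->
```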
You have two options:
1. Pass the whole string value of the current element (its own text plus the text of any descendants) by changing the test to something like this:

Code:

(not($firstNodeIsElement) and (not(matches(., '^[A-Z|0-9]'))))

2. Pass only the first text node child of the current element by changing the test to something like this:

Code:

(not($firstNodeIsElement) and (not(matches(text()[1], '^[A-Z|0-9]'))))
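
For reference, here is a minimal sketch of the full schema with the second option applied. The variable name firstText is my own choice for illustration. The sketch also drops the | from the character class, since inside [...] it is a literal pipe character rather than an alternation, so the original pattern would also accept an element that starts with "|". The unused sqf namespace declaration is left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema queryBinding="xslt2" xmlns:sch="http://purl.oclc.org/dsdl/schematron">
	<sch:pattern abstract="true" id="starts-with-capital">
		<sch:rule context="$element" role="information">
			<!-- true when the element starts with a child element rather than text -->
			<sch:let name="firstNodeIsElement" value="node()[1] instance of element()"/>
			<!-- text()[1] selects at most one node, so fn:matches() receives a single item;
			     string() turns an empty selection into '' (firstText is a hypothetical name) -->
			<sch:let name="firstText" value="string(text()[1])"/>
			<sch:report test="not($firstNodeIsElement) and not(matches($firstText, '^[A-Z0-9]'))">Start the element &lt;$element&gt; with a capital.</sch:report>
		</sch:rule>
	</sch:pattern>
	<sch:pattern is-a="starts-with-capital">
		<sch:param name="element" value="title"/>
	</sch:pattern>
	<sch:pattern is-a="starts-with-capital">
		<sch:param name="element" value="li"/>
	</sch:pattern>
</sch:schema>
```

With this version the third <li> should pass, because its first text node is "3". An element whose text starts with whitespace would still be reported; wrapping the value in normalize-space() is one way to relax that, if it suits your content.
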
Best Regards,
Octavian
Octavian Nadolu
<oXygen/> XML Editor
http://www.oxygenxml.com
