locating a string with text() in Schematron

david_himself
Posts: 40
Joined: Mon Oct 01, 2018 7:29 pm

Post by david_himself »

In my ODD file (for Relax NG XML) I want a Schematron test to identify straight quotes in text in mixed content files. I've tried this:

<elementSpec ident="text" module="textstructure" mode="change">
<!-- Reminder about curly quotes in text -->
<constraintSpec ident="quotes" scheme="schematron">
<constraint>
<sch:report test=".//text()[matches(.,'&quot;')]" role="warning">Replace straight double quotes in text [but NOT in mark-up tags!] by &#8220;curly&#8221; quotes, &amp;#8220; for left and &amp;#8221; for right.
</sch:report>
</constraint>
</constraintSpec>
</elementSpec>

The rule does fire with the appropriate files, but it doesn't send the user to the actual line containing straight quotes. I suspect that needs a simple adjustment to my use of text() in XPath, but I haven't figured out what. Help appreciated.
david_himself
Posts: 40
Joined: Mon Oct 01, 2018 7:29 pm

Re: locating a string with text() in Schematron

Post by david_himself »

Just realised (Duh!) that I might be able to use the <lb/> at the start of every line. Should make the problem more tractable. Will try again with that in mind, and if that does it, apologies for having troubled you.
David
tavy
Posts: 364
Joined: Thu Jul 01, 2004 12:29 pm

Re: locating a string with text() in Schematron

Post by tavy »

Hello,

Thanks for your feedback.
When a problem is reported on a text node, the highlight is added on the parent or on the previous element. We have an issue in our issue tracker to add better support for locating text nodes. I added your comment to the issue and increased its priority. When this is solved, we will update this thread.

Best Regards,
Octavian
Octavian Nadolu
<oXygen/> XML Editor
http://www.oxygenxml.com
david_himself
Posts: 40
Joined: Mon Oct 01, 2018 7:29 pm

Re: locating a string with text() in Schematron

Post by david_himself »

Hi Octavian. Many thanks for your answer. As you suggest, highlighting the beginning of the line isn't perfect, but it would be good enough for now. I've moved several errors and warnings for particular text strings from the <text> context to <lb>. The (relatively unimportant) warning about straight quotes has the simplest regex and will serve as an example. I've tried several variants, e.g.

(1)
<constraintSpec ident="quotes" scheme="schematron">
<constraint>
<sch:report test="./following-sibling::text()[1][matches(.,'&quot;')]" role="warning">[message]
</sch:report>
</constraint>
</constraintSpec>

(2)
<constraintSpec ident="quotes" scheme="schematron">
<constraint>
<sch:report test="./following::text()[matches(.,'&quot;')]" role="warning">[message]
</sch:report>
</constraint>
</constraintSpec>

The <text> portion of the TEI/XML file that's being validated is illustrated in the following snippet:

<lb/>him <add place="above">in a whisper</add> to come out of <choice><abbr>ye.</abbr><expan>the</expan></choice> Room to ask these young <choice n="hyp"><orig>Gentle=
<lb break="no"/>=mens</orig><reg>Gentlemens<lb/></reg></choice> pardon, but he instead of <choice><orig>listning</orig><reg>listening</reg></choice> to me, like
<lb/>a true designing mean, sneaking Coward, cried

Report (1) fires correctly if the offending string occurs in a line with no other node after the <lb>, e.g. the last one in the snippet above, but it doesn't fire if the offending string occurs inside or after any other elements on the line. Report (2) picks up the string anywhere but issues multiple warnings, only the last of which is to the relevant line. What is needed is an XPath search which will examine ALL textual material between one <lb> and the next. How would I do that?

Thanks as ever.
David
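
A possible approach, offered as an untested sketch rather than a confirmed answer: from each <lb>, select only those following text nodes whose nearest preceding <lb> is that same <lb>. Variant (2) over-reports because following:: reaches from every <lb> to the end of the document, so every earlier <lb> also sees the offending string. The sketch assumes that the tei: prefix is bound to the TEI namespace (as is usual when an ODD is processed) and that an explicit sch:rule is acceptable inside <constraint>; the variable name "this" is only illustrative.

```xml
<elementSpec ident="lb" module="core" mode="change">
  <constraintSpec ident="quotes" scheme="schematron">
    <constraint>
      <!-- Explicit rule so that sch:let has a valid parent (assumption: your ODD toolchain accepts this) -->
      <sch:rule context="tei:lb">
        <!-- Remember the current <lb> so it can be compared inside the predicate -->
        <sch:let name="this" value="."/>
        <!-- Fire when a later text node contains a straight quote AND its nearest preceding <lb> is this one -->
        <sch:report test="following::text()[matches(., '&quot;')][preceding::tei:lb[1] is $this]" role="warning">[message]</sch:report>
      </sch:rule>
    </constraint>
  </constraintSpec>
</elementSpec>
```

The `is` node comparison needs XPath 2.0, which matches() already requires. Two caveats: inside <choice>, the text of both alternatives is attributed to whichever <lb> precedes it in document order, and for the last <lb> in the document the test extends to everything that follows it.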