
Multiple values in target

Posted: Thu Oct 15, 2020 11:23 am
by david_himself
I want Schematron to check that a target in a TEI/XML file corresponds to an xml:id in another file. My (possibly clumsy) report uses

test="not(document('../personography/HAMpersons.xml')//tei:TEI//tei:person/@xml:id = substring-after(./@ref,'#'))"

This works fine for the majority of cases, but obviously not for targets that contain more than one value, as in

<rs ref="../personography/HAMpersons.xml#EDP ../personography/HAMpersons.xml#SD">my Sisters</rs>

At present I disregard such cases with an additional argument in the test (and I note that ctrl-Click in Oxygen doesn't jump to anything when there are multiple values in ref/@target). How could I adapt the test to cover both single and multiple values? Thanks.

David

Re: Multiple values in target

Posted: Thu Oct 15, 2020 2:47 pm
by tavy
Hello David,

A solution can be to tokenize the value of the @ref attribute, and then use a "for" to iterate the values and extract the IDs. Something like this:

<sch:let name="refIds" value="for $id in tokenize(@ref, ' ') return substring-after($id, '#')"></sch:let>
<sch:assert test="document('../personography/HAMpersons.xml')//tei:TEI//tei:person/@xml:id = $refIds">
                Not all references are added: <sch:value-of select="$refIds"/>
</sch:assert>
Best Regards,
Octavian
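
To make the tokenizing step concrete, here is a minimal sketch of it on its own, applied to the two-valued ref from the first post. Two things are assumptions rather than part of the suggestion above: the rule context tei:rs[@ref], and the normalize-space() call, which is there only so that tabs or line breaks between values do not produce empty tokens. The report with test="true()" is a throwaway debugging aid that fires on every matching element.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://www.tei-c.org/ns/1.0" prefix="tei"/>
    <sch:pattern>
        <sch:rule context="tei:rs[@ref]">
            <!-- normalize-space() collapses whitespace runs to single spaces before splitting -->
            <sch:let name="refIds"
                     value="for $uri in tokenize(normalize-space(@ref), ' ') return substring-after($uri, '#')"/>
            <!-- For the sample rs in the first post, $refIds is the sequence ('EDP', 'SD') -->
            <sch:report test="true()">IDs found: <sch:value-of select="$refIds"/></sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Once the values look right, the debugging report can be swapped for a real test against the personography.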

Re: Multiple values in target

Posted: Thu Oct 15, 2020 3:02 pm
by david_himself
Thanks so much, Octavian. All my attempts to use a let statement and the tokenize fn had either not validated in the ODD or had thrown an error at a later stage. This solves my immediate problem, and the technique will be useful elsewhere too.
David

Re: Multiple values in target

Posted: Thu Oct 15, 2020 5:22 pm
by david_himself
Still a problem. The assertion you suggested works fine for persName/@ref or rs/@ref with a single value, but with multiple space-delimited values, it only fires if ALL are missing from the personography -- that is, the assertion is satisfied if ANY ONE of the values matches an xml:id. I suspect it's to do with the definition of equality between a node and a set, but I don't know how to ensure that each member of the tokenised set separately satisfies the equality.

D

Re: Multiple values in target

Posted: Fri Oct 16, 2020 11:49 am
by tavy
Hi David,

I changed the Schematron to check all references. I also collect all the IDs that are missing and print them in the report message.

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://www.tei-c.org/ns/1.0" prefix="tei"/>
    
    <sch:let name="personIds" value="document('../personography/HAMpersons.xml')/tei:TEI//tei:person/@xml:id"/>
    <sch:pattern>
        <sch:rule context="tei:rs">
            <sch:let name="refIds" value="for $id in tokenize(@ref, ' ') return substring-after($id, '#')"></sch:let>
            <sch:let name="missingIds" value="for $id in $refIds return (if($id = $personIds) then '' else $id)"/>
            
            <sch:report test="$missingIds != ''">
                The following ids "<sch:value-of select="$missingIds"/>" are not defined in "<sch:value-of select="$personIds"/>"
            </sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>
Best Regards,
Octavian
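
For background on why the earlier assert passed when only one value matched: an XPath general comparison such as $a = $b is true as soon as any one item of $a equals any one item of $b. A possible variant of the schema above, offered only as a sketch and not as Octavian's code, filters the tokens with a predicate so that each one is compared separately and no empty strings end up in the message. The variable names are carried over from the fragment above; tei:rs[@ref], normalize-space() and the message wording are my assumptions.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <sch:ns uri="http://www.tei-c.org/ns/1.0" prefix="tei"/>

    <!-- All xml:id values in the personography (path as given in the thread) -->
    <sch:let name="personIds" value="document('../personography/HAMpersons.xml')/tei:TEI//tei:person/@xml:id"/>
    <sch:pattern>
        <sch:rule context="tei:rs[@ref]">
            <sch:let name="refIds"
                     value="for $uri in tokenize(normalize-space(@ref), ' ') return substring-after($uri, '#')"/>
            <!-- Inside the predicate "." is a single token; keep it only if it equals none of the known ids -->
            <sch:let name="missingIds" value="$refIds[not(. = $personIds)]"/>
            <sch:assert test="empty($missingIds)">
                No person found with xml:id: <sch:value-of select="string-join($missingIds, ', ')"/>
            </sch:assert>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

If only a yes/no answer is needed, the same per-token check can be written as test="every $id in $refIds satisfies $id = $personIds".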

Re: Multiple values in target

Posted: Fri Oct 16, 2020 3:17 pm
by david_himself
Dear Octavian,

Thanks once again. I'm currently using a very long schema ODD and a short standalone Schematron file. I'm embarrassed to say that in my ignorance I haven't managed either to embed your new Schematron fragment in my ODD or to use it as a standalone file. I don't want to post a very long file here. Happy to send it off-list, or maybe this description will be enough for you to diagnose what's needed. Currently my schema ODD starts as follows:

<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:rng="http://relaxng.org/ns/structure/1.0" xml:lang="en" xmlns:sch="http://purl.oclc.org/dsdl/schematron">
<teiHeader>
[... conventional TEI header]
</teiHeader>
<text>
<body>
[...]
<schemaSpec ident="HAM_TEI_schema" start="TEI teiCorpus" prefix="h" docLang="en">
[etc., including a number of Schematron <report> and <assert>]

My working standalone Schematron file begins as follows:

<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"
xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
<ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
<pattern>
<rule context="tei:text//text()">
[etc.]

How should I adapt the fragment you wrote (and/or my existing ODD)?

best
David

Re: Multiple values in target

Posted: Fri Oct 16, 2020 4:05 pm
by tavy
I think you just need to copy the following fragment and paste it after a pattern from your current Schematron file.

<sch:let name="personIds" value="document('../personography/HAMpersons.xml')/tei:TEI//tei:person/@xml:id"/>
<sch:pattern>
        <sch:rule context="tei:rs">
            <sch:let name="refIds" value="for $id in tokenize(@ref, ' ') return substring-after($id, '#')"></sch:let>
            <sch:let name="missingIds" value="for $id in $refIds return (if($id = $personIds) then '' else $id)"/>
            
            <sch:report test="$missingIds != ''">
                The following ids "<sch:value-of select="$missingIds"/>" are not defined in "<sch:value-of select="$personIds"/>"
            </sch:report>
        </sch:rule>
</sch:pattern>
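
One detail that may matter when pasting: the standalone file quoted above binds the Schematron namespace as the default and declares no sch: prefix, so the fragment needs either xmlns:sch="http://purl.oclc.org/dsdl/schematron" added to the root element or the prefixes dropped. Below is a sketch of the second option, assuming the rest of the file is as quoted. I have also moved the personIds binding inside the rule, on the assumption that a strict validator may reject a schema-level let placed after a pattern, since the ISO Schematron schema lists let before pattern. The shortened message text is mine.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
    <ns prefix="tei" uri="http://www.tei-c.org/ns/1.0"/>
    <!-- existing patterns stay where they are; the new pattern is added alongside them -->
    <pattern>
        <rule context="tei:rs">
            <!-- xml:id values from the personography (path as given in the thread) -->
            <let name="personIds" value="document('../personography/HAMpersons.xml')/tei:TEI//tei:person/@xml:id"/>
            <let name="refIds" value="for $id in tokenize(@ref, ' ') return substring-after($id, '#')"/>
            <let name="missingIds" value="for $id in $refIds return (if ($id = $personIds) then '' else $id)"/>
            <report test="$missingIds != ''">
                The following ids "<value-of select="$missingIds"/>" are not defined in the personography.
            </report>
        </rule>
    </pattern>
</schema>
```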
 

Re: Multiple values in target

Posted: Fri Oct 16, 2020 4:10 pm
by david_himself
It's OK! Got it working as a standalone file. I am really grateful for your patience and help.

best
David