Schematron and tokenize
This should cover W3C XML Schema, Relax NG and DTD related problems.
- david_himself
- Posts: 45
- Joined: Mon Oct 01, 2018 7:29 pm
Schematron and tokenize
Post by david_himself »
Hi
I'm embedding Schematron rules in a schema for validating a TEI/XML personography. I have eight reports that constrain the attributes and values in the context of <relation>. Seven fire correctly and one never fires. For some rules I'm tokenizing the value of @mutual and looking at the individual tokens. The schema is an ODD file that produces Relax NG XML syntax.
This rule works, not looping through the tokens but relying on the fact that seq1 = seq2 is true if they have at least one element in common:
<sch:report test="@mutual and not(tokenize(@mutual) = concat('#',./ancestor::tei:person/@xml:id))" role="error">Value of @mutual must include <sch:value-of select="concat('#',./ancestor::tei:person/@xml:id)"/> as one of its targets.</sch:report>
This rule doesn't work, even though it seems to rely on the same point:
<sch:report test="@mutual and (tokenize(@mutual) = '\A[^#]\w+?\Z')" role="error">Targets in value of @mutual must all begin with '#'.</sch:report>
What's the difference? If my understanding of sequence equality is garbled and I do need to loop through the tokenized attribute, how should I approach it? Many thanks for any help.
David
- Radu
- Posts: 9544
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Schematron and tokenize
Hi David,
In the case that does not work, you seem to be testing equality between a sequence of strings and a regular expression. As far as I know, the only way to check whether a regular expression matches a string in XSLT is to use the "matches" XPath function, so I would not expect the XSLT processor to apply the match automatically by detecting that '\A[^#]\w+?\Z' is a regular expression rather than a plain string literal.
Regards,
Radu
Radu Coravu 
<oXygen/> XML Editor
http://www.oxygenxml.com
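To illustrate the point about "matches": the = operator compares every token with the right-hand string character for character, so a pattern has to be handed to matches() explicitly, one token at a time. The following is only a sketch under assumptions, not a tested part of the schema: it assumes an XPath 2.0 or later query binding, the sch and tei prefixes bound as in the rules above, and a rule context of tei:relation[@mutual] (hypothetical). It uses ^ as the anchor because XPath regular expressions do not support \A and \Z.

```xml
<sch:rule context="tei:relation[@mutual]">
  <!-- The predicate keeps only the tokens that do NOT start with '#';
       exists() turns that sequence into a boolean, so the report fires
       when at least one such token is present. -->
  <sch:report test="exists(tokenize(normalize-space(@mutual), ' ')[not(matches(., '^#'))])" role="error">Targets in value of @mutual must all begin with '#'.</sch:report>
</sch:rule>
```

With mutual="#p1 p2" the predicate keeps p2, so the report should fire; with mutual="#p1 #p2" it keeps nothing. The exists() wrapper matters because a bare sequence of two or more strings has no effective boolean value.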
- david_himself
- Posts: 45
- Joined: Mon Oct 01, 2018 7:29 pm
Re: Schematron and tokenize
Post by david_himself »
Thanks, Radu. I thought I'd tested with a non-regex string to eliminate the possibility of regex being the cause of failure, but perhaps I did that too hastily. I also tried fn:matches, but it cannot take a sequence as first argument.
Is there a simple way of testing that all space-delimited values of an attribute have '#' as their first character?
best
David
- Radu
- Posts: 9544
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Schematron and tokenize
Hi David,
You could maybe replace in the attribute value all strings which start with "#" with the empty string, then normalize the rest and check its length:
string-length(normalize-space(replace(@mutual, '#[^\s]*', '')))
I think the length should be 0 if all tokens started with "#".
Regards,
Radu
Radu Coravu 
<oXygen/> XML Editor
http://www.oxygenxml.com
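For anyone who would rather loop through the tokens, as asked earlier in the thread, a quantified expression is one possible alternative to the length check. This is a sketch under the same assumptions as before (XPath 2.0 or later query binding, sch and tei prefixes already bound, and a hypothetical rule context of tei:relation[@mutual]); it is not taken from the original schema.

```xml
<sch:rule context="tei:relation[@mutual]">
  <!-- Fires when some token lacks the leading '#';
       the message lists the offending tokens. -->
  <sch:report test="some $t in tokenize(normalize-space(@mutual), ' ') satisfies not(starts-with($t, '#'))" role="error">Targets in value of @mutual must all begin with '#'; found: <sch:value-of select="string-join(tokenize(normalize-space(@mutual), ' ')[not(starts-with(., '#'))], ', ')"/>.</sch:report>
</sch:rule>
```

starts-with() avoids regular expressions altogether, and the sch:value-of in the message makes it easier to see which target needs fixing.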
- david_himself
- Posts: 45
- Joined: Mon Oct 01, 2018 7:29 pm