
[xsl] Validation XSLT using XSLT 1.0 and ARMY


Subject: [xsl] Validation XSLT using XSLT 1.0 and ARMY
From: "Dr. Frank Mabry" <fmabry@xxxxxxxxxx>
Date: Sat, 05 Jul 2008 00:23:26 -0400

Ganesh Babu said:
----------------------------
   Date: Thu, 3 Jul 2008 10:36:57 +0530
   To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
   From: "Ganesh Babu N" <nbabuganesh@xxxxxxxxx>
   Subject: Validation XSLT using XSLT 1.0
   Message-ID:
   <c075eed70807022206s1ad898b2t98cd70c31e5c366c@xxxxxxxxxxxxxx>

Dear All,

   I am writing a validation style sheet. I am stuck at the following point
   in my validation. Can anyone point me to a solution?

   1. I want to validate the @picfile value is equal to entity name and
   tiff images must be equal to the actual images file name on the image
   folder.

My question is how to get entity name and value in to a test.

   <!ENTITY I9780073379470_A_TB004 SYSTEM "9780073379470_A_TB004.tif"
   NDATA TIFF>

<graphic picfile="I9780073379470_A_TB004"/>

   2. In the XML file we are using named entities eg:- &nbsp; &acute;
   instead of &#123; or &#x123; in XSLT how to find if the XML file
   contains entities other than named entities?

   3. How to find non-ascii characters in the XML file and report an
   error using XSLT.

   4. How to find double enters in the XML file and report an error
   using XSLT.

   I was able to write around 45 validation points using XSLT but got
   stuck on these 4. Please help me resolve these issues.

   Regards,
   Ganesh
------------------------------

[Sometimes it seems that circumstances conspire to make us do
something we were about to do anyway.]

I believe that two products of research that are about to be announced
may be of interest in addressing these questions. More on this below and next week.


First, let me make a few comments about the concerns of the original author:
   1. I want to validate the @picfile value is equal to entity name and
   tiff images must be equal to the actual images file name on the image
   folder.

My question is how to get entity name and value in to a test.
I don't think that XSLT processors are intended to compare graphic image
content unless it has first been translated into an appropriate XML
vocabulary by some external utility. Don't forget that by the time an XSLT
transform operates on your data, the XML parser has already processed the
entity information.
   2. In the XML file we are using named entities eg:- &nbsp; &acute;
   instead of &#123; or &#x123; in XSLT how to find if the XML file
   contains entities other than named entities?
Again, the XML parser has already operated on any entity references before
the transform sees the data. Note: if the document is encoded in UTF-8, a
codepoint may have been properly encoded for transmission and subsequently
decoded by the parser into the appropriate codepoint. Also, many different
codepoints have effects similar to NBSP, and ACUTE is implemented in many
different scripts in Unicode.
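A small Python sketch can make this concrete. It parses four spellings of
the same content (a named entity, a decimal reference, a hexadecimal
reference, and the literal character encoded in UTF-8) and prints what the
application receives after parsing. The <p> element and the internal
declaration of nbsp are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring the named entity (illustrative).
DTD = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]>'

SPELLINGS = {
    "named entity": DTD + "<p>a&nbsp;b</p>",
    "decimal reference": "<p>a&#160;b</p>",
    "hex reference": "<p>a&#xA0;b</p>",
    "literal character": "<p>a\u00a0b</p>",
}


def parsed_text(xml_text):
    """Parse the document as UTF-8 bytes and return the root element's text."""
    return ET.fromstring(xml_text.encode("utf-8")).text


texts = {label: parsed_text(src) for label, src in SPELLINGS.items()}
for label, text in texts.items():
    print(f"{label:18} -> {text!r}")
```

All four lines show the same string containing U+00A0, so a transform
working on the parsed tree has nothing left to tell the spellings apart.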
   3. How to find non-ascii characters in the XML file and report an
   error using XSLT.
If you restrict the encoding declared in the XML document (for example, to
US-ASCII), the parser will complain appropriately for you and you don't
have to write any XSLT.
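The parser behaviour can be tried out with a few lines of Python. This is
a sketch that assumes US-ASCII as the restricted encoding and uses the
expat-based parser behind the standard library's ElementTree; the helper
name parses_as_ascii and the <p> wrapper are illustrative.

```python
import xml.etree.ElementTree as ET


def parses_as_ascii(body_bytes):
    """Return True if the parser accepts body_bytes inside a document
    whose XML declaration says encoding="US-ASCII"."""
    doc = (
        b'<?xml version="1.0" encoding="US-ASCII"?><p>'
        + body_bytes
        + b"</p>"
    )
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(parses_as_ascii(b"plain text"))
print(parses_as_ascii("caf\u00e9".encode("utf-8")))
```

The first call is accepted and the second is rejected, because the raw
bytes of the accented character are not valid US-ASCII. One caveat: a
numeric character reference such as &#233; consists only of ASCII bytes,
so it still parses under this declaration and would need a separate check.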
   4. How to find double enters in the XML file and report an error
   using XSLT.
Are you referring to one of the many Unicode codepoints that
resemble the symbol some people associate with enter? There is an
"ENTER SYMBOL" (U+2386). Some people have used a downwards
arrow with its tip pointing leftwards (such as occurs on the keyboard);
either U+2936 or U+21B2 would perhaps be feasible for this graphically.
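If those codepoints are what is meant, a scan for them is short. The
Python sketch below reports the offset, codepoint, and Unicode name of
each occurrence; the set ENTER_LIKE holds only the three codepoints
mentioned above, and the function name find_enter_like is an assumption
made for illustration.

```python
import unicodedata

# The three candidate codepoints mentioned in the reply.
ENTER_LIKE = {0x2386, 0x2936, 0x21B2}


def find_enter_like(text):
    """Return (offset, 'U+XXXX', name) for each enter-like character."""
    hits = []
    for offset, ch in enumerate(text):
        if ord(ch) in ENTER_LIKE:
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits


for hit in find_enter_like("line one\u2386\u2386line two"):
    print(hit)
```

Two hits at consecutive offsets would indicate a doubled symbol.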
---------------------------------------
My work over the past few years has dealt with concerns about the use of
Unicode in a variety of messaging environments. Articulating a
representation of the metafication yield for the codepoints (Articulated
Representation of the Metafication Yield: ARMY and ARMYXML)
requires care. Both ARMY and ARMYXML are XML vocabularies
that represent information about the use of Unicode in a UTF-8 document
(either a Unicode stream or an XML document that uses UTF-8
encoding). Because both are XML, XSLT, XQuery, XPath, and even
Schematron rule sets can analyze the result of a processor that produces
either representation. The processors (ARMY and ARMYXML) presently
support Unicode 5.0.


Frank
--
Dr. Frank Mabry, CISSP
Associate Professor and IT AIAD Coordinator
Dept. of EE&CS
U.S. Military Academy
West Point, New York, 10996

Work Phone: 845-938-2960
work email: frank.mabry@xxxxxxxx
home email: fmabry@xxxxxxxxxx

"The great use of life is to spend it for something that will outlast it."
- William James

