
The Schematron Assertion Language 1.6

2000-2002 (C) Rick Jelliffe
and Academia Sinica Computing Centre
2002-10-01
 

Abstract

The Schematron is a language system for specifying and declaring assertions about arbitrary patterns in XML documents, based on the presence or absence, names and values of elements and attributes along paths. Its target uses are for software engineering requiring document validation, for scholarly research over patterns in graph-structured data, for automatic creation of external markup, and to aid accessibility of documents for people with disabilities.
 

Schemas

For markup languages, a schema is a specification of interlocking constraints between information items  in a document.

Any schema paradigm imposes particular limitations on the constraints expressible by a schema. Thus SGML refers to its grammar declarations as 'content models'; there is a difference between (the expectations we have of) a schema versus a model: the former defines canonically or exhaustively, whilst the latter describes as best it can according to its schematic paradigm.

[SGML] and [XML] have provided a large-scale example of a design-by-contract [Meyer] distributed system: the DTD has acted as the language for specifying the invariants of a document during its life, but over-rideable (using parameter entities in the internal subset) to allow particular process-local invariants to be described. It is common practice in SGML shops to override and extend DTDs ad hoc, in order to better describe and verify (validate) the data at various stages. The ability to flexibly alter DTDs as required seems to be one distinguishing feature of successful SGML-based production.

Grammars


Conventionally, starting with SGML DTDs, schemas for markup languages are defined in terms of grammars to regulate element containment, lists to regulate attribute containment, augmented by datatype constraints on various information elements. In XML, these additional constraints are concerned with enabling graph structures to be represented, rather than describing the semantics or types of information elements.  In the late 1990s, many schema languages were developed for XML in anticipation of the development of the World Wide Web Consortium's [XML Schema] schema language. All these schema languages used the grammar-founded approach mentioned above, elaborating on them using objects [SOX], modules [RELAX], production selectors [Assertion Grammars], etc.

But these approaches have certain deficiencies. For a start, SGML used grammars as its schematic paradigm because one does indeed define grammars with it, down to lexical level: a full-featured SGML parser can parse many tagged-data languages regardless of the delimiters used. The grammar paradigm is not necessary for schemas for XML. Secondly, the grammar approach is not sufficient to express any constraints between information items in different branches of the attribute-value tree which forms the primary view of an XML document. The mechanisms for declaring unique identifiers and references do not alter this; mechanisms such as those of [XML Schema] and [RELAX] to introduce grammar non-terminals (termed the tag/type distinction) allow an element name to have a different datatype or content model depending on its parent; however, this merely allows the type of an element to be constrained by its parent's type as well as by its generic identifier (i.e., by its tag).

More generally, there is no reason to expect an arbitrary web of information (think of an Entity-Relation diagram) to conform itself to a simple tree structure. Consequently there is no reason to expect that the information constraining one structure or value will always be found in the same branch. (Indeed, schema languages themselves may in part be attempts to move non-branch-local information to a separate tree, to simplify the markup and structures.)

These deficiencies are nothing more than schema paradigm limitations. However there are other pragmatic and policy considerations that may make a grammar-based schema paradigm unattractive.

Considerations

The first is a cultural, or perhaps educational and linguistic, one. According to [deFrancis], Western written languages can be thought of as having the broad hierarchy 'character', 'word', 'phrase', 'sentence', while Chinese can be thought of as having the broad hierarchy 'character', 'idea', 'phrase', 'sentence'.  A Chinese character does more than a Western character, and represents ideas which are again at a slightly higher level than Western words (i.e., in non-agglutinating languages.)  Thus in Chinese, it is possible that one moves quickly from a non-grammatic level to a semantic level; this may be why Taiwanese students are anecdotally reported[1] to find the idea of formal grammars for natural languages to be laughably theoretical rather than practical.  At the least, we should not dismiss the possibility that using formal grammars as a schematic paradigm may be more easily acceptable to members of one language group or culture than another.

Secondly, if a schematic paradigm has language or cultural affinities, then different schema languages may be doubly difficult for people with cognitive impairments. I only need to take this point as far as stating the obvious: a complex schema paradigm may be more difficult than a simple one.  And the difficulty of a schematic paradigm may lie not just in its complexity to explain but also in its complexity to use and implement.

Third, when considering the needs of schemas for constraints on documents which are needed to support accessibility by disabled people, we come up against what I regard as a fundamental shortfall in existing schema languages: they are designed to support definitional schemas which intend to specify exhaustively or canonically the required constraints on a document. However, accessibility constraints are typically policy constraints imposed on a document in addition to those constraints required to define that document. Yet these constraints are fundamentally schematic: they relate to invariants about what elements, etc. can be used and where.

Fourthly, after admitting that there can be important non-definitional constraints on a document, the question arises of what other non-definitional constraints there can be.  The main one I identify is the requirements of workflow: that some constraints may only come into existence during some phase of a document's lifecycle. Without some notion of constraints that come into play during a phase, one must either weaken constraints on a schema, until the schema only contains the loosest unparameterized invariants, or arbitrarily switch schemas during the document's life cycle.

Fifthly, building on the notion that it is useful to be able to switch constraints in and out during formally defined phases of a document's life cycle, we can see that the ability to group and switch in and out constraints on an ad hoc basis during editing of a document would be useful.  It is a common difficulty with validating editors of structured documents that otiose errors are reported for documents that are under construction and incomplete.

Sixthly, the other side to reporting extraneous information is that a grammar-based schema language probably does not have any mechanism to explain the significance of particles and groups in its content model: if there is a repeating group of elements, surely it is more interesting to know why they are grouped and repeat rather than merely the bland fact that they are grouped and repeated. A group in a content model represents a kind of mandatorily omitted element, where the schema designer has decided, perhaps for pragmatic reasons of markup minimisation, not to allow the structure to be named as an element.

Seventhly, the previous issue raises the further question of how information is to be provided for human interaction with a schema system. In the case of grammars, it is possible to synthesize many useful error messages or diagnostic hints from a content model; however, grammar-based systems have seemed weak in helping sort out how to fix in-progress or utterly wrong markup.

Thus one consideration leads to the next, and the result can be considerable doubt that grammars provide the optimal schema paradigm for documents for the World Wide Web. These are some of the needs and considerations underpinning development of the Schematron assertion language.

Which is not, of course, to say that grammars are not quite useful when appropriate.

Uses

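The Schematron language has been developed with four main use-areas in mind, as listed in the abstract: document validation for software engineering, scholarly research over patterns in graph-structured data, automatic creation of external markup, and aiding accessibility of documents for people with disabilities. As part of the Schematron project, exemplar software to do these has been produced and is available on the WWW[2].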

A Schematron process does not augment the information set of the document per se. Instead, it is assumed to create an external document containing links or references (human or machine-readable) to the original document.

Assertions

A Schematron schema is made by specifying assertions, which are simple declarative sentences in natural language.

The <assert> element is used to tag positive assertions about a document. For example,

<assert>A 'dog' element should contain two 'ear' elements.</assert>

This assertion is something that is expected to be true of the document. If a document is validated against the schema, and the test for this assertion fails, an application can take some action. Schematron does not specify any actions: it only allows assertions to be tested, for the parts of assertions to be given roles, for the assertions to be grouped into rules, for the rules to be grouped into patterns, and for the patterns to be activated in various phases.

The <report> element is used to tag negative assertions about a document. For example,

<report>This dog has a bone.</report>

The test attribute on an <assert> or <report> element is an XPath expression evaluated to boolean: informally, XPath expressions are a simple expression language with functions on strings, numbers, booleans, document context, and nodes. The terms can be grouped using parentheses and |. Formally, they must match the production and semantics of production [14] Expr, s.3 Expressions in the [XPath] specification. (See Appendix D below for a combined listing of the various productions.)

<assert test="self::dog and count(child::ear) = 2"
>A 'dog' element should contain two 'ear' elements.</assert>
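
As a further sketch of what a test expression can look like, the following illustrative assertions use XPath's string, number and boolean functions. The petname attribute is borrowed from the diagnostics example later in this document; the 'leg' element is an assumption made purely for illustration. Note that a < character must be written as &lt; inside an attribute value.

```xml
<!-- Illustrative only: the 'leg' element is a hypothetical addition. -->
<assert test="string-length(normalize-space(@petname)) > 0"
>A 'dog' element should have a non-empty petname attribute.</assert>
<assert test="count(leg) &lt;= 4"
>A 'dog' element should contain no more than four 'leg' elements.</assert>
<report test="not(ear) and not(bone)"
>This dog has neither ears nor a bone.</report>
```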

Within these two elements, it is possible to use a <name> element, which gives the specific name of the context element for which the assert statement failed or the report statement succeeded. The <name> element can also have an attribute path in which an [XPath] expression can be given; this allows the name of an element or attribute different to the context element to be specified. Because some implementations of Schematron may format these names differently, an element <emph> is also allowed for better formatting; its only use is to allow names of elements or attributes specified in assertions to have the same format as those provided by evaluating the <name> element. The <span> element is also allowed, with the same meaning as in HTML.

<assert test="count(child::ear) = 2"
>A <name/> element should contain two <emph>ear</emph> elements.</assert>

Note that there is an abbreviated syntax possible for use in the test attribute. So the following example is equivalent to the previous one:

<assert test="count(ear) = 2"
>A <name/> element should contain two <emph>ear</emph> elements.</assert>

For internationalization, the element <dir> can be used inside these two elements to support bidirectional written languages; the semantics are those of the dir element of [HTML].  The elements may also have an xml:lang attribute for tagging the written language of the contents of the element; the xml:lang attribute does not express the language of the target document.

For better formatting of assertion reports, these two elements may also have an icon attribute, which is the [URL] of a small image that may provide visual clues to a user.

These two elements can also have a subject attribute. This is an [XPath] path which allows very direct specification of the subject of the assertion: this may be useful information for automatically generating [RDF] documents.
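
The presentation-oriented attributes described above might be combined as in the following sketch. The icon URL is a made-up placeholder, and the choice of 'ear' as the subject path is an assumption for illustration only.

```xml
<!-- Illustrative only: the icon URL below is a placeholder, not a real resource. -->
<assert test="count(ear) = 2"
        subject="ear"
        icon="http://www.example.org/icons/ear.png"
        xml:lang="en"
>A <name/> element should contain two <emph>ear</emph> elements.</assert>
```

Here xml:lang describes the language of the assertion text itself, not of the document being validated.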

There is no prescribed order in which assertions must be checked. (By default, most implementations will probably check assertions in the same order they appear in the document.)

In the particular case of Schematron schemas which need to be very terse, and which are intended for a yes/no validation result, the natural language assertions may be omitted.
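
A terse rule of this kind might look like the following sketch, which reuses this document's dog example with the natural language text left out.

```xml
<rule context="dog">
   <!-- No assertion text: only a pass/fail outcome is of interest. -->
   <assert test="count(ear) = 2"/>
   <report test="bone"/>
</rule>
```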

Rules

<assert> and <report> elements are grouped inside <rule> elements. The <rule> element has a context attribute which contains an [XPath] expression. Every element in the document for which this path expression evaluates to true is then used as the context to test the assertions. An assertion is tested by evaluating the [XPath] expression declared in the test attribute of the <assert> or <report> element.

The full declarations for the assertions above are

<rule context="dog">
   <assert test="count(ear) = 2"
   >A 'dog' element should contain two 'ear' elements.</assert>
   <report test="bone"
   >This dog has a bone.</report>
</rule>

The context attribute on a <rule> element is an XSLT pattern: informally, this allows XPath path expressions to be combined as alternatives (using the | operator). Formally, they must match the production and semantics of production [1] Pattern, s.5.2 Patterns in the XSLT specification. (See Appendix D below for a combined listing of the various productions.) The functions available include those in XSLT s.12 Additional Functions. These can be extended using the methods in XSLT s.14.2 Extension Functions; the function-available() function should be used before any extension function is called to allow some graceful behaviour on systems which do not support the functions. (It is anticipated that some more business-oriented functions may be developed at some later stage.)
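
The guarded use of an extension function could be sketched as follows. The 'ext' prefix, its namespace URI, the ext:is-known-breed() function and the breed attribute are all hypothetical; they are not part of Schematron, XPath or XSLT. The sketch relies on the XPath rule that the right operand of 'or' is not evaluated when the left operand is true.

```xml
<!-- Illustrative only: 'ext', its URI and ext:is-known-breed() are invented
     for this sketch. -->
<ns prefix="ext" uri="http://www.example.org/schematron-extensions" />
...
<rule context="dog">
   <assert test="not(function-available('ext:is-known-breed'))
                 or ext:is-known-breed(@breed)"
   >A 'dog' element should have a recognised breed attribute.</assert>
</rule>
```

On a system without the extension function, the assertion simply passes rather than causing an error; whether that silent pass is acceptable is a design choice for the schema writer.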

These three elements are the operational core of Schematron. [XPath] expressions allow a very wide range of constraints to be expressed: based on element and attribute names, based on their position and occurrence, based on text values, and based on counts.  In the example, the context is every element with a generic identifier 'dog': the test in the <assert> element counts the number of child elements with the generic identifier 'ear'.  Neither assertion in this rule will fail for the following XML document:

<dog><ear/><ear/></dog>

The context attribute is an [XPath] path as extended by [XSLT] patterns, allowing alternatives to be given with '|', for example. The test attributes are [XPath] expressions, which allow logical operators such as 'and' and 'or' as well as the node-set union operator '|'.

The <rule>, <assert> and <report> elements can each have a role attribute. This is an identifier within the schema to identify the role that is played. Schematron 1.6 does not pre-define any roles; the ns attribute on the <schema> element can be used to specify some URL to which this controlled vocabulary belongs. These elements can also have id attributes.

<rule context="dog" role="animal" id="doggy">
   <assert test="count(ear) = 2" role="internalProperty"
   >A 'dog' element should contain two 'ear' elements.</assert>
   <report test="bone" role="externalProperty"
   >This dog has a bone.</report>
</rule>

This jointed-leg path system is reminiscent of SQL queries: one could consider a query SELECT x FROM y WHERE z IS a to be a context statement (i.e., 'WHERE z IS a') and a test (i.e., 'x FROM y').

A <rule> element can also contain <key> elements, which allow [XSLT]'s key mechanism to be used. This allows various kinds of testing of reference constraints; it is more powerful than the [XML] ID/IDREF mechanism. The path attribute is an [XPath] path; the name attribute is a token naming the key. The icon attribute allows specification of an icon.
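
A reference constraint using a key might be sketched as below. This assumes, as an XSLT-based implementation plausibly would, that the key indexes the nodes matched by the enclosing rule's context and uses the path attribute as the key value; the text above does not spell this out. The 'owner' element and its 'pet' attribute are hypothetical.

```xml
<!-- Assumption: the key indexes nodes matched by the enclosing rule's
     context, keyed by the value selected by path. -->
<rule context="dog">
   <key name="dogsByName" path="@petname" />
   <assert test="@petname"
   >A 'dog' element should have a petname attribute.</assert>
</rule>
<rule context="owner">
   <!-- key() is the XSLT function; a non-empty result counts as true. -->
   <assert test="key('dogsByName', @pet)"
   >The pet attribute of an 'owner' element should match the petname of some 'dog' element.</assert>
</rule>
```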

An important feature to note is that, because of [XSLT]'s document() function, a Schematron assertion test can refer to data in a different document from the context document. This allows Schematron schemas to be used for two important uses: to validate against a controlled vocabulary located externally to the schema (indeed, this can be in any XML document type, not just using a Schematron schema), and to validate the output of some program's function against data found in its input (or vice versa) as a form of black-box testing.
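
Validation against an external controlled vocabulary could look like the following sketch. The file name breeds.xml, its breeds/breed/@name structure and the breed attribute on 'dog' are all assumptions made for illustration.

```xml
<!-- Illustrative only: 'breeds.xml' and its structure are hypothetical. -->
<rule context="dog">
   <assert test="@breed = document('breeds.xml')/breeds/breed/@name"
   >A 'dog' element should have a breed attribute whose value is listed in the controlled vocabulary.</assert>
</rule>
```

The comparison is true if the breed attribute equals any one of the name attributes in the external document. How a relative URI such as 'breeds.xml' is resolved is likely to depend on the implementation.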

It is also useful to note that Schematron lends itself to analysis of information sets using cohesion and coupling ideas [Constantine]. The coupling of one information item to another often is not symmetrical: DTDs force all coupling constraints to be expressed in terms of the parent to the child, whereas some coupling may be better expressed from child to parent. This is a typical way of specifying optional elements in a Schematron schema.
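
A child-to-parent constraint of this kind might be sketched as follows, using this document's dog and bone elements. Nothing here requires a 'dog' to have a 'bone', so the element remains optional; the rule only constrains where a 'bone' may appear.

```xml
<rule context="bone">
   <!-- The constraint is stated from the child's point of view. -->
   <assert test="parent::dog"
   >A 'bone' element should only appear as a child of a 'dog' element.</assert>
</rule>
```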

A simple macro mechanism is allowed on rules. A <rule> element can have one or more <extends> elements. These have a rule attribute, which is the identifier of an abstract rule. This allows you to bring the assertions of that abstract rule into the current rule. An abstract rule is specified with an abstract attribute with a value "true".  An abstract rule element cannot have a context attribute. (This use of "extends", where W3C XML Schema uses "restricts", is the appropriate terminology from rule-based systems [WASH].)

As an example, this constraint can be specified as follows (in [XPath] paths):

<rule context="sch:rule">
   <assert test="not(attribute::abstract='true') or not(attribute::context)"
   >An abstract rule cannot have a context attribute.</assert>
   <assert test="not(attribute::abstract='false') or attribute::context"
   >A rule should have a context attribute (except for abstract rules.)</assert>
   <assert test="attribute::abstract or attribute::context"
   >A rule should have a context attribute (except for abstract rules.)</assert>
   <report test="attribute::abstract and not(attribute::abstract='true') and not(attribute::abstract='false')"
   >In a rule, the abstract attribute is optional, and can have values 'true' or 'false'</report>
</rule>

Note in this example that Schematron schemas are very specific. It is quite probable that a simpler schema would be just as effective, or the various assertions could be combined into a larger test with a more general statement.

One abstract rule can extend another.  XML Entities can also be used for various macro effects, as desired.
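
The macro mechanism described above can be sketched as follows. The identifier 'hasEars' and the 'cat' element are hypothetical names chosen for illustration.

```xml
<!-- An abstract rule has an id but no context of its own. -->
<rule abstract="true" id="hasEars">
   <assert test="count(ear) = 2"
   >A <name/> element should contain two <emph>ear</emph> elements.</assert>
</rule>

<!-- Concrete rules supply the context and pull in the shared assertion. -->
<rule context="dog">
   <extends rule="hasEars" />
   <report test="bone"
   >This dog has a bone.</report>
</rule>
<rule context="cat">
   <extends rule="hasEars" />
</rule>
```

Because the shared assertion uses <name/> rather than a literal element name, its text reads correctly for whichever context element it is applied to.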

There is no prescribed order in which rules must be checked. (By default, most implementations will probably check rules in the same order they appear in the document.) Note, however, that if rules are checked in a different order, they must still implement the order-dependent semantic that each context attribute is a shortened form of the full context: the effective context is formed by first excluding every node matched by any previous context in the same pattern; only nodes which pass that sieve are tested by the rule.

Patterns

Rules are grouped into <pattern> elements.  A pattern is a grouping of rules. An element will only be used as the context of one rule per pattern; the first rule in lexical order for which a context matches will be used.
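
The first-match behaviour can be illustrated with a sketch like the following, which reuses the exceptional and petname attributes from the diagnostics example later in this document. The pattern name is an arbitrary choice.

```xml
<pattern name="Dog structure">
   <!-- Exceptional dogs are matched here first... -->
   <rule context="dog[@exceptional='true']">
      <assert test="@petname"
      >An exceptional 'dog' element should have a petname attribute.</assert>
   </rule>
   <!-- ...so this rule is only applied to the remaining dogs. -->
   <rule context="dog">
      <assert test="count(ear) = 2"
      >A 'dog' element should contain two 'ear' elements.</assert>
   </rule>
</pattern>
```

If both checks should apply to exceptional dogs as well, the two rules would need to be placed in separate patterns.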

Pattern elements have various attributes. The name attribute allows specification of a simple human-readable string to identify the pattern. The id attribute allows a unique identifier to be assigned for reference purposes. The fpi attribute allows an [SGML] Formal Public Identifier to be attached. The see attribute allows a [URL] to be specified which gives some human-readable documentation for the pattern; a hypertext presentation of the schema results can link to that resource.

A pattern is the nearest equivalent in Schematron to a type; except that Schematron is concerned with providing as direct as possible specification of the relationships between information items rather than trying to fit them into an abstract mold such as type. Which is not, of course, to say that notions of type are not quite useful where appropriate.

A <pattern> element can have an icon attribute.

There is no prescribed order in which patterns must be checked. (By default, most implementations will probably check patterns in the same order they appear in the document, and schema-writers may put important patterns before less important patterns in the document to present the most useful information to the user.) Note, however, that if the pattern is not activated in the current phase, it should not be checked.

Schema

The top-level element of a Schematron schema is <schema>. A schema element should have a <title> sub-element.

Typically the schema will be declared using XML [Namespace] conventions. The preferred prefix is sch and the appropriate namespace URI is

http://www.ascc.net/xml/schematron

Thus a complete Schematron schema document is as follows:

<?xml version="1.0" encoding="US-ASCII"?>
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
 <sch:title>Example Schematron Schema</sch:title>
 <!-- The name value is illustrative; the DTD in Appendix A requires the attribute. -->
 <sch:pattern name="Dog structure">
  <sch:rule context="dog">
   <sch:assert test="count(ear) = 2"
   >A 'dog' element should contain two 'ear' elements.</sch:assert>
   <sch:report test="bone"
   >This dog has a bone.</sch:report>
  </sch:rule>
 </sch:pattern>
</sch:schema>

The <schema> element can have a ns attribute which gives the namespace URI that role attributes will have, if the role is used to externally mark up the target document.

The <schema> element also allows explicit declaration of namespace prefixes and URLs that are used in the schema, using <ns> subelements. The usual XML [Namespaces] mechanism can be used; however, the prefix and URL data are then not available for diagnostic reporting or application processing. Furthermore, some implementations may require that the information is made available in that form. For example:

<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:title>Screen-scraper for XHTML data</sch:title>
  <sch:ns prefix="xhtml" uri="http://www.w3.org/1999/xhtml" />
...

A <schema> can have an icon attribute.  It can also contain <p> elements, allowing some modest end-user-oriented documentation to be given: this allows the user to know what kind of validation or constraints the schema specifies, to aid them in interpreting any results usefully. The <p> element can have an icon attribute.
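
End-user documentation of this kind might be sketched as follows. The wording of the paragraph and the pattern name are illustrative only.

```xml
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:title>Example Schematron Schema</sch:title>
  <!-- Free-text guidance for the person reading the validation results. -->
  <sch:p>This schema checks that each 'dog' element has the expected parts.
  It says nothing about whether the data about each dog is accurate.</sch:p>
  <sch:pattern name="Dog structure">
...
```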

Phases

Workflow and dynamic schemas are supported through the phase mechanism. The <schema> element can contain <phase> elements. Each <phase> must have an id attribute giving a unique identifier; it can have an icon attribute; it can have an fpi attribute to give a persistent identifier for the phase, because a phase may correspond to a DTD which had an FPI (note that the FPI is for the phase, not for the current schema per se.)

The <phase> element has <active> subelements, each of which provides the identifier of a <pattern> in its pattern attribute.

<phase id="basicValidation">
  <active pattern="text" />
  <active pattern="tables" />
  <active pattern="attributePresence" />
</phase>
<phase id="fullValidation">
  <active pattern="metadata" />
  <active pattern="text" />
  <active pattern="tables" />
  <active pattern="attributePresence" />
  <active pattern="attributeValueChecks" />
</phase>

By default, all patterns in a document are active. However, an application may provide a way to allow the user to select the phase to be used: for example, a command line option when invoked from the command line, a preferences dialog box in a GUI, or a parameter on the function invocation when called as a precondition-checker in a programming language (such as C's assert(), from which Schematron's assert takes its name, or the pre and post-condition statements in Eiffel.)
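
To show how phases and patterns fit together, here is a small self-contained sketch. The pattern and phase identifiers are illustrative. The defaultPhase attribute is declared in the DTD in Appendix A but is not described in the text above, so its use here to name the phase applied when the user selects none is an assumption.

```xml
<?xml version="1.0" encoding="US-ASCII"?>
<!-- Assumption: defaultPhase names the phase used when none is selected. -->
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron"
            defaultPhase="basicValidation">
  <sch:title>Phased dog validation</sch:title>
  <sch:phase id="basicValidation">
    <sch:active pattern="structure" />
  </sch:phase>
  <sch:phase id="fullValidation">
    <sch:active pattern="structure" />
    <sch:active pattern="naming" />
  </sch:phase>
  <!-- Each pattern carries the id that the phases refer to. -->
  <sch:pattern name="Structure" id="structure">
    <sch:rule context="dog">
      <sch:assert test="count(ear) = 2"
      >A 'dog' element should contain two 'ear' elements.</sch:assert>
    </sch:rule>
  </sch:pattern>
  <sch:pattern name="Naming" id="naming">
    <sch:rule context="dog">
      <sch:assert test="@petname"
      >A 'dog' element should have a petname attribute.</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

In the basicValidation phase only the structural check runs; the naming check is added in the fullValidation phase.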

Diagnostics

Users have reported that a common use of Schematron schemas is to allow specific diagnostics to be given. However, it is desirable to keep <assert> and <report> statements as general assertions rather than diagnostic messages. To support this, the <assert> and <report> elements have a diagnostics attribute which is a reference to one or more <diagnostic> elements. These are allowed in a <diagnostics> section at the end of the document. The value of the diagnostics attribute can be a list of references to <diagnostic> elements.

A <diagnostic> element is general text. It can trivially be converted into HTML. It can contain <dir> (for bidirectional languages), <span> and <emph> subelements with the same meanings as HTML. It must have an id attribute to allow references to it.  The <diagnostic> element can have <value-of> sub-elements, which have the same semantics as in [XSLT]. These allow insertion of value information as well as name details. A <diagnostic> element can have an icon attribute.

<rule context="dog" >
   <assert test="nose or @exceptional='true'" diagnostics="d1 d2"
   >A dog should have a nose.</assert>
</rule>
...
<diagnostics>
 <diagnostic id="d1"
 >Your dog <value-of select="@petname" /> has no nose. 
 How does he smell? Terrible. Give him a nose element,
 putting it after the <name path="child::*[2]"/> element.
 </diagnostic>
 <diagnostic id="d2"
 >Animals such as <name/> usually come with a full complement
 of legs, ears and noses, etc. However, exceptions are possible.
 If your dog is exceptional, provide an attribute
 <emph>exceptional='true'</emph>
 </diagnostic>
</diagnostics>

There is no prescribed order in which diagnostics must be given. (By default, most implementations will probably give diagnostics in the same order they appear in the document, and schema-writers may try to use this by putting more likely diagnostics before unlikely ones.)

Documentation

Because of the emphasis that the natural language text of an assert or report element is the definition of an assertion, with the tests being models of the assertion, even undocumented Schematron schemas should be more comprehensible than schemas in other schema languages. The documentation features are designed to extend this, and in particular to be able to generate pleasant print or hypertext versions of a schema.

A <p> element is general text. It can be trivially converted into HTML. It can contain <dir> (for bidirectional languages), <span> and <emph> subelements with the same meanings as HTML. It can have an id attribute to allow references to it.  The optional class attribute is provided to help generation of styled HTML. The xml:lang attribute can be used to specify the language of the paragraph. A <p> element can have an icon attribute, which is not an HTML attribute.

Schematron-Like Systems

Other useful validation languages can be built by merely using the Schematron framework and substituting other query languages. For a language to be Schematron-like it must be

A Schematron-like assertion language does not require backtracking or a theoretically complex implementation. Other higher and lower layers may be added, for instance the phase mechanism (which could in turn be generalized into a finite state machine.) However, schema language designers should note that there seem to be good usability reasons to stick with a fairly fixed hierarchy, rather than, say, adding an extra leg between context and the assertions: for a start, it means that Schematron schemas can be entered using simple forms.

Another distinctive feature that may make a language Schematron-like is the decision to partition the query components to a separate, embedded query language. This seems to have helped the readability, comprehensibility and learnability of the language; it adds a terseness which makes hand-editing of schemas completely viable.

Schematron lends itself to being embedded in other schema languages. In such a use, an extractor program (perhaps an XSLT stylesheet) typically extracts and creates the separate Schematron schema. Even though only fragments of Schematron are being used, such as just the assert elements, it is still appropriate to use the Schematron namespace.
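
As one possible illustration of embedding, the sketch below places a Schematron pattern inside a W3C XML Schema annotation. The choice of host language and of xs:appinfo as the carrier is an assumption for this example; the text above does not prescribe any particular host. An extractor program would collect the sch: elements into a separate Schematron schema.

```xml
<!-- Illustrative only: a W3C XML Schema document is assumed as the host. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sch="http://www.ascc.net/xml/schematron">
  <xs:element name="dog">
    <xs:annotation>
      <xs:appinfo>
        <!-- The embedded fragment keeps the Schematron namespace. -->
        <sch:pattern name="Dog structure">
          <sch:rule context="dog">
            <sch:assert test="count(ear) = 2"
            >A 'dog' element should contain two 'ear' elements.</sch:assert>
          </sch:rule>
        </sch:pattern>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ear" maxOccurs="unbounded" />
        <xs:element name="bone" minOccurs="0" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```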

Related Material

For a relevant discussion on the role of primacy of natural language descriptions over formal descriptions and the nature of declarative specifications, which are surely applicable to schemas, see [LeCharlier]. Note their comments that a specification should always have 1) a statement indicating purpose, and 2) a list of representation conventions that must be satisfied.

Schematron can be considered a front-end for specifying the targets of a transformation system (see [CIP].) Indeed, Schematron also may be considered to split the front-end into a rule-based framework (see [Schemarama] for an implementation of this) and a query language (in Schematron's case, XPath.)

The element name assert was chosen for familiarity to C programmers from the C assert(). See ???

The XLinkIt language is a similar assertion language to Schematron, but invented independently and with different usage goals and design rationales. See [Finkelstein]. Note the existence of the Patent ???? which relates to generating links between a schema document and an instance for consistency checking using a rules-based system.

A phase can be regarded as a state in a Finite State Machine (see [Etessami] for a comparison of rules-based and state-machine-based approaches.)

Appendix A: XML DTD For Schematron 1.6

The following are markup declarations for the Schematron assertion language. For clarity, this version uses the default namespace; it is inadvisable to use the default namespace in practice, because some Schematron implementations may apply that default namespace to the target document, to unqualified names.  Note that, providing the defaulting noted is followed and except for ID purposes, the Schematron DTD does not make infoset contributions and should not be required.
<!-- +//IDN sinica.edu.tw//DTD Schematron 1.5//EN -->
<!-- http://www.ascc.net/xml/schematron/schematron1-5.dtd -->
<!-- version of 2002/08/16 -->
<!-- All names are given indirectly, to allow explicit use of a namespace prefix
       if desired.  In that case, in the internal subset of the doctype declaration,
       define <!ENTITY % sp "sch:" >
-->
<!ENTITY % sp "">
<!ENTITY % schema "%sp;schema">
<!ENTITY % active "%sp;active">
<!ENTITY % assert "%sp;assert">
<!ENTITY % dir "%sp;dir">
<!ENTITY % emph "%sp;emph">
<!ENTITY % extends "%sp;extends">
<!ENTITY % diagnostic "%sp;diagnostic">
<!ENTITY % diagnostics "%sp;diagnostics">
<!ENTITY % key "%sp;key">
<!ENTITY % name "%sp;name">
<!ENTITY % ns "%sp;ns">
<!ENTITY % p "%sp;p">
<!ENTITY % pattern "%sp;pattern">
<!ENTITY % phase "%sp;phase">
<!ENTITY % report "%sp;report">
<!ENTITY % rule "%sp;rule">
<!ENTITY % span "%sp;span">
<!ENTITY % title "%sp;title">
<!ENTITY % value-of "%sp;value-of">
<!-- Data types -->
<!ENTITY % URI "CDATA">
<!ENTITY % PATH "CDATA">
<!ENTITY % EXPR "CDATA">
<!ENTITY % FPI "CDATA">
<!-- Element declarations -->
<!ELEMENT %schema; ((%title;)?, (%ns;)*, (%p;)*, (%phase;)*, (%pattern;)+, (%p;)*, (%diagnostics;)?)>
<!ELEMENT %active; (#PCDATA | %dir; | %emph; | %span;)*>
<!ELEMENT %assert; (#PCDATA | %name; | %emph; | %dir; | %span;)*>
<!ELEMENT %dir; (#PCDATA)>
<!ELEMENT %emph; (#PCDATA)>
<!ELEMENT %extends; EMPTY>
<!ELEMENT %diagnostic; (#PCDATA | %value-of; | %emph; | %dir; | %span;)*>
<!ELEMENT %diagnostics; (%diagnostic;)*>
<!ELEMENT %key; EMPTY>
<!ELEMENT %name; EMPTY>
<!ELEMENT %ns; EMPTY>
<!ELEMENT %p; (#PCDATA | %dir; | %emph; | %span;)*>
<!ELEMENT %pattern; ((%p;)*, (%rule;)*)>
<!ELEMENT %phase; ((%p;)*, (%active;)*)>
<!ELEMENT %report; (#PCDATA | %name; | %emph; | %dir; | %span;)*>
<!ELEMENT %rule; (%assert; | %report; | %key; | %extends;)+>
<!ELEMENT %span; (#PCDATA)>
<!ELEMENT %title; (#PCDATA | %dir;)*>
<!ELEMENT %value-of; EMPTY>
<!-- Attribute declarations -->
<!ATTLIST %schema;
xmlns %URI; #FIXED  "http://www.ascc.net/xml/schematron"
xmlns:sch %URI; #FIXED "http://www.ascc.net/xml/schematron"
xmlns:xsi %URI; #FIXED "http://www.w3.org/2000/10/XMLSchema-instance"
xsi:schemaLocation %URI;  "http://www.ascc.net/xml/schematron
          http://www.ascc.net/xml/schematron/schematron.xsd"
id ID #IMPLIED
fpi %FPI; #IMPLIED
schemaVersion CDATA #IMPLIED
defaultPhase IDREF #IMPLIED
icon %URI; #IMPLIED
version CDATA "1.6"
xml:lang NMTOKEN #IMPLIED
>
<!ATTLIST %active;
pattern IDREF #REQUIRED
>
<!ATTLIST %assert;
test %EXPR; #REQUIRED
role NMTOKEN #IMPLIED
id ID #IMPLIED
diagnostics IDREFS #IMPLIED
icon %URI; #IMPLIED
subject %PATH; #IMPLIED
xml:lang NMTOKEN #IMPLIED
>
<!ATTLIST %dir;
value (ltr | rtl) #IMPLIED
>
<!ATTLIST %extends;
rule IDREF #REQUIRED
>
<!ATTLIST %diagnostic;
id ID #REQUIRED
icon %URI; #IMPLIED
xml:lang NMTOKEN #IMPLIED
>
<!ATTLIST %key;
name NMTOKEN #REQUIRED
path %PATH; #REQUIRED
icon %URI; #IMPLIED
>
<!ATTLIST %name;
path %PATH; "."
>
<!-- Schematrons should implement '.' 
               as the default value for path in sch:name -->
<!ATTLIST %p;
xml:lang CDATA #IMPLIED
id ID #IMPLIED
class CDATA #IMPLIED
icon %URI; #IMPLIED
>
<!ATTLIST %pattern;
name CDATA #REQUIRED
see %URI; #IMPLIED
id ID #IMPLIED
icon %URI; #IMPLIED
>
<!ATTLIST %ns;
uri %URI; #REQUIRED
prefix NMTOKEN #IMPLIED
>
<!ATTLIST %phase;
id ID #REQUIRED
fpi %FPI; #IMPLIED
icon %URI; #IMPLIED
>
<!ATTLIST %span;
class CDATA #IMPLIED
>
<!ATTLIST %report;
test %EXPR; #REQUIRED
role NMTOKEN #IMPLIED
id ID #IMPLIED
diagnostics IDREFS #IMPLIED
icon %URI; #IMPLIED
subject %PATH; #IMPLIED
xml:lang NMTOKEN #IMPLIED
>
<!ATTLIST %rule;
context %PATH; #IMPLIED
abstract (true | false) "false"
role NMTOKEN #IMPLIED
id ID #IMPLIED
>
<!-- Schematrons should implement 'false' as the default
                  value of abstract -->
<!ATTLIST %value-of;
select %PATH; #REQUIRED
>
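
As a sketch of how the indirect names in this DTD are meant to be used, the following hypothetical schema overrides the sp parameter entity in its internal subset, so that every Schematron element name takes the sch: prefix. The invoice and line element names, the number and total attributes, and the ids are invented for illustration; only the DTD location and the namespace URI are taken from the declarations above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE sch:schema SYSTEM "http://www.ascc.net/xml/schematron/schematron1-5.dtd" [
  <!-- Declared before the external subset is read, so this value of sp wins
       over the empty default in the DTD -->
  <!ENTITY % sp "sch:">
]>
<sch:schema xmlns:sch="http://www.ascc.net/xml/schematron">
  <sch:title>Hypothetical invoice checks</sch:title>
  <sch:pattern name="Invoice totals" id="totals">
    <sch:rule context="invoice">
      <!-- assert: the message is given when the test is false -->
      <sch:assert test="count(line) &gt;= 1" diagnostics="no-lines"
      >An <sch:name/> element should contain at least one line element.</sch:assert>
      <!-- report: the message is given when the test is true -->
      <sch:report test="@total &lt; 0"
      >The total of an <sch:name/> element is negative.</sch:report>
    </sch:rule>
  </sch:pattern>
  <sch:diagnostics>
    <sch:diagnostic id="no-lines"
    >The invoice concerned has the number <sch:value-of select="@number"/>.</sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```

Without the internal subset, the same document would be written with unprefixed element names and the default namespace declaration.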

Appendix B: Schematron Schema for Schematron 1.5


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE schema PUBLIC "http://www.ascc.net/xml/schematron"
"http://www.ascc.net/xml/schematron/schematron1-5.dtd">
<schema xmlns="http://www.ascc.net/xml/schematron"
xmlns:sch="http://www.ascc.net/xml/schematron"
xml:lang="en"                                       
xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance"
xsi:schemaLocation="http://www.ascc.net/xml/schematron
  http://www.ascc.net/xml/schematron/schematron1-5.xsd" 
fpi="+//IDN ascc.net//SGML XML Schematron 1.5 Schema for Schematron 1.5//EN"
schemaVersion="2001/01/31"  version="1.5"
defaultPhase="New"
icon="http://www.ascc.net/xml/resource/schematron/bilby.jpg">

<title>Schematron 1.5</title>
<ns uri="http://www.ascc.net/xml/schematron" prefix="sch"/>

<p>Copyright (C) 2001 Rick Jelliffe. 
Freely and openly available under zlib/libpng license.</p>
<p>This schema is open: it only
considers elements in the Schematron namespace. 
Elements and attributes from other namespaces can be used freely.
This schema does not assume that the Schematron schema is the top-level element.
</p>
<p>This schema uses conservative rules (e.g. no use of key()) to 
work with incomplete XSLT-based implementations.</p>


<phase id="New">
	<p>For creating new documents.</p>
	<active pattern="mini"/>
</phase>
<phase id="Draft">
	<p>For fast validation of draft documents.</p>
	<active pattern="required" />
</phase>
<phase id="Full">
	<p>For final validation and tracking some tricky problems.</p>
	<active pattern="mini" />
	<active pattern="required" />
	<active pattern="attributes" />
</phase>

<pattern name="Minimal Schematron" id="mini">
<p>These rules establish the smallest possible Schematron document.
These rules may be handy for beginners with starting documents.</p>
	<rule context="/">
		<assert test="//sch:schema"
		>A Schematron schema should have a schema element. </assert>
		<report test="count(//sch:schema) > 1"
		>There should only be one schema per document.</report>
		<assert test="//sch:schema/sch:pattern "
		>A Schematron schema should have pattern elements inside the schema element</assert>
		<assert test="//sch:schema/sch:pattern/sch:rule[@context]"
		>A Schematron schema should have rule elements inside the pattern elements. Rule elements should have a context attribute.</assert>
		<assert test="//sch:schema/sch:pattern/sch:rule/sch:assert[@test] 
		or //sch:schema/sch:pattern/sch:rule/sch:report[@test]" 
		>A Schematron schema should have  assert or report elements inside the rule elements. Assert and report elements should have a test attribute.</assert>
	</rule>
</pattern>

<pattern name="Schematron Elements and Required Attributes" id="required">
	<p>Rules defining occurrence rules for Schematron elements
	and their required attributes. Note that for attributes,
	it is not that the attribute is being tested for existence,
	but whether it has a value.</p>
	<p>Some elements require certain children or attributes. 
	Other elements require certain parents. Schematron can represent 
	both these kinds of coupling. </p> 

	<rule context="sch:schema">
		<assert test="count(sch:*) = count(sch:title|sch:ns|sch:p|sch:pattern|sch:diagnostics|sch:phase)"
		>The element <name/> should contain only the elements title, ns, p, pattern, diagnostics or phase from the Schematron namespace.</assert>
		<assert test="sch:pattern"
		>A schema element should contain at least one pattern element.</assert>
		<report test="ancestor::sch:*"
		>A Schematron schema should not appear as a child of another Schematron schema.</report>
		<report test="@defaultPhase and sch:phase and not(@defaultPhase='#ALL') and not(sch:phase[@id= current()/@defaultPhase])"
		>The value of the defaultPhase attribute must match the id of a phase element.</report>
	</rule> 
	<rule context="sch:title">
		<assert test="parent::sch:schema"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element schema.</assert>
		<assert test="count(preceding-sibling::sch:*) = 0"
		>The element <name/> should only appear as the first element from the Schematron namespace in the schema element.</assert>
	</rule>
	<rule context="sch:ns">
		<assert test="parent::sch:schema"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element schema.</assert>
		<assert test="string-length(normalize-space(@prefix)) &gt; 0"
		>The element <name/> must have a value for the attribute prefix.</assert>
		<assert test="string-length(normalize-space(@uri)) &gt; 0"
		>The element <name/> must have a value for the attribute uri.</assert>
		<assert test="count(preceding-sibling::sch:*) = count(preceding-sibling::sch:title) + count(preceding-sibling::sch:ns)"
		>The <name/> element must come before any other Schematron elements, except the title</assert>
		<report test="*"
		>The <name/> element should be empty.</report>
	</rule>
	<rule context="sch:phase">
		<assert test="parent::sch:schema"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element schema.</assert>
		<assert test="count(preceding-sibling::sch:*) = count(preceding-sibling::sch:phase)
		+ count(preceding-sibling::sch:title) + count(preceding-sibling::sch:ns)
		+ count(preceding-sibling::sch:p)"
		>The <name/> elements must come before any other Schematron elements, except the title, ns and p elements</assert>
	</rule>
	<rule context="sch:active"> 
		<assert test="parent::sch:phase"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element phase.</assert>
		<assert test="string-length(normalize-space(@pattern)) &gt; 0"
		>The element <name/> must have a value for the attribute pattern.</assert>
	</rule>
	<rule context="sch:pattern">
		<assert test="parent::sch:schema"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element schema.</assert>
		<assert test="count(sch:*) = count(sch:rule|sch:p)"
		>The element <name/> should contain only rule and p elements from the Schematron namespace.</assert>
		<assert test="sch:rule"
		>The element <name/> should contain at least one rule element.</assert>
		<assert test="string-length(normalize-space(@name)) &gt; 0"
		>The element <name/> must have a value for the attribute name.</assert>
		<assert test="count(sch:title) &lt; 2"
		>A Schematron schema cannot have more than one title element.</assert>
	</rule>
	<rule context="sch:rule[@abstract='true']">
		<assert test="parent::sch:pattern"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element pattern.</assert>
		<assert test="count(sch:*) = count(sch:assert |sch:report|sch:key|sch:extends ) "
		>The element <name/> should contain only the elements assert, report, key or extends from the Schematron namespace.</assert>
		<assert test="sch:assert | sch:report | sch:extends"
		>The element <name/> should contain at least one assert, report or extends element.</assert>
		<report test="@test"
		>The <name/> element cannot have a test attribute: that should go on a report or assert element.</report>
		<report test="@context"
		>An abstract rule cannot have a context attribute.</report>
		<assert test="string-length(normalize-space(@id)) &gt; 0"
		>An abstract rule should have an id attribute. </assert>
	</rule>
	<rule context="sch:rule">
		<assert test="parent::sch:pattern"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element pattern.</assert>
		<assert test="count(sch:*) = count(sch:assert |sch:report|sch:key|sch:extends ) "
		>The element <name/> should contain only the elements assert, report, key or extends from the Schematron namespace.</assert>
		<assert test="sch:assert | sch:report | sch:extends"
		>The element <name/> should contain at least one assert, report or extends element.</assert>
		<report test="@test"
		>The <name/> element cannot have a test attribute: that should go on a report or assert element.</report>
		<assert test="string-length(normalize-space(@context)) &gt; 0"
		>A rule should have a context attribute. This should be an XSLT pattern for selecting nodes to make assertions and reports about. (Abstract rules do not require a context attribute.)</assert>
		<assert test="not(@abstract) or (@abstract='false')  or (@abstract='true')"
		>In a rule, the abstract attribute is optional, and can have values 'true' or 'false'</assert>
	</rule>
	<rule context="sch:assert "> 
		<assert test="parent::sch:rule"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element rule.</assert>
		<assert test="string-length(normalize-space(text())) &gt; 0"
		>A <name/> element should contain a natural language sentence.</assert>
		<assert test="string-length(normalize-space(@test)) &gt; 0"
		>The element <name/> must have a value for the attribute test. This should be an XSLT expression.</assert>
		<report test="@context"
		>The <name/> element cannot have a context attribute: that should go on the rule element.</report>
	</rule>  
	<rule context=" sch:report">
		<assert test="parent::sch:rule"  diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element rule.</assert>
		<assert test="string-length(normalize-space(text())) &gt; 0"
		>A <name/> element should contain a natural language sentence.</assert>
		<assert test="string-length(normalize-space(@test)) &gt; 0"
		>The element <name/> must have a value for the attribute test. This should be an XSLT expression.</assert>
		<report test="@context"
		>The <name/> element cannot have a context attribute: that should go on the rule element.</report>
	</rule>  
	<rule context="sch:diagnostics">
		<assert test="parent::sch:schema"  diagnostics="bad-parent"
		>The element <name/> should only appear as a child of the schema element</assert>
		<report test="following-sibling::sch:*"
		>The element <name/> should be the last element in the schema.</report>
	</rule>
	<rule context="sch:diagnostic">
		<assert test="parent::sch:diagnostics"  diagnostics="bad-parent"
		>The element <name/> should only appear in the diagnostics section.</assert>
		<assert test="string-length(normalize-space(@id)) &gt; 0"
		>The element <name/> must have a value for the attribute id. </assert>
	</rule>
	<rule context="sch:key">
		<assert test="parent::sch:rule"  diagnostics="bad-parent"
		>The element <name/> should only appear in a rule.</assert>
		<assert test="string-length(normalize-space(@name)) &gt; 0"
		>The element <name/> must have a value for the attribute name. </assert>
		<assert test="string-length(normalize-space(@path)) &gt; 0"
		>The element <name/> must have a value for the attribute path.   This should be an XPath expression.</assert>
		<report test="*"
		>The <name/> element should be empty.</report>
	</rule>
	<rule context="sch:extends">
		<assert test="parent::sch:rule"  diagnostics="bad-parent"
		>The element <name/> should only appear in a rule.</assert>
		<assert test="string-length(normalize-space(@rule)) &gt; 0"
		>The element <name/> must have a value for the attribute rule. </assert>
		<report test="*"
		>The <name/> element should be empty.</report>
		<assert test="/*//sch:rule[@abstract='true'][@id = current()/@rule]"
		>The <name/> element should have an attribute rule which gives the id of an abstract rule.</assert>
	</rule>
	<rule context="sch:p">
		<assert test="parent::sch:*"  diagnostics="bad-parent"
		>The element <name/> should only appear inside an element from the Schematron namespace. It is equivalent to the HTML element of the same name.</assert>
	</rule>
	<rule context="sch:name">
		<assert test="parent::sch:assert | parent::sch:report |parent::sch:p | parent::sch:diagnostic"
		 diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron elements assert, report, p (paragraph) or diagnostic.</assert>
		<report test="*"
		>The <name/> element should be empty.</report>
	</rule>
	<rule context="sch:emph">
		<assert test="parent::sch:p | parent::sch:diagnostic"
		 diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron elements p (paragraph) or diagnostic. It is equivalent to the HTML element of the same name.</assert>
	</rule>
	<rule context="sch:dir">
		<assert test="parent::sch:p | parent::sch:diagnostic"
		 diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron elements p (paragraph) or diagnostic.</assert>
		<assert test="@value and (@value='rtl' or @value='ltr')"
		>The attribute value of the <name/> element must be lowercase "rtl" or "ltr". It is equivalent to the HTML element of the same name.</assert>
	</rule>
	<rule context="sch:span">
		<assert test="parent::sch:p | parent::sch:diagnostic"
		 diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron elements p (paragraph) or diagnostic. It is equivalent to the HTML element of the same name.</assert>
	</rule>
	<rule context="sch:value-of">
		<assert test="parent::sch:diagnostic"   diagnostics="bad-parent"
		>The element <name/> should only appear inside the Schematron element diagnostic.</assert>
		<assert test="string-length(normalize-space(@select)) &gt; 0"
		>The element <name/> must have a value for the attribute select. The value should be an XPath expression.</assert>
		<report test="*"
		>The <name/> element should be empty.</report>
	</rule>
	<rule context="sch:*">
		<report test="1=1" diagnostics="spelling"
		>The <name/> element is not an element from the Schematron 1.5 namespace</report>
	</rule>
</pattern>

<pattern name="Schematron Attributes" id="attributes" >
	<p>These rules specify which elements each attribute can belong to, and what they mean.</p>
	<rule context="sch:*">
		<report test="@abstract and not(self::sch:rule)"
		>The boolean attribute abstract can only appear on the element rule. An abstract rule can be used to extend other rules.</report>
		<report test="@class and not(self::sch:span or self::sch:p)"
		>The attribute class can only appear on the elements span and p. It gives a name that can be used by CSS stylesheets.</report>
		<report test="@context and not(self::sch:rule)"
		>The attribute context can only appear on the element rule. It is an XPath pattern.</report>
		<report test="@defaultPhase and not(self::sch:schema)"
		>The attribute defaultPhase can only appear on the element schema. It is the id of the phase that will initially be active.</report>
		<report test="@diagnostics and not(self::sch:assert or self::sch:report)"
		>The attribute diagnostics can only appear on the elements assert and report. It is the id of some relevant diagnostic or hint.</report>
		<report test="@fpi and not(self::sch:schema or self::sch:phase)"
		>The attribute fpi can only appear on the elements schema and phase. It is an ISO Formal Public Identifier.</report>
		<report test="@icon and not(self::sch:schema or self::sch:assert or
		self::sch:diagnostic or self::sch:key or self::sch:p or self::sch:pattern
		or self::sch:phase or self::sch:report )"
		>The attribute icon can only appear on the elements schema, assert, diagnostic, key, p, pattern, phase and report. It is the URL of a small image. </report>
		<report test="@id and not(self::sch:schema or self::sch:assert or
		self::sch:p or self::sch:pattern or self::sch:phase or 
		self::sch:report or self::sch:rule or self::sch:diagnostic)"
		>The attribute id can only appear on the elements schema, assert, p, pattern, phase, report, rule and diagnostic. It is a name, it should not start with a number or symbol.</report>
		<report test="@name and not(self::sch:key or self::sch:pattern)"
		>The attribute name can only appear on the elements pattern and key.</report>
		<report test="@path and not(self::sch:key | self::sch:name)"
		>The attribute path can only appear on the elements key and name. It is an XPath path.</report>
		<report test="@pattern and not(self::sch:active)"
		>The attribute pattern can only appear on the element active. It gives the id of a pattern that should be activated in that phase.</report>
		<report test="@prefix and not(self::sch:ns)"
		>The attribute prefix can only appear on the element ns.</report>
		<report test="@role and not(self::sch:assert or self::sch:report or self::sch:rule)"
		>The attribute role can only appear on the elements assert, report or rule. It is a simple name, not a phrase.</report>
		<report test="@rule and not(self::sch:extends)"
		>The attribute rule can only appear on the element extends. It is the id of an abstract rule declared elsewhere in the schema.</report>
		<report test="@see and not(self::sch:pattern)"
		>The attribute see can only appear on the element pattern. It is the URL of some documentation for the schema.</report>
		<report test="@select and not(self::sch:value-of)"
		>The attribute select can only appear on the element value-of, with the same meaning as in XSLT. It is an XSLT pattern.</report>
		<report test="@schemaVersion and not(self::sch:schema)"
		>The attribute schemaVersion can only appear on the element schema. It gives the version of the schema.</report>
		<report test="@subject and not(self::sch:assert or self::sch:report)"
		>The attribute subject can only appear on the elements assert and report. It is an XSLT pattern. </report>
		<report test="@test and not(self::sch:assert or self::sch:report)"
		>The attribute test can only appear on the elements assert and report. It is an XPath expression with the XSLT additional functions.</report>
		<report test="@uri and not(self::sch:ns)"
		>The attribute uri can only appear on the element ns. It is a URI.</report>
		<report test="@value and not(self::sch:dir)"
		>The attribute value can only appear on the element dir. It sets the directionality of text: 'rtl' is right-to-left and 'ltr' is left-to-right.</report>
		<report test="@version and not(self::sch:schema)"
		>The attribute version can only appear on the element schema. It gives the version of Schematron required as major number "." minor number.</report>
		<assert test="not(attribute::*[string-length(normalize-space(.)) = 0])"
		>Every attribute on a Schematron element must have a value if it is specified.</assert>                  
	</rule>
</pattern>
<diagnostics>
	<diagnostic id="spelling"
	>Check this is not a spelling error. The recognized element names are
	schema, title, ns, p, phase, active, pattern, rule, key, extends, assert, report,
	diagnostics, diagnostic, name, value-of, emph, dir and span.</diagnostic>
	<diagnostic id="bad-parent"
	>The element appeared inside a <value-of select="name(parent::*)"/>.</diagnostic>
</diagnostics>
</schema>
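
The rules above for phases, abstract rules and extends can be illustrated with a small schema that should satisfy them. This is a sketch only: the book, chapter and title element names and all ids are hypothetical, chosen for the example rather than taken from this specification.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.ascc.net/xml/schematron" defaultPhase="quick">
  <title>Hypothetical book checks</title>
  <!-- defaultPhase must match the id of a phase element -->
  <phase id="quick">
    <p>Structural checks only.</p>
    <active pattern="structure"/>
  </phase>
  <pattern name="Book structure" id="structure">
    <!-- Abstract rule: it has an id and no context attribute -->
    <rule abstract="true" id="identified">
      <assert test="string-length(normalize-space(@id)) &gt; 0"
      >The element <name/> must have a value for the attribute id.</assert>
    </rule>
    <!-- Concrete rule: extends names the id of the abstract rule -->
    <rule context="chapter">
      <extends rule="identified"/>
      <assert test="title"
      >The element <name/> should contain a title element.</assert>
    </rule>
  </pattern>
</schema>
```

Removing the id from the abstract rule, or adding a context attribute to it, should each be caught by the rule with context sch:rule[@abstract='true'] in the pattern "Schematron Elements and Required Attributes".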

Appendix C: W3C XML Schema for Schematron 1.5

This appendix is non-normative.


<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema 
targetNamespace="http://www.ascc.net/xml/schematron" 
xmlns:sch="http://www.ascc.net/xml/schematron" 
xmlns="http://www.ascc.net/xml/schematron" 
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema" 
version="+//IDN sinica.edu.tw//SGML W3C XML Schema for Schematron 1.5//EN">
<xsd:annotation>
	<xsd:documentation source="http://www.ascc.net/xml/resource/schematron/schematron.html" xml:lang="en"/>
</xsd:annotation>
<xsd:element name="active">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:dir"/>
			<xsd:element ref="sch:emph"/>
			<xsd:element ref="sch:span"/>
		</xsd:choice>
		<xsd:attribute name="pattern" type="xsd:IDREF" use="required"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="assert">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:name"/>
			<xsd:element ref="sch:emph"/>
			<xsd:element ref="sch:dir"/>
			<xsd:element ref="sch:span"/>
			<xsd:any namespace="##other" processContents="lax"/>
		</xsd:choice>
		<xsd:attribute name="test" type="xsd:string" use="required"/>
		<xsd:attribute name="role" type="xsd:NMTOKEN"/>
		<xsd:attribute name="id" type="xsd:ID"/>
		<xsd:attribute name="diagnostics" type="xsd:IDREFS"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
		<xsd:attribute name="subject" type="xsd:string" use="default" value="."/>
		<xsd:attribute name="xml:lang" type="xsd:language"/>
		<xsd:anyAttribute namespace="##other" processContents="lax"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="diagnostic">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:value-of"/>
			<xsd:element ref="sch:emph"/>
			<xsd:element ref="sch:dir"/>
			<xsd:element ref="sch:span"/>
			<xsd:any namespace="##other" processContents="lax"/>
		</xsd:choice>
		<xsd:attribute name="id" type="xsd:ID" use="required"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
		<xsd:attribute name="xml:lang" type="xsd:language"/>
		<xsd:anyAttribute namespace="##other" processContents="lax"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="diagnostics">
	<xsd:complexType>
		<xsd:sequence>
			<xsd:element ref="sch:diagnostic" minOccurs="0" maxOccurs="unbounded"/>
		</xsd:sequence>
	</xsd:complexType>
</xsd:element>
<xsd:element name="dir">
	<xsd:complexType>
		<xsd:simpleContent>
			<xsd:restriction base="xsd:string">
				<xsd:attribute name="value">
					<xsd:simpleType>
						<xsd:restriction base="xsd:NMTOKEN">
							<xsd:enumeration value="ltr"/>
							<xsd:enumeration value="rtl"/>
						</xsd:restriction>
					</xsd:simpleType>
				</xsd:attribute>
			</xsd:restriction>
		</xsd:simpleContent>
	</xsd:complexType>
</xsd:element>
<xsd:element name="emph" type="xsd:string"/>
<xsd:element name="extends">
	<xsd:complexType>
		<xsd:attribute name="rule" type="xsd:IDREF" use="required"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="key">
	<xsd:complexType>
		<xsd:attribute name="name" type="xsd:NMTOKEN" use="required"/>
		<xsd:attribute name="path" type="xsd:string" use="required"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="name">
	<xsd:complexType>
		<xsd:attribute name="path" type="xsd:string" use="default" value="."/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="ns">
	<xsd:complexType>
		<xsd:attribute name="uri" type="xsd:uriReference" use="required"/>
		<xsd:attribute name="prefix" type="xsd:NCName"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="p">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:dir"/>
			<xsd:element ref="sch:emph"/>
			<xsd:element ref="sch:span"/>
		</xsd:choice>
		<xsd:attribute name="id" type="xsd:ID"/>
		<xsd:attribute name="class" type="xsd:string"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
		<xsd:attribute name="xml:lang" type="xsd:language"/>
		<xsd:anyAttribute namespace="##other" processContents="lax"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="pattern">
	<xsd:complexType>
		<xsd:sequence>
			<xsd:element ref="sch:p" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:rule" maxOccurs="unbounded"/>
		</xsd:sequence>
		<xsd:attribute name="name" type="xsd:string" use="required"/>
		<xsd:attribute name="see" type="xsd:uriReference"/>
		<xsd:attribute name="id" type="xsd:ID"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="phase">
	<xsd:complexType>
		<xsd:sequence >
			<xsd:element ref="sch:p" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:active" maxOccurs="unbounded"/>
		</xsd:sequence>
		<xsd:attribute name="id" type="xsd:ID" use="required"/>
		<xsd:attribute name="fpi" type="xsd:string"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="report">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:name"/>
			<xsd:element ref="sch:emph"/>
			<xsd:element ref="sch:dir"/>
			<xsd:element ref="sch:span"/>
			<xsd:any namespace="##other" processContents="lax"/>
		</xsd:choice>
		<xsd:attribute name="test" type="xsd:string" use="required"/>
		<xsd:attribute name="role" type="xsd:NMTOKEN"/>
		<xsd:attribute name="id" type="xsd:ID"/>
		<xsd:attribute name="diagnostics" type="xsd:IDREFS"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
		<xsd:attribute name="subject" type="xsd:string" use="default" value="."/>
		<xsd:attribute name="xml:lang" type="xsd:language"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="rule">
	<xsd:complexType>
		<xsd:choice maxOccurs="unbounded">
			<xsd:element ref="sch:assert"/>
			<xsd:element ref="sch:report"/>
			<xsd:element ref="sch:key"/>
			<xsd:element ref="sch:extends"/>
		</xsd:choice>
		<xsd:attribute name="context" type="xsd:string"/>
		<xsd:attribute name="abstract" type="xsd:boolean" use="default" value="false"/>
		<xsd:attribute name="role" type="xsd:NMTOKEN"/>
		<xsd:attribute name="id" type="xsd:ID"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="schema">
	<xsd:complexType>
		<xsd:sequence>
			<xsd:element ref="sch:title" minOccurs="0"/>
			<xsd:element ref="sch:ns" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:p" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:phase" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:pattern" maxOccurs="unbounded"/>
			<xsd:element ref="sch:p" minOccurs="0" maxOccurs="unbounded"/>
			<xsd:element ref="sch:diagnostics" minOccurs="0"/>
		</xsd:sequence>
		<xsd:attribute name="id" type="xsd:ID"/>
		<xsd:attribute name="fpi" type="xsd:string"/>
		<xsd:attribute name="schemaVersion" type="xsd:string"/>
		<xsd:attribute name="defaultPhase" type="xsd:IDREF"/>
		<xsd:attribute name="icon" type="xsd:uriReference"/>
		<xsd:attribute name="version" type="xsd:string" use="default" value="1.5"/>
		<xsd:attribute name="xml:lang" type="xsd:language"/>
		<xsd:anyAttribute namespace="##other" processContents="lax"/>
	</xsd:complexType>
</xsd:element>
<xsd:element name="span">
	<xsd:complexType>
		<xsd:simpleContent>
			<xsd:restriction base="xsd:string">
				<xsd:attribute name="class" type="xsd:string"/>
			</xsd:restriction>
		</xsd:simpleContent>
	</xsd:complexType>
</xsd:element>
<xsd:element name="title">
	<xsd:complexType mixed="true">
		<xsd:choice minOccurs="0" maxOccurs="unbounded">
			<xsd:element ref="sch:dir"/>
		</xsd:choice>
	</xsd:complexType>
</xsd:element>
<xsd:element name="value-of">
	<xsd:complexType>
		<xsd:attribute name="select" type="xsd:string" use="required"/>
	</xsd:complexType>
</xsd:element>
</xsd:schema>
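
The wildcards in this schema (xsd:any and xsd:anyAttribute with namespace="##other") leave assert, report and diagnostic open to markup from other namespaces. The following sketch shows what that permits. The ex: namespace, its severity attribute, the para element and the example.org link are all hypothetical, and how an implementation treats such foreign markup is not specified here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.ascc.net/xml/schematron"
        xmlns:html="http://www.w3.org/1999/xhtml"
        xmlns:ex="http://example.org/hypothetical-annotations">
  <pattern name="Open content" id="open">
    <rule context="para">
      <!-- ex:severity is admitted by xsd:anyAttribute on assert;
           html:a is admitted by xsd:any inside the mixed content -->
      <assert test="normalize-space(.)" ex:severity="warning"
      >A <name/> element should not be empty. See the
      <html:a href="http://example.org/style-guide">style guide</html:a>.</assert>
    </rule>
  </pattern>
</schema>
```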

Appendix D: EBNF Productions for Paths and Expressions

These have been abstracted from the relevant W3C Recommendations, which should be treated as the normative source. Schematron implementations should track the most recent W3C specifications.


AbbreviatedAbsoluteLocationPath    
::=    '//' RelativeLocationPath  
AbbreviatedRelativeLocationPath    
::=    RelativeLocationPath '//' Step  
AbbreviatedStep    
::=    '.'  | '..'  
AbbreviatedAxisSpecifier    
::=    '@'? 
AbsoluteLocationPath    
::=    '/' RelativeLocationPath?  | AbbreviatedAbsoluteLocationPath  
AdditiveExpr    
::=    MultiplicativeExpr  
| AdditiveExpr '+' MultiplicativeExpr  
| AdditiveExpr '-' MultiplicativeExpr  
AndExpr    
::=    EqualityExpr  | AndExpr 'and' EqualityExpr  
Argument    
::=    Expr 
AxisSpecifier    
::=    AxisName '::'  | AbbreviatedAxisSpecifier  
AxisName    
::=    'ancestor'  | 'ancestor-or-self'  | 'attribute'  
| 'child'  | 'descendant'  | 'descendant-or-self'  
| 'following'  | 'following-sibling'  | 'namespace'  
| 'parent'  | 'preceding'  | 'preceding-sibling'  | 'self' 
ChildOrAttributeAxisSpecifier    
::=    AbbreviatedAxisSpecifier  | ('child' | 'attribute') '::' 
Digits    
::=    [0-9]+  
EqualityExpr    
::=    RelationalExpr  
| EqualityExpr '=' RelationalExpr  
| EqualityExpr '!=' RelationalExpr  
Expr    
::=    OrExpr  
ExprToken    
::=    '(' | ')' | '[' | ']' | '.' | '..' | '@' | ',' | '::'  
| NameTest  | NodeType  | Operator  | FunctionName  | AxisName  
| Literal  | Number  | VariableReference  
FilterExpr    
::=    PrimaryExpr  | FilterExpr Predicate 
FunctionCall    
::=    FunctionName '(' ( Argument ( ',' Argument )* )? ')'  
FunctionName    
::=   'last' | 'position' | 'count' | 'id' | 'local-name'
| 'namespace-uri' | 'name' | 'string' | 'concat' | 'starts-with'
| 'contains' | 'substring-before' | 'substring-after' | 'substring'
| 'string-length' | 'normalize-space' | 'translate' | 'boolean'
| 'not' | 'true' | 'false' | 'lang' | 'number' | 'sum'
| 'floor' | 'ceiling' | 'round' | 'document' | 'key'
| 'format-number' | 'current' 
| 'system-property' (Caution: system-property() available with XSLT 1.1 only) 
IdKeyPattern    
::=    'id' '(' Literal ')'  | 'key' '(' Literal ',' Literal ')'  
LocalPart 
::=  NCName 
LocationPath    
::=    RelativeLocationPath  | AbsoluteLocationPath  
LocationPathPattern    
::=    '/' RelativePathPattern?  | '//'? RelativePathPattern  
Literal    
::=    '"' [^"]* '"'  | "'" [^']* "'"  
MultiplicativeExpr    
::=    UnaryExpr  
| MultiplicativeExpr MultiplyOperator UnaryExpr  
| MultiplicativeExpr 'div' UnaryExpr  
| MultiplicativeExpr 'mod' UnaryExpr  
MultiplyOperator    
::=    '*'  
NameTest    
::=    '*'  | NCName ':' '*'  | QName  
NCName 
::=  (Letter | '_') (NCNameChar)* 
NCNameChar 
::=  Letter | Digit | '.' | '-' | '_' | CombiningChar | Extender 
NodeTest    
::=    NameTest  | NodeType '(' ')'  
| 'processing-instruction' '(' Literal ')'  
NodeType    
::=    'comment'  | 'text'  | 'processing-instruction'  | 'node'  
Number    
::=    Digits ('.' Digits?)?  | '.' Digits  
Operator    
::=    OperatorName  | MultiplyOperator  
| '/' | '//' | '|' | '+' | '-' | '=' | '!=' | '<' | '<=' | '>' | '>='  
OperatorName    
::=    'and' | 'or' | 'mod' | 'div'  
OrExpr    
::=    AndExpr  | OrExpr 'or' AndExpr  
PathExpr    
::=    LocationPath  | FilterExpr  
| FilterExpr '/' RelativeLocationPath  
| FilterExpr '//' RelativeLocationPath  
Pattern    
::=    LocationPathPattern  | Pattern '|' LocationPathPattern  
| IdKeyPattern (('/' | '//') RelativePathPattern)?  
Predicate    
::=    '[' PredicateExpr ']'  
PredicateExpr    
::=    Expr 
Prefix 
::=  NCName 
PrimaryExpr    
::=    VariableReference  | '(' Expr ')'  | Literal  
| Number  | FunctionCall 
QName 
::=  (Prefix ':')? LocalPart 
RelativeLocationPath    
::=    Step  | RelativeLocationPath '/' Step  
| AbbreviatedRelativeLocationPath 
RelativePathPattern    
::=    StepPattern  
| RelativePathPattern '/' StepPattern  
| RelativePathPattern '//' StepPattern  
RelationalExpr    
::=    AdditiveExpr  
| RelationalExpr '<' AdditiveExpr  
| RelationalExpr '>' AdditiveExpr  
| RelationalExpr '<=' AdditiveExpr  
| RelationalExpr '>=' AdditiveExpr  
S    
::=    (#x20 | #x9 | #xD | #xA)+ 
Step    
::=    AxisSpecifier NodeTest Predicate*  | AbbreviatedStep  
StepPattern    
::=    ChildOrAttributeAxisSpecifier NodeTest Predicate*   
UnaryExpr    
::=    UnionExpr  | '-' UnaryExpr 
UnionExpr    
::=    PathExpr  | UnionExpr '|' PathExpr  
VariableReference    
::=    '$' QName  
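
As an illustration of how the two halves of this grammar are used, the following fragment is a minimal sketch in which the element names chapter, appendix, section and title are hypothetical. The context attribute holds a Pattern, so each step is limited to the child or attribute axis and steps are joined by / or //. The test attribute holds a full expression and may use function calls and operators.

```xml
<rule context="chapter/section | appendix//section">
	<!-- test is an expression: function calls combined with '=', '>' and 'and' -->
	<assert test="count(title) = 1 and string-length(normalize-space(title)) &gt; 0"
	>A <name/> should have exactly one non-empty title.</assert>
</rule>
```

A context such as ancestor::chapter would not fit the StepPattern production, whereas the same path is acceptable inside a test.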

Appendix F: Reference Implementation for Schematron 1.3

Following is a reference implementation for an earlier version of Schematron. It shows how simple a basic implementation can be. For reference and other implementations of Schematron 1.5, visit the website http://www.ascc.net/xml/schematron

<?xml version="1.0" ?>
<!-- Preprocessor for the Schematron XML Schema Language.
http://www.ascc.net/xml/resource/schematron/schematron.html

Copyright (c) 1999, 2000 Rick Jelliffe and Academia Sinica Computing Center, Taiwan

This software is provided 'as-is', without any express or implied warranty. 
In no event will the authors be held liable for any damages arising from 
the use of this software.

Permission is granted to anyone to use this software for any purpose, 
including commercial applications, and to alter it and redistribute it freely,
subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim
that you wrote the original software. If you use this software in a product, 
an acknowledgment in the product documentation would be appreciated but is 
not required.

2. Altered source versions must be plainly marked as such, and must not be 
misrepresented as being the original software.

3. This notice may not be removed or altered from any source distribution.

History: 
1999-10-18 Created RJ
1999-10-25 In report and assert should use apply-template not value-of
	Thanks to James Clark for this fix
1999-11-2  Add key element
1999-12-21 Add ns element: thanks Dave Carlisle for the code
2000-03-26 Add axsl:output and version- well spotted Oliver Becker 
2000-10-20 Add select to do-all-patterns: thanks Uche Ogbuji
-->
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias">

<xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

<!-- Category: top-level-element -->
<xsl:output
method="xml" 
omit-xml-declaration="no"
standalone="yes" 
indent="yes" />


<xsl:template match="schema">
<axsl:stylesheet version="1.0">
	<!-- Attributes must be added before any child element of the result -->
	<xsl:for-each select="ns">
		<xsl:attribute 
		name="{concat(@prefix,':dummy-for-xmlns')}"
		namespace="{@uri}"/>
	</xsl:for-each>
	<xsl:attribute name="version">1.0</xsl:attribute>
	<axsl:output method="text" />

	<xsl:apply-templates mode="do-keys" />
	<axsl:template match='/'>
		<xsl:value-of select="title" />
		<xsl:apply-templates mode="do-all-patterns" />
	</axsl:template>

	<xsl:apply-templates />

	<axsl:template match="text()" priority="-1">
		<!-- strip characters -->
	</axsl:template>
</axsl:stylesheet>

</xsl:template>

<xsl:template match="pattern" mode="do-all-patterns" >
<axsl:apply-templates select="/" mode='M{count(preceding-sibling::*)}' />
</xsl:template>

<xsl:template match="pattern">
<xsl:apply-templates />

<axsl:template match="text()" priority="-1" mode="M{count(preceding-sibling::*)}">
	<!-- strip characters -->
</axsl:template>
</xsl:template>

<xsl:template match="rule">
<axsl:template match='{@context}' priority='{4000 - count(preceding-sibling::*)}' mode='M{count(../preceding-sibling::*)}'>
	<xsl:apply-templates />
	<axsl:apply-templates mode='M{count(../preceding-sibling::*)}'/>
</axsl:template>

</xsl:template>

<xsl:template match="name" mode="text">
<xsl:choose>
	<xsl:when test='@path' >
		<axsl:value-of select="name({@path})" />
	</xsl:when>		
	<xsl:otherwise>
		<axsl:value-of select="name(.)" />
	</xsl:otherwise>
</xsl:choose>
</xsl:template>

<xsl:template match="assert">
<axsl:choose> 
	<axsl:when test='{@test}'/>
	<axsl:otherwise>
		<xsl:if test="@role">(<xsl:value-of select="@role"/>) </xsl:if>
In pattern <xsl:value-of select="ancestor::pattern/@name"/>:
		<xsl:apply-templates mode="text" />
	</axsl:otherwise>
</axsl:choose> 

</xsl:template>

<xsl:template match="report">
<axsl:if test='{@test}'>
		<xsl:if test="@role">(<xsl:value-of select="@role"/>) </xsl:if>
In pattern <xsl:value-of select="ancestor::pattern/@name"/>:
		<xsl:apply-templates mode="text"/>
</axsl:if> 
</xsl:template>

<xsl:template match="rule/key" mode="do-keys">
<!-- Copy the key's name and path values into the generated xsl:key -->
<axsl:key match="{../@context}" name="{@name}" use="{@path}" />
</xsl:template>

<xsl:template match="text()" priority="-1" mode="do-keys" >
<!-- strip characters -->
</xsl:template>

<xsl:template match="text()" priority="-1" mode="do-all-patterns">
<!-- strip characters -->
</xsl:template>

<xsl:template match="text()" priority="-1">
<!-- strip characters -->
</xsl:template>

</xsl:stylesheet>
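
To make the behaviour of this preprocessor concrete, here is a minimal sketch of an input schema in the no-namespace vocabulary that the listing matches. The pattern name and the chapter and title element names are hypothetical.

```xml
<schema>
	<title>Example</title>
	<pattern name="Structure">
		<rule context="chapter">
			<assert test="title">A <name/> should have a title.</assert>
		</rule>
	</pattern>
</schema>
```

Run through the listing above, this schema would be expected to yield a validating stylesheet of roughly the following shape. This is a hand-derived approximation, not captured processor output: whitespace, attribute order and the prefix used for the XSLT namespace may differ between processors.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
	<xsl:output method="text"/>
	<!-- one apply-templates per pattern; the mode is M plus the number of preceding siblings -->
	<xsl:template match="/">Example<xsl:apply-templates select="/" mode="M1"/></xsl:template>
	<!-- one template per rule; earlier rules in a pattern get higher priority -->
	<xsl:template match="chapter" priority="4000" mode="M1">
		<xsl:choose>
			<xsl:when test="title"/>
			<xsl:otherwise>
In pattern Structure:
		A <xsl:value-of select="name(.)"/> should have a title.</xsl:otherwise>
		</xsl:choose>
		<xsl:apply-templates mode="M1"/>
	</xsl:template>
	<xsl:template match="text()" priority="-1" mode="M1"/>
	<xsl:template match="text()" priority="-1"/>
</xsl:stylesheet>
```

The mode is M1 rather than M0 because the title element precedes the pattern, and the priority is 4000 because the rule has no preceding siblings within its pattern.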

Appendix G: Notice of Intended Upgrades for ISO Schematron

Schematron is being standardized as part of the International Organization for Standardization (ISO) international standard DSDL. It will be known as

ISO/IEC 19757 - DSDL
Document Schema Definition Languages
Part 3 Rule-based validation - Schematron    

Schematron 1.5 will be the basis for this. It is expected that an existing Schematron 1.5 schema will run unchanged with ISO Schematron.

This appendix indicates the expected alterations, most of which have been announced previously or implemented in prototypes already.

1. Assertions allow value-of

Schematron 1.5 did not allow value-of in assertions. This was to enforce a distinction between diagnostics and assertions, which are intended to make positive statements of expectation. ISO Schematron will allow value-of in assertions, as many users have requested.
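
A minimal sketch of what this would permit, assuming value-of keeps the select attribute it has in diagnostics; the item element and price attribute are hypothetical:

```xml
<rule context="item">
	<assert test="@price &gt; 0"
	>The price of an item should be positive, but found "<value-of select="@price"/>".</assert>
</rule>
```

The assertion text still states the expectation positively; the value-of only reports the offending value alongside it.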

2. More flexibility with key

Schematron 1.5 only allowed the key element as part of rules. ISO Schematron will also allow key under the schema at the same position as phase elements. This was implemented in the ZVON Schematron, and users reported finding it useful.
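
For comparison, the rule-level form that Schematron 1.5 already allows can be sketched as follows. The name and path attributes follow the 1.5 key element, and the key matches the context of its parent rule; the item and ref elements and the code and target attributes are hypothetical. The syntax of the schema-level form is not given here, so it is not shown.

```xml
<pattern name="Cross references">
	<rule context="item">
		<!-- index every item by its code attribute -->
		<key name="item-by-code" path="@code"/>
	</rule>
	<rule context="ref">
		<!-- key() returns a node-set; an empty node-set is false -->
		<assert test="key('item-by-code', @target)"
		>A ref should point to the code of an existing item.</assert>
	</rule>
</pattern>
```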

3. Schematron as a Framework

The ISO Schematron standard will position Schematron as a framework (the elements) which potentially allows different query languages. This will probably be done by adding to the schema element an attribute

    use  NMTOKEN "XSLT" 

which allows the query/expression language to be stated. Anticipated values are

XSLT
    XSLT 1.n, as currently used. This is the default.
EXSLT
    XSLT 1.n with the EXSLT extensions.
XPATH
    For implementations that use only a simple XPath library. The key element would not be available.
XPATH2
    The schema uses the mooted XPath2 spec.
XSLT2
    The schema uses the mooted XSLT2 spec.
XQUERY
    The schema uses the mooted XQuery spec.
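
A schema that declares its query language might then look like the following sketch. The attribute name use is only the probable form described above and may change; the pattern and element names are hypothetical.

```xml
<schema xmlns="http://www.ascc.net/xml/schematron" use="XPATH">
	<pattern name="Titles">
		<rule context="chapter">
			<!-- plain XPath only: no key element and no XSLT functions such as key() or document() -->
			<assert test="title">A chapter should have a title.</assert>
		</rule>
	</pattern>
</schema>
```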

This helps resolve or clarify several issues: first, that some implementers have used just an XPath library; second, that we have to cope with different versions of XPath, notably XPath2; third, that there has been implementation experience using non-XPath query languages (Schemarama from Beckett and Miller); and fourth, that the Schematron idea is not just the use of XPaths but the particular arrangement of assertions into rules, rules into patterns, and patterns into phases.

I expect there will be other schema languages which just add an assertion element to an element or attribute declaration (e.g. Eric van der Vlist's Examplotron), but this (though useful) is not Schematron: the key idea of Schematron is the pattern—an abstract structure which is expressed in terms of an element (the context) but may not actually have anything to do with that element.

This recasts Schematron as a general rule framework.

4. Variable statement let

Schematron 1.5 is cumbersome when expressing "datatype" kinds of constraints. It is powerful enough to parse a string into components, but frequently a string must be reparsed several times causing very verbose and error-prone expressions.

ISO Schematron will include a let statement that allows binding of variables within the scope of a rule. The variable value will be available using a $ delimiter, and can be implemented using XSLT variables. Presumably it would only be available when using XSLT or EXSLT as the query language.

This feature is adopted from XCSL (XML Constraint Specification Language), with the kind blessing of XCSL's developer José Carlos Leite Ramalho.
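
As a sketch of how this might reduce reparsing, the following rule splits a value such as 2002-10-01 once and reuses the parts. The let element's name and value attributes are assumed syntax, and the date element is hypothetical.

```xml
<rule context="date">
	<!-- let and its name/value attributes are assumed, not final, syntax -->
	<let name="year" value="substring-before(., '-')"/>
	<let name="month" value="substring-before(substring-after(., '-'), '-')"/>
	<!-- number($year) = number($year) is false when $year is not numeric (NaN) -->
	<assert test="string-length($year) = 4 and number($year) = number($year)"
	>The year part of a <name/> should be four characters long and numeric.</assert>
	<assert test="number($month) &gt;= 1 and number($month) &lt;= 12"
	>The month part of a <name/> should be between 01 and 12.</assert>
</rule>
```

Without let, each test would have to repeat the substring-before and substring-after calls in full.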

5. Abstract Patterns

Schematron was invented to be able to declare and detect abstract patterns. This allows a document type to be declared in terms of rhetorical structures rather than physical structures: for example, to say "this is a table and the row element name is tr" or "every paragraph has a heading to which it relates".

This would again be implemented using XSLT variables. So we could say (the syntax is not fixed yet):

 <pattern isa="table">
	<param formal="row" actual="tr"/>
	<param formal="cell" actual="td"/>
 </pattern> 

 <pattern abstract="true" name="table">
	<rule context="$row">
		<assert test="$cell"
		>A <name/> should have at least one cell</assert>
	</rule>
</pattern>
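
Assuming, for illustration only, that each formal parameter is replaced textually by its actual value, the instantiation above would behave like this concrete pattern:

```xml
<pattern name="table">
	<!-- $row replaced by tr, $cell replaced by td -->
	<rule context="tr">
		<assert test="td"
		>A <name/> should have at least one cell</assert>
	</rule>
</pattern>
```

A second instantiation with different actual values (say, hypothetically, row and entry) would reuse the same abstract pattern without repeating its rules.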

So let statements allow clearer expressions in the test values, while abstract patterns allow clearer expression with fewer elements.

Also, with abstract patterns, it then becomes possible to do document to document mappings, because we can identify structures of related information items independently of their serialization and naming conventions. The role attribute can be used for this.

Acknowledgements

The Schematron was developed as a free software project at the Academia Sinica Computing Centre in 1999 and 2000 by the author. I thank the Director, Dr. Simon Lin, for his encouragement and support. Also I owe thanks for the support and contributions of Professor C.C. Hsieh, Dr Makoto Murata, Dr Oliver Becker (architecture), Dr Miloslav Nic (tutorials), Dr David Carlisle, Mr James Clark, Mr Adrian Edwards, Mr Uche Ogbuji, Mr Francis Norton, Mr David Pawson, Mr Eddie Robertsson, Dr. José Carlos Leite Ramalho, Dr. Dave Beckett, Mr Ludvig Svenovius (extends) and the members of the Schematron mail list. Other work was performed with sponsorship from GeoTempo Inc., Taipei, Allette Systems, Pty. Ltd. Sydney, and Topologi Pty. Ltd. Sydney.

This specification is a much updated version of a paper delivered at the Pacific Neighbourhood Consortium/Electronic Cultural Atlas Initiative/Electronic Buddhist Text Initiative joint conference, University of California, Berkeley, Feb. 2000.

References

[1] Private conversation by the author with a Taiwanese MIS professor.

[2] The Schematron project website is at

http://www.ascc.net/xml/resource/schematron/schematron.html
A new website to encourage open source contributions is being established for full operation 1Q/2001 at
http://sourceforge.net/projects/schematron

[CIP] F.L. Bauer, M. Broy, B. Moller, P. Pepper, M. Wirsing, et al. The Munich Project CIP. Vol. I: The Wide Spectrum Language CIP-L, volume I of Lecture Notes in Computer Science. Springer Verlag, Berlin, Heidelberg, New York, 1985.

[deFrancis] The Chinese Language

[Etessami] K. Etessami and M. Yannakakis, From Rule-based to Automata-based testing, Proceedings of FORTE/PSTV'2000, 20th IFIP Int. Conf. on Formal Description Techniques/Protocol Specification, Testing, and Verification, 2000, http://citeseer.nj.nec.com/410961.html

[Finkelstein]

[HTML]

[LeCarlier] Baudouin Le Charlier and Pierre Flener, Specifications Are Necessarily Informal or: Some More Myths of Formal Methods, The Journal of Systems and Software, Vol. 40, No. 3, March 1998, citeseer.nj.nec.com/190011.html

[Namespace]

[SGML]

[Schemarama]

[SOX]

[RDF]

[RELAX] Murata M.

[URL]

[XLink]

[XML]

[XML Schema]

[XSL-FO]

[XSLT]

[WASH] http://www.cs.washington.edu/research/jair/volume2/minton94a-html/node3.html