
The Schematron is a simple and powerful Structural Schema Language, available now with an error-browser, an interface for interactive debuggers, and a tutorial!
Jumpomatic
- Need to Find a Schema?
- Need to Find an Implementation?
- Need to Understand more Details?
- Other Schema Languages
- Need to Read a Basic Overview?
- Need to Read the Specifications?
- News:
- Newly designed home page. (This page, which was the old home page, has been simplified and moved to old-index.html.) The new home page has the latest news and the overview.
- Schematron namespace now uses RDDL! See http://www.ascc.net/xml/schematron. It is now the main root for language-related information. Implementation-related information will be going to the SourceForge projects.
- Updated the Schematron 1.5 specification in preparation for submitting it as a note to W3C. Could any users of Schematron who work for a W3C Member organization, or who are invited experts on a WG or IG, please contact Rick Jelliffe to talk about co-submitting it? (The more the better.)
- W3C XML Schemas schema for Schematron 1.5 now available. Also updated DTD to cope with namespaces and started debugging the big Schematron schema.
- Schematron 1.5 Syntax Quick Guide
- Kip Hampton releases Perl implementation of Schematron
- Article on Perl on XML.COM mentions Schematron by Kip Hampton
- Schematron available in 4Thought system
- Excellent article at XML.COM on Schematron by C. Ogbuji
- Possible mascot (the Schematroll? Scheming Ron?) from Kody Chang and Rick. It is a cross between a bilby and a bandicoot. There is a green one too.
- Beta implementation (zipped tarfile) of Schematron 1.5 with language reference announced by Miloslav Nic.
- Specification for Schematron 1.5 includes DTD and Schematron schema. This includes several features added in response to user feedback. It is the version. All the skeletons and implementations will be moved over to use it.
- New version of pretty-printer for handling namespaces, with some fixes.
- Schematron-xml tool generates XML with XPath location attributes, courtesy Francis Norton.
- Schematron schemas for RSS 1.1 and SOAP added with bug fix, with pretty-printed versions
- I have relaxed the copyright to the simplest and freest license I can find: the zlib/libpng license. The files that were copyright Rick Jelliffe and Academia Sinica Computing Center have been changed appropriately. I hope developers of new implementations will use it (change the name of course!) too.
- Small enhancement to allow context="/" has been added to the skeletons and schematron-report. Thanks Uche Ogbuji from the maillist.
- First draft article (PDF) on data model/serialization issues from Rick Jelliffe, reviews welcome.
- Interesting site for schematron-like language, though targeted differently, at xlinkit.com (developed from University College, London.)
- Mailing list up and running! schematron-love-in is a mail list for discussing schematron and rule-based validation/annotation.
- We are moving and going open! This site will continue as the Academia Sinica home site for Schematron, but due to popular demand we are also starting a public project hosted at SourceForge. The URL will be http://schematron.sourceforge.net/. This site will allow easier development of Schematron implementations and better resources for updates. We hope to have it up and running by Monday, September 18 (if Rick can read the documentation enough to figure it out)! This site also contains mail-lists on Schematron that you can subscribe to.
- Dan Connolly of W3C is trialing a Web Content Accessibility Checking Service. This is just an alpha version for testing, but it generates a new kind of report and uses the new architecture. See "Implementations" below. It is an online service: if you work for the government or a large corporation in some countries, you may find the law obliges you to make your webpages accessible to people with disabilities: how do you check? This kind of online service is the vanguard for trialling possible methods.
- Two companies have notified me of plans to release products incorporating Schematron, in the new year. I wish them luck and hope that other companies will also consider providing it as part of their XML Schema or XSLT products.
- New Spec: The new specification for Schematron is now available: it is a draft of a paper presented at a conference at Berkeley; please don't link to it because I have to correct it a bit. It also has an example of how to handle namespaces using the ns element: a few users have reported not finding this information anywhere.
- But we are almost there for the 1.4 release: Oliver Becker has stepped in with a new architecture and has debugged his implementation on SAXON, which is apparently a little less forgiving than XT. Thanks Oliver for this! I have added the 1.4 bits, but I have not made this public until I can test it enough (sorry for being so busy).
- Ludvig Svenovius has come up with an extension system: he reports it is useful and easy to implement--just what we want!
- Schematron 1.4 also adds some features for better RDF conversion and nicer user interfaces. However, its core is unchanged from Schematron 1: the additions make it more friendly to use and to develop user interfaces with, rather than adding to its power or complexity. I have tried to make the new specification easy to read. Suggestions are, as always, welcome. I may try starting a maillist on Schematron after the 1.4 release, perhaps at SourceForge or E-Groups.
- In other news, Schematron has been mentioned in the last few months by standards-makers involved in RDF, WAI, XHTML and in the XML Schema Compiler project.
- Understanding:
- Overview
- Dr Nic's Schematron Tutorials and Ken Holman's Practical Transformations using XSLT and XPath
- Uche Ogbuji's article on SunWorld Introducing the Schematron (Good for XSL programmers)
- DTD and Schematron schema including DTDs for Schematron 1.3 (the version currently implemented plus some requested enhancements)
- From Grammars to The Schematron (PDF slides)
- Schematron 1.5 implementation skeleton1-5.xsl now released.
- New Architecture:
Oliver Becker has contributed a new architecture for easier implementation and management of Schematron. We will be moving over to this architecture for Schematron 1.3: Oliver's implementation will make it easier to deploy the new version, to create your own implementation, and to understand how Schematron works.
In the new architecture, Oliver has made Schematron into a skeleton with hooks for routines you provide to perform various actions. Nice and much more generic. The process flow is unchanged from the original implementations: compile your Schematron schema through one of these sch-*.xsl files, then run the resulting XSLT script over your data document. Thanks Oliver!
- schematron-skel-ns.xsl (XSL) is the skeleton with namespace support. (obsolete due to skeleton1-5.xsl)
- schematron-skeleton.xsl (XSL) is the skeleton with no special namespace support (obsolete due to skeleton1-5.xsl)
- sch-basic.xsl (XSL) is a very simple example metastylesheet (obsolete due to schematron-basic.xsl)
- sch-message.xsl (XSL) is a simple metastylesheet that can be used to create interactive debugging when used with an XSLT implementation like XT that generates line numbers from xsl:message, for editors that can run programs and read the line-format messages (such as emacs, XED) (obsolete due to schematron-message.xsl)
- sch-report.xsl (XSL)
- Implementations:
- schematron-basic: Minimal concept demonstration generating simple text
- schematron-message: plug-in to interactive debuggers such as XED and emacs
- schematron-report: Error-browser: generates HTML pages with specific user-oriented messages detailing the errors found: click on the messages and be taken to the element in the source. Courtesy of David Carlisle.
- Example report <<< LOOK AT ME FIRST!
- schematron-xml: Generates XML with an attribute location containing an XPath to the location of the suspect element. Also available in a version for documents with no namespace (uses new architecture). Courtesy of Francis Norton.
- schematron-rdf: Automatic external markup tool: creates RDF statements for each detected pattern in a schema: the original patterns and rules are available as statements. The context element of the patterns is located by an XPointer. Beta only.
- schematron-pretty: A pretty-printer for schemas, also available in version for schematrons in namespace
- schematron-w3c: Alpha version by Dan Connolly of W3C, which generates an HTML page with assertion messages followed by a little inline version of the node that failed the assertion. Used in the online Web Content Accessibility Checking Service. (Alpha version still called schematron-report, but it is really a new implementation built with the new architecture.)
- schematron-pretty for IE5: An XSL stylesheet for schemas for IE5. Just add the following processing instruction to your Schematron schema, at the top of the document (after any <?xml ...?> at the top): <?xml-stylesheet href="schematron-ie5.xsl" type="text/xsl" ?> but make the href point to your local copy. This way you don't need a separate process to pretty-print the schema! Thanks to Adrian Edwards for this port.
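As a rough sketch of where that processing instruction sits, here is a tiny schema with the instruction placed after the XML declaration and before the root element. The title, pattern, and rule are invented for illustration, the href assumes a copy of schematron-ie5.xsl in the same directory, and the use of the namespaced form of the schema is also an assumption (the IE5 stylesheet may expect the un-namespaced form).

```xml
<?xml version="1.0"?>
<!-- The stylesheet instruction goes after the XML declaration and before the root element. -->
<!-- Assumption: schematron-ie5.xsl has been copied next to this schema. -->
<?xml-stylesheet href="schematron-ie5.xsl" type="text/xsl" ?>
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>Hypothetical example schema</title>
  <pattern name="Document root">
    <rule context="/*">
      <!-- Invented constraint, present only so the schema has some content to display. -->
      <assert test="@id">The root element should have an id attribute.</assert>
    </rule>
  </pattern>
</schema>
```

Opening the schema file directly in IE5 should then show the pretty-printed view, with no separate transformation step.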
- Schemas
- Schematron Tutorials courtesy of Dr Miloslav Nic in Prague: a catalog of frequently used rules.
- Schematron schema for Schematron with DTD
- Schematron schema for W3C XML Schema specifications (beta)
This validates XML Schema specifications; it is not a validator that uses XML Schema specifications to validate other documents. To install, first get schematron-report (above). Here is a sample.
- Schematron schema for News Industry Text Format (NITF)
This validates things additional to the DTD, courtesy of Dave Pawson.
- Schematron schema for well-known Purchase Order example
- Schematron schema for Web Accessibility Initiative (WAI)
- Schematron schema for Synchronized Multimedia Integration Language (SMIL)
- Schematron schema for RSS Validator (RSS: a contraction without a name! Resource Syndication Syntax?)
from Leigh Dobbs at his site.
- pretty-printed schema (local)
- Schematron schema for SOAP from Rick Jelliffe.
- Schematron schema for XLink from Rick Jelliffe (beta)
- Systems
- XT is what we develop Schematron with
- 4XSLT: Uche Ogbuji emailed that Schematron works fine with Fourthought's Python-based tools
- Oliver Becker reports that (with the version number fix, 2000-03-27) Schematron works with SAXON
- New report that Schematron does not work properly with some versions of LotusXSL/Xalan, because they implement namespace handling differently: I am looking into this, and should have an answer this week.
- Chatter & Gossip
- Schematron presented at database session at Pacific Neighbourhood Consortium conference, University of California, Berkeley, Jan 2000. Draft of paper available.
- Implementation and Rationale Notes
- User Questions
- Extensions Under Consideration
- Related Material
Current version of this document: 2001-02-15
The current implementations include code for the extensions we are planning for v.1.3 for keys and namespaces.
If you want to contribute any schemas, please email ricko@gate.sinica.edu.tw
If you want to contribute any tutorial patterns, please email Dr Miloslav Nic [announcement]
Implementation and Rationale Notes
The Schematron represents a radical break from conventional schema language design:
- First and foremost is the strategy/aim of easy implementation on top of XSL rather than easy implementation by a hedge automaton (though a hedge-automaton implementation could probably be made, it would probably not be a trivial task).
- Second, is the strategy/aim that the language should be trivially implementable in a GUI with simple forms: hence the fixed hierarchy of elements.
- Third is the pragmatic consideration that DTDs exist as part of XML 1.0 and it is doubtful they will go away. Furthermore, XML Schemas and other schema languages use similar content model systems and so leave the same kinds of structures unverifiable. So there is the need for a complementary schema language based on entirely different principles, a feather duster for the furthest corners of a room where the vacuum cleaner cannot reach.
- Fourth, is the idea that classes, archetypes and other abstractions may hinder direct specification of schematic constraints by the developer: all metaphysical categories are abstractions and an abstraction that is clarifying in one instance may be muddying in another. Hence, the pattern model tries to be as direct as possible: rather than allowing sophisticated type hierarchies composed from very basic paths (element, attribute) and expressions (content models), the pattern model provides basic grouping of sophisticated paths (XPaths paths) and expressions (XPath expressions).
- Fifth is the idea that the purpose of a schema language is to locate patterns which do not fit in with the assertions made about the instance, and to make those assertions available in specific statements which allow machine logic to be applied. A grammar, especially one using many non-terminals, may not provide enough of the right information about why an instance is in error: for example, to identify the pattern which has not been adhered to and to locate, from the point of fault detection, the higher-level element which causes the requirement.
- Sixth is the idea that a language may be variegated into several variants. These variants may be useful at different phases of a workflow or of a document's life. A language may be maintained keeping a measure of backwards compatibility. Therefore it is an important part of a schema language to be able to cope with variants and report them.
- Finally, its name is funnier than any other comparable schema language's!
There is a legitimate question: is this really a schema language or simply a validation language? Well, it is a schema language, but the subject of the schema is not the surface markup of elements, attributes and values, but rather the patterns that combine these. In other words, what we are defining through this schema language is cohesion rather than merely parent-child coupling.
Other background material can be found at http://www.ascc.net/xml/en/utf-8/schemas.html
User Questions
Various people have made useful suggestions or reported problems.
- Q. Is there a way to get line number or find out the originating entity of a problem? (A. Yes. Use schematron-message.)
- Q. Can I make the current implementation report back the context information in the error string? (A. The assert and report statements are not error strings intended for diagnostics, but many people use them this way. It is a legitimate need, but rather than providing more access to XPath expressions other than name(), the upcoming version of Schematron will support a diagnostics attribute on report and assert: this contains IDREFS to diagnostic elements which can go in the schema foot.)
- Q. Do multiple rules match within a pattern, or just one? (A. All patterns are active all the time, and every assert and report statement in a rule that fires is evaluated. Within a pattern, however, if a node --i.e. an element or attribute-- is matched by one rule, it will not be caught by a rule written later in that pattern.)
- Q. Can a Schematron be implemented on SAX (i.e., streaming) or does it require DOM? (A. It is the same question as with XSL. If you only use paths and expressions that do not require lookahead, you can implement simply on top of SAX; if you use paths and expressions that require lookahead (this includes the child:: axis), then you probably need DOM, i.e., you need the kinds of nodes you may want to test available in memory as a tree.)
- Q. What is the difference between report and assert? (A. Assert gives its message if its test evaluates to false. Report gives its message if its test evaluates to true.)
- Q. How can I group asserts and reports together, to use in different rules and different contexts? (A. Use entity declarations. There are many possibilities for providing higher layers on top of Schematron schemas. I want to see what XML Schemas do: if it is easy to implement the XML Schema system in XSLT then probably I will take it. But there are many possibilities, and I don't want to confuse people. A Schematron schema can be partial, so there is less need for the brilliant class or archetype mechanism that may be required for an all-singing, all-dancing system like XML Schema.)
- Q. What mechanisms does Schematron need for the runtime manipulation of its schemas? (A. Because of the fixed element depth, schemas can be easily maintained using table GUIs rather than tree GUIs. I think this is easier for people. Also, I think that tree-pattern schemas lend themselves very well to the wizard approach: I can imagine a GUI in which you first create an example instance, then a wizard leads you around the instance: you select other elements or attributes that belong to the same pattern as the current element or attribute, so the schema can be built incrementally and interactively.)
- Q. Can Schematron schemas be used with XML Schemas? (A. Yes. Rick is on the XML Schema Working Group and is trying hard to make integration easy. Schematron is pretty complementary to grammar-based schema languages. The most important thing is that XML Schemas now allows an element appinfo in which a Schematron schema can be embedded. A simple implementation can just extract the Schematron schema and run it on the document. More integrated avenues would open up if the XML Schema WG specifies that, as part of schema-validation, the xsi:type attribute is added to all elements. That allows Schematron validation based on the "type" (in XML Schema terms) as well as just the element name: <rule context="*[@xsi:type='address']">... for example.)
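Several of the answers above (assert versus report, rule matching within a pattern, and diagnostics in the schema foot) can be sketched in one small schema. This is only an illustration under assumptions: the order/item vocabulary and all the messages are invented, and the element and attribute names follow my reading of the Schematron 1.5 draft rather than any tested implementation.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary: order, item, @kind, @price and @id are invented for this sketch. -->
<schema xmlns="http://www.ascc.net/xml/schematron">
  <title>Illustrative checks on an order document</title>
  <pattern name="Item pricing">
    <!-- Rules are tried in order within a pattern. A gift item matches this rule first,
         so the more general rule below is not applied to it. -->
    <rule context="item[@kind='gift']">
      <!-- report: the message is given when the test is TRUE. -->
      <report test="@price">A gift item carries a price.</report>
    </rule>
    <rule context="item">
      <!-- assert: the message is given when the test is FALSE. -->
      <assert test="@price" diagnostics="which-item">An item should have a price.</assert>
    </rule>
  </pattern>
  <pattern name="Order contents">
    <!-- Patterns are independent of each other: this rule is applied to every order
         regardless of which rules matched in the pattern above. -->
    <rule context="order">
      <assert test="count(item) &gt;= 1">An order should contain at least one item.</assert>
    </rule>
  </pattern>
  <diagnostics>
    <!-- Diagnostics sit at the foot of the schema and are referenced by id. -->
    <diagnostic id="which-item">Offending item id: <value-of select="@id"/></diagnostic>
  </diagnostics>
</schema>
```

Against a document containing a priced gift item and an unpriced ordinary item, the first rule would report on the gift and the second would fail its assertion on the ordinary item; the gift never reaches the second rule.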
Extensions In Schematron 1.5
See the page DTD and Schematron for details of the 1.5 version of Schematron, which is prepared but not tested yet. The code for keys and namespaces is already out.
- Dynamic Schemas. I want to use the id on patterns in a mechanism to allow pattern-checking to be turned on and off. For example: to validate tables only, or to make an explicit workflow DTD with phases, where each phase switches certain patterns on and off. Two mechanisms are possible, a config file and a phase element in the schema. Because I think they meet different requirements, perhaps both would be good: a config file allows user-end variants, while a phase element in the schema allows centralized description of variants.
These dynamic schemas can be used in a few ways:
- workflow validation, where the document goes through different stages
- partial validation, where a document is under construction and the user merely wants to validate a particular pattern; this can be used for both literature and for data-interchange debugging;
- variant documents, where a markup language has gone through some kind of evolution, and all the variants are treated as different patterns;
- where a document's errors in a certain regard are known and the user wants to tolerate them for some reason.
- A better mechanism for generating nice diagnostics. The diagnostics hint mechanism is the way to go, I think.
- An explicit xml:lang attribute on schema. This is a no-brainer. Originally I was going to include an explicit linking mechanism to connect to alternate-language versions of the assert and report strings; however, retrieval of language-specific versions of the schema seems the better way.
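To make the dynamic-schema idea above concrete, here is a minimal sketch of a schema with two phases and an xml:lang attribute. The phase and pattern ids, the table/article vocabulary, and the messages are all hypothetical, and the phase and active element names are assumed from the 1.5 draft; how a particular implementation is told which phase to run is not shown.

```xml
<?xml version="1.0"?>
<!-- Hypothetical sketch: ids and the table/article vocabulary are invented. -->
<schema xmlns="http://www.ascc.net/xml/schematron" xml:lang="en">
  <title>Illustrative phased schema</title>
  <!-- Draft phase: only the table checks are switched on. -->
  <phase id="draft">
    <active pattern="tables"/>
  </phase>
  <!-- Final phase: both patterns are switched on. -->
  <phase id="final">
    <active pattern="tables"/>
    <active pattern="metadata"/>
  </phase>
  <pattern name="Table structure" id="tables">
    <rule context="table">
      <assert test="row">A table should contain at least one row.</assert>
    </rule>
  </pattern>
  <pattern name="Publication metadata" id="metadata">
    <rule context="article">
      <assert test="title">An article should have a title before publication.</assert>
    </rule>
  </pattern>
</schema>
```

The draft phase corresponds to the partial-validation case (tables only), while the final phase corresponds to a later workflow stage in which every pattern applies.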
I have been asked how Schematron relates to various other technologies. Schematron is used entirely to detect patterns in an XML document and to associate those patterns with various labels: text, role identifiers, etc. The two areas targeted for this are document validation and automated markup systems (where the former is considered a particular case of the latter, at heart). So it is targeted at usage schemas rather than definitional schemas. Definitional schema languages implement particular data-modeling paradigms (e.g., the data is a tree, the data is a table, the data is a particular kind of graph) while Schematron only takes the view that there are patterns that exist atop more regular structures. A definitional schema answers the question "what is this element or attribute or record?" while a usage schema answers the question "what constraints are imposed on this data by its context?"
Here are some alternatives to Schematron.
Copyright 1999,2000 (C) Rick Jelliffe, Academia Sinica Computing Centre, Taibei. The Schematron software and this page are available for any public use, under the conditions of the zlib/libpng license (the least restrictive), but please mention our names in any documentation or About screens for any products that use it. Comments, fixes and upgrades welcome: email ricko@gate.sinica.edu.tw