Extended Linking Comes to the WWW: XLinks

Rick Jelliffe
Academia Sinica Computing Centre
1999-08-02

What if every piece of data on every computer all over the world could be identified, located and linked to? XLink is a key specification to allow this on the WWW.

HTML Links

Because of the WWW, most people are familiar with the basic idea of HyperText links:

"A link has two ends -- called anchors -- and a direction. The link starts at the "source" anchor and points to the "destination" anchor, which may be any Web resource (e.g., an image, a video clip, a sound bite, a program, an HTML document, an element within an HTML document, etc.)." HTML 4.0 Specification (http://www.w3.org/TR/REC-html40/struct/links.html#h-12.1)

If you are an experienced user of your browser, you will know that right-clicking on a link brings up a menu; one of the options is "Open in a New Window". In HTML, these kinds of links are created using the a (anchor) element.

If you are an HTML guru, you will also know that there is a special element, called link, that can appear inside head elements. This allows other kinds of links: for example, to link the page to a stylesheet (new browsers give the user a choice of which stylesheet to use), or to link the page to some music (which will be played automatically after it has been downloaded), or to link to the next page (to allow it to be pre-fetched, so that the user does not have to wait so long).

All these are kinds of links too. In fact, there are many kinds of hypertext links that HTML does not allow: even the earliest hypertext systems (in the late 1960s and early 1970s) provided richer kinds of linking. HTML succeeded because it used the simplest form of hypertext link (two ends, remote end addressed, replacement behaviour) in a markup language (users do not require a special tool to create a web page, just an editor).

Links can do more!

When we look at the subject of links, we find that many things-that-do-not-look-like-links are links. Almost anything that includes some kind of electronic address, identifier, location or query is a link. If you use "relationship" or "role" or "pointer" or "has-a" when you think about two pieces of data, you can use links to mark them up in XML.

In fact, the criteria that HTML 4.0 uses to define hypertext links are not necessary characteristics of links at all!

This note looks at XML Links, which are currently being developed by the World Wide Web Consortium (W3C) for use in XML and HTML; you can find the latest draft specifications at http://www.w3.org/TR/.

What can we use XLinks for? Well, anything that you want...data modeling, data interchange, hypertext, anything where there are complex relationships between different branches of data. One important use of XLinks will be to build topic maps: these are structured metadata which link various WWW resources according to their relevance to particular topics. Topic maps allow external annotation of all sorts of data. (An ISO standard for Topic Maps was recently completed.)

XLinks

A link is an explicit relationship between two or more resources or portions of resources. (http://www.w3.org/TR/xlink)

A hyperlink is a link that is meaningful to end-users...frequently for direct use. Perhaps we can say that hypertext is linked text, where the links are made directly available to the end user, rather than hidden (some experts would dispute this definition, though).

XLink defines some generic elements for links: simple, extended, group and document.

Let us look at simple links and then extended links.

Simple XLinks

A simple link can be expressed in two ways, which achieve the same function. You can use the name simple directly as the element type name (with an XML Namespace prefix) or you can use an xlink:type attribute; this second mechanism allows us to retrofit XLinks on top of HTML. W3C plans to use XML as the base syntax for HTML in the future, so it is important that XLinks also encompass HTML.

Let us take an HTML link (HTML browsers allow the XML syntax for an empty element: /> ):

<a href="http://www.ascc.net/xml" title="Chinese XML Now! home page" />

To convert this to an XLink only requires two attributes:

<a href="http://www.ascc.net/xml" title="Chinese XML Now! home page" 
	xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:type="simple" />

The same link can be expressed using xlink:simple as the element type name:

<xlink:simple href="http://www.ascc.net/xml" title="Chinese XML Now! home page" 
	xmlns:xlink="http://www.w3.org/XML/XLink/0.9"   />

(The strange-looking xmlns:xlink attribute is an example of a namespace declaration. Namespaces are a way to allow better management of names in documents where element types have been defined by different sources. You can expect to see a lot of this in HTML in the future.)
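
The equivalence of the two forms can be checked mechanically. The following is a minimal sketch in Python, not part of any XLink tool: the function name is_simple_link is made up for illustration, and the namespace URI is simply the draft-era value used in the examples above. It relies on the standard library parser, which expands a prefixed name such as xlink:type into {namespace-URI}type.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the examples above (a draft-era value).
XLINK_NS = "http://www.w3.org/XML/XLink/0.9"


def is_simple_link(elem):
    """Return True if elem is a simple XLink in either of the two forms."""
    # Form 1: the element type itself is xlink:simple.
    if elem.tag == f"{{{XLINK_NS}}}simple":
        return True
    # Form 2: any element that carries xlink:type="simple".
    return elem.get(f"{{{XLINK_NS}}}type") == "simple"


retrofitted = ET.fromstring(
    '<a href="http://www.ascc.net/xml" title="Chinese XML Now! home page" '
    'xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:type="simple" />'
)
native = ET.fromstring(
    '<xlink:simple href="http://www.ascc.net/xml" '
    'title="Chinese XML Now! home page" '
    'xmlns:xlink="http://www.w3.org/XML/XLink/0.9" />'
)
plain = ET.fromstring('<a href="http://www.ascc.net/xml" />')

# Both XLink forms are recognised; the plain HTML anchor is not.
print(is_simple_link(retrofitted), is_simple_link(native), is_simple_link(plain))
```

Because the test compares the namespace URI rather than the prefix, a document that bound the same URI to a different prefix would be recognised in the same way.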

Extended XLinks

Here is an example of an extended link. It links all three people in my office to the phones they answer. Notice that this is very different from HTML's hypertext links: it is a many-to-many link, it is made from many arcs, and the linking element (i.e., xlink:extended) is not itself at the end of any arc.

<office version="1">
	<phone  id="Phone1" >2789 9380 </phone>
	<phone  id="Phone2" >2782 6432 </phone>
	<person id="Rick"   >Rick J.   </person>
	<person id="Stella" >Stella S. </person>
	<person id="Eva"    >Ivy Y.    </person>
	<xlink:extended title="All phone numbers of people in our office" 
		role="phone_number_of" 
		xmlns:xlink="http://www.w3.org/XML/XLink/0.9" >
		<xlink:arc from="Rick"   to="Phone1" />
		<xlink:arc from="Eva"    to="Phone1" />
		<xlink:arc from="Eva"    to="Phone2" />
		<xlink:arc from="Stella" to="Phone2" />
	</xlink:extended> 
</office>
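
To see how a program might consume such a link, here is a minimal Python sketch that follows each arc from a person to a phone and collects the results. It is only an illustration: the function name phones_by_person is invented here, the from and to values are assumed to be plain element identifiers as in the example, and the phone ids are spelled Phone1 and Phone2 so that they match the arc ends exactly (XML identifiers are case-sensitive).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace URI copied from the example above (a draft-era value).
XLINK_NS = "http://www.w3.org/XML/XLink/0.9"

OFFICE_XML = """
<office version="1">
  <phone id="Phone1">2789 9380</phone>
  <phone id="Phone2">2782 6432</phone>
  <person id="Rick">Rick J.</person>
  <person id="Stella">Stella S.</person>
  <person id="Eva">Ivy Y.</person>
  <xlink:extended role="phone_number_of"
      xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
    <xlink:arc from="Rick" to="Phone1" />
    <xlink:arc from="Eva" to="Phone1" />
    <xlink:arc from="Eva" to="Phone2" />
    <xlink:arc from="Stella" to="Phone2" />
  </xlink:extended>
</office>
"""


def phones_by_person(xml_text):
    """Follow every arc and return {person text: [phone text, ...]}."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an id attribute.
    by_id = {e.get("id"): e for e in root.iter() if e.get("id")}
    result = defaultdict(list)
    for arc in root.iter(f"{{{XLINK_NS}}}arc"):
        person = by_id[arc.get("from")]
        phone = by_id[arc.get("to")]
        result[person.text.strip()].append(phone.text.strip())
    return dict(result)


directory = phones_by_person(OFFICE_XML)
print(directory)
```

The link itself says nothing about what to do with the relationship; building a phone directory is just one operation that a tool could choose to perform on it.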

It may seem strange that all these arcs are bundled together as part of the same extended link. Some people may prefer the following version instead (these links do not have any "from" and "to": just the href locations of the link-ends):

<office version="2"
	xmlns:xlink="http://www.w3.org/XML/XLink/0.9" >
	<phone  id="Phone1" >2789 9380 </phone>
	<phone  id="Phone2" >2782 6432 </phone>
	<person id="Rick"   >Rick J.   </person>
	<person id="Stella" >Stella S. </person>
	<person id="Eva"    >Ivy Y.    </person>
	<xlink:extended title="Phone numbers for Rick" role="contact_number"  >
		<xlink:locator href="?xptr=id(Rick)" role="contact"  />
		<xlink:locator href="?xptr=id(Phone1)" role="number" />
	</xlink:extended>
	<xlink:extended title="Phone numbers for Ivy" role="contact_number" >
		<xlink:locator href="?xptr=id(Eva)"    role="contact" />
		<xlink:locator href="?xptr=id(Phone1)" role="number"  />
		<xlink:locator href="?xptr=id(Phone2)" role="number"  />
	</xlink:extended> 
	<xlink:extended title="Phone numbers for Stella" role="contact_number" >
		<xlink:locator href="?xptr=id(Stella)" role="contact" />
		<xlink:locator href="?xptr=id(Phone2)" role="number"  />
	</xlink:extended> 
 
</office>

In a locator, we use full hypertext referencing: the href attribute contains a URL; the role attribute contains keywords a computer can use to figure out what to do with the link.
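
As a rough illustration of how the role keywords might be used, the Python sketch below groups the ends of one extended link by role. The helper name ends_by_role is hypothetical, and the link is copied from the second version of the office example.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the example above (a draft-era value).
XLINK_NS = "http://www.w3.org/XML/XLink/0.9"

LINK_XML = """
<xlink:extended title="Phone numbers for Ivy" role="contact_number"
    xmlns:xlink="http://www.w3.org/XML/XLink/0.9">
  <xlink:locator href="?xptr=id(Eva)" role="contact" />
  <xlink:locator href="?xptr=id(Phone1)" role="number" />
  <xlink:locator href="?xptr=id(Phone2)" role="number" />
</xlink:extended>
"""


def ends_by_role(link):
    """Group the href of each locator under its role keyword."""
    grouped = {}
    for loc in link.iter(f"{{{XLINK_NS}}}locator"):
        grouped.setdefault(loc.get("role"), []).append(loc.get("href"))
    return grouped


link = ET.fromstring(LINK_XML)
ends = ends_by_role(link)
print(ends)
```

With no from and to attributes, the roles are the only clue to how the ends relate: here a program could pair the single "contact" end with each "number" end.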

XLinks provide a very rich and exciting extension to URIs. They have several other attributes available with which different kinds of generic behaviour for hypertext links can be specified.

XLink is an application of the same software engineering approach we see in XML/SGML: this layered approach says that we can make more manageable, portable and extensible systems by specifying the structures of the data independently of the specific operations to be performed on the data by a particular tool. We should not confuse a link and what we want to do with that link.

For HTML Gurus: the syntax used in the href attribute of the locators above may seem strange, but it is a legitimate URL that has only the query portion. It is a query on the current document! URLs have a special syntax (starting with a question mark, ?) to allow a query at the end of a URL; many links to databases use this. As part of the XLink development effort, W3C is developing a special kind of query, called an XPointer. An XPointer begins with ?xptr= and is followed by an XML Path Expression. id(Phone2) is an example of an XML Path Expression; it points to the element with the unique identifier "Phone2" in the current document (which is pretty much what the from and to attributes of the xlink:arc element type do). XML Path Expressions allow you to select particular branches of the element tree of some structured data, using many kinds of criteria.
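
The anatomy of such a reference can be inspected with an ordinary URL library. The sketch below uses Python's standard urllib.parse; the base address http://example.org/office.xml is a made-up stand-in for wherever the office document lives, the helper target_id is hypothetical, and the ?xptr= form is taken from the draft syntax described in this note. It handles only the id(...) form, not XML Path Expressions in general.

```python
import re
from urllib.parse import parse_qs, urljoin, urlsplit

href = "?xptr=id(Phone2)"

# The reference has no scheme, host or path: only a query portion.
parts = urlsplit(href)
xptr = parse_qs(parts.query)["xptr"][0]


def target_id(expr):
    """Return the identifier named by an id(...) expression, else None."""
    match = re.fullmatch(r"id\(([^)]+)\)", expr)
    return match.group(1) if match else None


# Hypothetical address of the document that contains the link.
BASE = "http://example.org/office.xml"

# A query-only reference resolves against the current document.
absolute = urljoin(BASE, href)

print(parts.path == "", xptr, target_id(xptr))
print(absolute)
```

Resolving the reference against the base keeps the document's own address and attaches the query to it, which is what makes it a query on the current document.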

For a good discussion of linking, see the article by Prof. Steve DeRose in Journal of Markup Theory and Practise, issue 1, available in the ASCC library.


Copyright (C) 1999 Rick Jelliffe. Please feel free to publish this in any way you like, but try to update it to the most recent version, and keep my name on it.