Thirteenth Parallel /archive/rdf/

An Introduction to the Resource Description Framework (RDF)

By Daniel Pupius, June 2002

One of the original aims of the World Wide Web was that machines, as well as humans, would be able to read and understand its content.  As it stands, the Web is machine-readable, but for the most part it is not machine-understandable.  One proposed solution to this problem is to use metadata to describe the data contained within the Web.  Metadata is "data about data": a library catalogue, for example, is metadata, since it describes the books contained within that library.  In this context, metadata is data that describes Web resources.

The Resource Description Framework (RDF) is a W3C recommendation that provides a foundation for processing metadata.  It allows applications to exchange machine-understandable information on the Web.  It can be used in a variety of application areas, for example:
 - resource discovery, to provide better search engine capabilities
 - cataloguing, for describing the content and relationships for a particular website
 - knowledge sharing & exchange for intelligent software agents
 - describing collections of pages that represent one logical document
 - describing intellectual property rights
 - privacy preferences, of both a user and a website
 - digital signatures

This tutorial provides an introduction to RDF.  We will discuss a model for representing RDF metadata and a syntax based on XML.  XML is only one possible syntax for RDF, but it is the easiest and most relevant.  We will also look at RSS, an application of RDF for sharing news and article information.

Basic RDF Model

The basis of RDF is a model for representing named properties and property values.  The properties can be thought of as attributes of a resource and correspond to the traditional attribute-value pairs.  The basic data model consists of three object types:
1.  A resource
2.  A property
3.  A statement

The resource is anything that can have a URI.  It may be part of a web page or even a whole collection of pages.

The property is a resource that has a name and describes a specific aspect, characteristic, attribute or relation of a resource.  Since a property is itself a resource, a property can have properties of its own, but most of the time we are only really interested in its name.

A specific resource together with a named property plus a value of that property for that resource is an RDF statement.  These three parts are known respectively as the subject (the resource), the predicate (the property), and the object (the value).  The object of a statement can be the URI of another resource or it can be a literal.  So a statement could be:

"The Author of http://13thparallel.org/tutorial/2002.06.rdf.htm is Daniel Pupius"

or it could be:

"The Author of http://13thparallel.org/tutorial/2002.06.rdf.htm is http://pupius.co.uk/info.rdf"

where info.rdf is a resource that describes me, the author.

These two statements are shown below in RDF/XML syntax:

<rdf:Description rdf:about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
  <Author>Daniel Pupius</Author>
</rdf:Description>
<rdf:Description rdf:about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
  <Author rdf:resource="http://pupius.co.uk/info.rdf" />
</rdf:Description>
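
The subject, predicate and object structure of these two statements can also be sketched outside of XML.  The Python below is only an illustration: the Statement type, its field names and the describe function are assumptions made for this example and are not part of RDF itself.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement (hypothetical helper type for this sketch)."""
    subject: str                   # URI of the resource being described
    predicate: str                 # name of the property
    obj: str                       # property value: a literal or a URI
    obj_is_resource: bool = False  # True when obj is the URI of another resource


PAGE = "http://13thparallel.org/tutorial/2002.06.rdf.htm"

statements = [
    # The object is a literal string.
    Statement(PAGE, "Author", "Daniel Pupius"),
    # The object is another resource, identified by its URI.
    Statement(PAGE, "Author", "http://pupius.co.uk/info.rdf", obj_is_resource=True),
]


def describe(s: Statement) -> str:
    """Render a statement as an English sentence."""
    kind = "resource" if s.obj_is_resource else "literal"
    return f"The {s.predicate} of {s.subject} is {s.obj} ({kind})"


for s in statements:
    print(describe(s))
```

Both statements share the same subject and predicate; only the kind of object differs.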

Features of RDF

From the W3C FAQ:

In general, RDF provides the basis for generic tools for authoring, manipulating, and searching machine understandable data on the Web thereby promoting the transformation of the Web into a machine-processable repository of information.

RDF provides the following features:
 - interoperability of metadata
 - machine understandable semantics for metadata
 - better precision in resource discovery than full text search
 - future-proofing applications as schemas evolve

Further development will enable RDF to also provide:
 - a uniform query capability for resource discovery
 - a processing rules language for automated decision-making about Web resources
 - a language for retrieving metadata from third parties

Dublin Core

The Dublin Core (DC) metadata standard is a simple yet effective element set for describing a wide range of networked resources.  It provides a semantic vocabulary for describing the "core" information properties of a resource, such as "Description", "Creator" and "Date".  The DC can be used with other metadata systems, but is particularly powerful when used in conjunction with RDF.

RDF itself provides only a model, not a vocabulary.  By using the 15 elements in the Dublin Core Metadata Element Set, it is more likely that your metadata will be both human- and machine-understandable, because an element such as <DC:Creator> then has an agreed meaning.  (Note: there are other modules and vocabularies that may be more relevant to a particular situation.)

An example XML document using the Dublin Core is:

<?xml version="1.0"?>
<RDF:RDF xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:DC="http://purl.oclc.org/DC/">
  <RDF:Description RDF:about="http://uri-of-some-document">
    <DC:Title>Some sample document</DC:Title>
    <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>

Use the Element Set to create more complex descriptions of your resources.
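
As a rough illustration of how a program might read such a description, the sketch below uses Python's standard xml.etree.ElementTree module to turn a Dublin Core description into (subject, property, value) triples.  It assumes that the namespaces are declared with xmlns attributes and that the subject is given in an RDF:about attribute; the function name extract_triples is made up for this example, and the code is not a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.oclc.org/DC/"  # namespace URI as written in this tutorial

DOC = f"""<?xml version="1.0"?>
<RDF:RDF xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}">
  <RDF:Description RDF:about="http://uri-of-some-document">
    <DC:Title>Some sample document</DC:Title>
    <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>"""


def extract_triples(xml_text):
    """Return a list of (subject, property, value) tuples.

    Handles only literal-valued child elements of Description.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree reports tags as "{namespace-uri}localname".
            triples.append((subject, prop.tag, (prop.text or "").strip()))
    return triples


for triple in extract_triples(DOC):
    print(triple)
```

Because the property names come back qualified by their namespace URI, a consumer can tell a Dublin Core Title apart from a Title defined by some other vocabulary.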

RDF Syntax

For this chapter we will continue using the Dublin Core Element Set and explore some of the ways you can arrange the metadata.

XML Syntax

The XML document can be formatted as shown above; however, the following is also acceptable:

<?xml version="1.0"?>
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:DC="http://purl.oclc.org/DC/">
  <Description about="http://uri-of-some-document" >
    <DC:Title>Some sample Document</DC:Title>
    <DC:Creator>John Smith</DC:Creator>
    <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
  </Description>
</RDF>

and you could also format it like:

<?xml version="1.0"?>
<RDF:RDF xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:DC="http://purl.oclc.org/DC/">
  <RDF:Description
     RDF:about="http://uri-of-some-document"
     DC:Title="Some sample Document"
     DC:Creator="John Smith"
     DC:Subject="some, keywords"
   />
</RDF:RDF>

which is especially useful for embedding the RDF in XHTML or another markup language whose viewers display the content between > and <, because values held in attributes are not displayed.
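
To check that the element form and the attribute form say the same thing, the following sketch extracts the statements from both and compares them.  It is a simplified illustration in Python using the standard library only; the triples function is a hypothetical helper that understands just these two layouts, with namespaces declared through xmlns attributes.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.oclc.org/DC/"  # namespace URI as written in this tutorial

ELEMENT_FORM = f"""<RDF:RDF xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}">
  <RDF:Description RDF:about="http://uri-of-some-document">
    <DC:Title>Some sample Document</DC:Title>
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>"""

ATTRIBUTE_FORM = f"""<RDF:RDF xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}">
  <RDF:Description RDF:about="http://uri-of-some-document"
      DC:Title="Some sample Document"
      DC:Creator="John Smith" />
</RDF:RDF>"""


def triples(xml_text):
    """Collect (subject, property, value) tuples from either layout."""
    result = set()
    for desc in ET.fromstring(xml_text).findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        # Properties written as child elements.
        for child in desc:
            result.add((subject, child.tag, (child.text or "").strip()))
        # Properties written as attributes; skip RDF's own attributes
        # (such as about) and any attribute without a namespace.
        for name, value in desc.attrib.items():
            if name.startswith("{") and not name.startswith(f"{{{RDF_NS}}}"):
                result.add((subject, name, value))
    return result


print(triples(ELEMENT_FORM) == triples(ATTRIBUTE_FORM))
```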

Complex Properties

You can create complex properties by nesting properties.  For example:

<DC:Creator parseType="Resource">
  <x:name>Pupius Daniel</x:name>
  <x:bday>1979-12-12</x:bday>
  <x:geo>Sheffield, England</x:geo>
</DC:Creator>

where x is the prefix of a namespace for a vocabulary such as vCard.
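
One way to picture a nested property is as an unnamed intermediate resource that carries the inner properties.  The Python sketch below flattens such a structure into simple statements.  Several things here are assumptions made for illustration: the x namespace URI, the subject URI, the "_:b0" style labels for the unnamed resource, and the flatten function itself.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.oclc.org/DC/"  # namespace URI as written in this tutorial
X_NS = "http://example.org/x#"      # hypothetical vocabulary namespace

DOC = f"""<RDF:RDF xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}" xmlns:x="{X_NS}">
  <RDF:Description RDF:about="http://13thparallel.org/">
    <DC:Creator RDF:parseType="Resource">
      <x:name>Pupius Daniel</x:name>
      <x:geo>Sheffield, England</x:geo>
    </DC:Creator>
  </RDF:Description>
</RDF:RDF>"""


def flatten(xml_text):
    """Turn nested properties into a flat list of (subject, property, value)."""
    triples = []
    counter = 0

    def walk(subject, node):
        nonlocal counter
        for prop in node:
            if prop.get(f"{{{RDF_NS}}}parseType") == "Resource":
                # The value is an unnamed resource; give it a local label
                # and describe it with the nested properties.
                blank = f"_:b{counter}"
                counter += 1
                triples.append((subject, prop.tag, blank))
                walk(blank, prop)
            else:
                triples.append((subject, prop.tag, (prop.text or "").strip()))

    for desc in ET.fromstring(xml_text).findall(f"{{{RDF_NS}}}Description"):
        walk(desc.get(f"{{{RDF_NS}}}about"), desc)
    return triples


for triple in flatten(DOC):
    print(triple)
```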

Using literal

Using the parseType attribute with the value "Literal", you can allow XML markup as the value.  For example:

<DC:AddressLabel parseType="Literal">
  <b>Some person</b>
  <center>169 some street</center>
  <i>City</i>
  <ucase>england</ucase>
</DC:AddressLabel>

Unordered lists

RDF allows a property to have multiple values.  An unordered list of values is called a bag.  For example, the creators of 13thParallel could be represented using:

<DC:Creator>
  <Bag>
    <li>Pupius Daniel</li>
    <li>Ouwerkerk Michael</li>
  </Bag>
</DC:Creator>

Ordered lists

Similar to a bag, a sequence (<Seq>) allows multiple values, but their order is significant.  So, if we wanted an alphabetical list of creators, we could use <Seq> to ensure they are kept in that order:

<DC:Creator>
  <Seq>
    <li>Ouwerkerk Michael</li>
    <li>Pupius Daniel</li>
  </Seq>
</DC:Creator>
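
The practical difference between the two containers is how they compare: two bags with the same members are equivalent whatever the order, while two sequences are equivalent only if the order matches too.  A minimal Python sketch of that idea follows; the function names same_bag and same_seq are invented for this example.

```python
from collections import Counter


def same_bag(a, b):
    """Bags match when they hold the same members, in any order."""
    return Counter(a) == Counter(b)


def same_seq(a, b):
    """Sequences match only when members and order are both the same."""
    return list(a) == list(b)


creators = ["Pupius Daniel", "Ouwerkerk Michael"]
alphabetical = ["Ouwerkerk Michael", "Pupius Daniel"]

print(same_bag(creators, alphabetical))
print(same_seq(creators, alphabetical))
```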

Alternative values

A third container, <Alt>, lists alternative values for a property.  A good example of its use is providing the same value in multiple languages, in conjunction with the xml:lang attribute.  For example:

<DC:Title>
  <Alt>
    <li xml:lang="en">Introduction to RDF</li>
    <li xml:lang="fr">Introduction à RDF</li>
    <li xml:lang="de">Einführung in RDF</li>
  </Alt>
</DC:Title>
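
An application reading such a list has to choose one of the alternatives.  The sketch below shows one plausible policy in Python: use the reader's preferred language if it is present, otherwise fall back to the first alternative, which RDF treats as the default.  The pick_title function and the (language, text) pair layout are assumptions made for this example.

```python
def pick_title(alternatives, preferred_lang):
    """Choose a title from a list of (language, text) pairs.

    The first pair is treated as the default when the preferred
    language is not available.
    """
    for lang, text in alternatives:
        if lang == preferred_lang:
            return text
    return alternatives[0][1]


TITLES = [
    ("en", "Introduction to RDF"),
    ("fr", "Introduction à RDF"),
]

print(pick_title(TITLES, "fr"))
print(pick_title(TITLES, "es"))
```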

Resources as properties

As seen in one of the earlier examples, it is possible to specify a resource for a property:

<DC:Creator rdf:resource="http://pupius.co.uk/info.rdf" />

Using RDF in HTML

The best way to link to your RDF from an HTML page is to create an external file and reference it with the <link> tag as follows:

<link rel="meta" href="index.php.rdf" />

(If you are describing a single page, then a good convention is to base the name of the RDF file on the filename of that page.)
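
A tool that wants the metadata for a page then has to find that link.  As a hedged sketch, the Python below scans an HTML page for link tags with rel="meta" using the standard html.parser module.  The class and function names, and the sample page, are made up for this example.

```python
from html.parser import HTMLParser


class MetaLinkFinder(HTMLParser):
    """Collect the href of every <link rel="meta"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "meta":
            href = attributes.get("href")
            if href:
                self.hrefs.append(href)


def find_rdf_links(html_text):
    finder = MetaLinkFinder()
    finder.feed(html_text)
    return finder.hrefs


PAGE = """<html><head>
<title>Example</title>
<link rel="stylesheet" href="style.css" />
<link rel="meta" href="index.php.rdf" />
</head><body></body></html>"""

print(find_rdf_links(PAGE))
```

The returned href is relative, so a real tool would still need to resolve it against the address of the page.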

RDF Site Summary (RSS)

You may have seen lots of sites offering news feeds recently.  These allow you to automatically include links to other people's articles and news items on your site.  On the surface this may not seem too appealing: why would you want to post someone else's news on your site?  However, there are many useful ways that RSS can be used.

One example would be a network of sites, with each site having its own RSS feed for recent articles.  A central site could then broadcast the article lists for the entire network.  There are obviously many more uses that I'm sure you can think of, but my point is that RSS is an easy way to share information.

While RSS 1.0 is available, many people still use RSS 0.91, as it is easier.  Version 1.0 is similar, but it conforms more strictly to RDF.  From the examples you should be able to create your own RSS file to share your content.

Example (0.91)

This example shows a simple RSS 0.91 file with 3 items.  While this is an older version of RSS it has a simpler syntax and many of the scripts for processing RSS feeds still use this version.
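
To give a feel for the 0.91 layout and how a script might process it, here is a small Python sketch.  The feed below is a made-up stand-in with invented titles and example.org addresses, not the original example file, and read_items is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 0.91 feed: items sit inside the channel element.
FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example Site</title>
    <link>http://example.org/</link>
    <description>Recent articles</description>
    <language>en</language>
    <item>
      <title>First article</title>
      <link>http://example.org/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.org/articles/2</link>
    </item>
    <item>
      <title>Third article</title>
      <link>http://example.org/articles/3</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_text):
    """Return a list of (title, link) pairs from an RSS 0.91 feed."""
    channel = ET.fromstring(feed_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


for title, link in read_items(FEED):
    print(f"{title}: {link}")
```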

Example (1.0)

Again this file has 3 items, but version 1.0 uses a different, RDF-based syntax.
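
For comparison, the following Python sketch reads a feed in the 1.0 layout.  As before, the feed is a made-up stand-in with invented titles and example.org addresses rather than the original example file, and read_items is a hypothetical helper.  In 1.0 the elements live in XML namespaces, the channel lists its items in a sequence, and each item is a sibling of the channel rather than a child.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
}

# Hypothetical RSS 1.0 feed.
FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/feed.rdf">
    <title>Example Site</title>
    <link>http://example.org/</link>
    <description>Recent articles</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/articles/1" />
        <rdf:li rdf:resource="http://example.org/articles/2" />
        <rdf:li rdf:resource="http://example.org/articles/3" />
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/articles/1">
    <title>First article</title>
    <link>http://example.org/articles/1</link>
  </item>
  <item rdf:about="http://example.org/articles/2">
    <title>Second article</title>
    <link>http://example.org/articles/2</link>
  </item>
  <item rdf:about="http://example.org/articles/3">
    <title>Third article</title>
    <link>http://example.org/articles/3</link>
  </item>
</rdf:RDF>"""


def read_items(feed_text):
    """Return a list of (title, link) pairs from an RSS 1.0 feed."""
    root = ET.fromstring(feed_text)
    return [
        (item.findtext("rss:title", namespaces=NS),
         item.findtext("rss:link", namespaces=NS))
        for item in root.findall("rss:item", NS)
    ]


for title, link in read_items(FEED):
    print(f"{title}: {link}")
```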

RSS Tools

RSS Monkey - Perl script for processing RSS files and outputting HTML.
JERSS - Java Servlet for displaying RSS feeds.
RSS to ASP - ASP tools for converting RSS feeds to WML or HTML.  Works with 0.92 or 1.0.
phpRSS - PHP class for RSS handling.

Conclusion

There is a lot more to RDF than is described in this tutorial, which serves merely as an introduction and should enable you to create simple RDF documents that describe your sites and resources.  RDF is very powerful, and if more people adopt it as a means of describing their sites, it should make the Web a much easier place in which to navigate and share information.

Resources

Enterprise XML by Robert Standefer (ISBN 0-12-663355-X)
What is RDF?
W3C Resource Description Framework (RDF) activity page
W3C RDF FAQ
Dublin Core Element Set
RDF Site Summary 1.0
RSS Dev Center