Thirteenth Parallel /archive/rdf/
One of the original aims of the World Wide Web was that machines, as well as humans, would be able to read and understand its content. As it stands, the Web is machine-readable, but for the most part it is not machine-understandable. One proposed solution to this problem is to use metadata to describe the data contained within the Web. Metadata is "data about data"; for example, a library catalogue is metadata, since it describes the books contained within that library. In this context, metadata is data that describes Web resources.
The Resource Description Framework (RDF) is a W3C recommendation that is a
foundation for processing metadata. It allows applications to exchange
machine-understandable information on the web. It can be used in a variety
of application areas, for example:
- resource discovery, to provide better search engine capabilities
- cataloguing, for describing the content and relationships for a
particular website
- knowledge sharing & exchange for intelligent software agents
- describing collections of pages that represent one logical document
- describing intellectual property rights
- privacy preferences, of both a user and a website
- digital signatures
This tutorial provides an introduction to RDF. We will discuss a model for representing RDF metadata and a syntax based on XML. XML is only one possible syntax for RDF, but it is the easiest and most relevant. We will also look at RSS, which is an implementation of RDF for sharing news and article information.
The basis of RDF is a model for representing named properties and
property values. The properties can be thought of as attributes of a
resource and correspond to the traditional attribute-value pairs. The
basic data model consists of three object types:
1. A resource
2. A property
3. A statement
The resource is anything that can have a URI. It may be part of a web page or even a whole collection of pages.
The property is a resource that has a name and describes a specific aspect, characteristic, attribute or relation of another resource. Since a property is itself a resource, a property can have properties, but most of the time we are only really interested in its name.
A specific resource together with a named property plus a value of that property for that resource is an RDF statement. These parts are known as the subject, the predicate, and the object. The object of a statement (i.e. the property value) can be the URI of another resource or it can be a literal. So a statement could be:
"The Author of http://13thparallel.org/tutorial/2002.06.rdf.htm is Daniel Pupius"
or it could be:
"The Author of http://13thparallel.org/tutorial/2002.06.rdf.htm is http://pupius.co.uk/info.rdf"
where info.rdf is a resource that describes me, the author.
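To make the subject, predicate and object parts concrete, here is a minimal Python sketch of the two statements above. The `Statement` class and its field names are hypothetical, chosen only for illustration; they are not part of RDF or of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One RDF statement: a subject, a predicate and an object (hypothetical model)."""
    subject: str                    # URI of the resource being described
    predicate: str                  # name of the property
    obj: str                        # the property value
    obj_is_resource: bool = False   # True when obj is a URI, False when it is a literal

    def describe(self) -> str:
        kind = "resource" if self.obj_is_resource else "literal"
        return f"The {self.predicate} of {self.subject} is {self.obj} ({kind})"


PAGE = "http://13thparallel.org/tutorial/2002.06.rdf.htm"

# The object is a literal string.
literal_form = Statement(PAGE, "Author", "Daniel Pupius")

# The object is the URI of another resource.
resource_form = Statement(PAGE, "Author", "http://pupius.co.uk/info.rdf",
                          obj_is_resource=True)

if __name__ == "__main__":
    print(literal_form.describe())
    print(resource_form.describe())
```

Both statements share the same subject and predicate; only the kind of object differs, and that is the distinction the model draws between a literal and a resource.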
These two statements are shown below in RDF/XML syntax:

    <rdf:Description about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
        <Author>Daniel Pupius</Author>
    </rdf:Description>

    <rdf:Description about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
        <Author rdf:resource="http://pupius.co.uk/info.rdf" />
    </rdf:Description>

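As a rough illustration of how a program might read such descriptions back into statements, the following Python sketch uses the standard library's `xml.etree.ElementTree`. It is not a full RDF parser. It assumes the fragments are wrapped in an `rdf:RDF` root, that the subject is given with the namespace-qualified `rdf:about` attribute, and that `<Author>` belongs to a hypothetical vocabulary at `http://example.org/terms#`.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms#"  # hypothetical vocabulary that defines <Author>

DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns="{EX_NS}">
  <rdf:Description rdf:about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
    <Author>Daniel Pupius</Author>
  </rdf:Description>
  <rdf:Description rdf:about="http://13thparallel.org/tutorial/2002.06.rdf.htm">
    <Author rdf:resource="http://pupius.co.uk/info.rdf"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (subject, predicate, object, kind) tuples for simple Descriptions."""
    root = ET.fromstring(xml_text)
    found = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes tags as "{namespace}local"; the predicate URI
            # is the namespace followed directly by the local name.
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                found.append((subject, predicate, resource, "resource"))
            else:
                found.append((subject, predicate, (prop.text or "").strip(), "literal"))
    return found


if __name__ == "__main__":
    for triple in triples(DOC):
        print(triple)
```

A dedicated RDF library would handle the many syntax variations this sketch ignores; the point here is only that each property element yields one statement.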
From the W3C FAQ:
In general, RDF provides the basis for generic tools for authoring, manipulating, and searching machine understandable data on the Web thereby promoting the transformation of the Web into a machine-processable repository of information.
RDF provides the following features:
- interoperability of metadata
- machine understandable semantics for metadata
- better precision in resource discovery than full text search
- future-proofing applications as schemas
evolve
Further development will enable RDF to also provide:
- a uniform query capability for resource discovery
- a processing rules language for automated decision-making about Web
resources
- language for retrieving metadata from third parties
The Dublin Core (DC) metadata standard is a simple yet effective element set for describing a wide range of networked resources. It provides a semantic vocabulary for describing the "core" information properties of a resource, such as "Description", "Creator" and "Date". DC can be used with other metadata systems, but is particularly powerful when used in conjunction with RDF.
By using the 15 elements in the Dublin Core Metadata Element Set, it is more likely that your metadata will be both human- and machine-understandable, since RDF itself only provides a model, not a vocabulary. Using DC means that an element such as <DC:Creator> actually has an agreed meaning. (Note: there are other modules and vocabularies that may be more relevant to a particular situation.)
An example XML document using the Dublin Core is:

    <RDF:RDF xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:DC="http://purl.oclc.org/DC/">
        <RDF:Description about="http://uri-of-some-document">
            <DC:Title>Some sample document</DC:Title>
            <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
            <DC:Creator>John Smith</DC:Creator>
        </RDF:Description>
    </RDF:RDF>

Use the Element Set to create more complex descriptions of your resources.
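To show what a consumer of such a description might do with it, here is a small Python sketch that collects the Dublin Core elements for each described resource. It assumes the namespaces are declared with `xmlns` attributes, that the resource is identified with a namespace-qualified `RDF:about` attribute, and that the DC namespace URI is the one written in this tutorial's example. The `dublin_core` function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.oclc.org/DC/"  # namespace URI as written in the tutorial's example

DOC = f"""<RDF:RDF xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}">
  <RDF:Description RDF:about="http://uri-of-some-document">
    <DC:Title>Some sample document</DC:Title>
    <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>"""


def dublin_core(xml_text):
    """Return {resource URI: {DC element name: value}} for each Description."""
    root = ET.fromstring(xml_text)
    dc_prefix = f"{{{DC_NS}}}"
    records = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        fields = records.setdefault(desc.get(f"{{{RDF_NS}}}about"), {})
        for child in desc:
            # Keep only elements that belong to the Dublin Core namespace.
            if child.tag.startswith(dc_prefix):
                fields[child.tag[len(dc_prefix):]] = (child.text or "").strip()
    return records


if __name__ == "__main__":
    record = dublin_core(DOC)["http://uri-of-some-document"]
    print(record["Title"], "by", record["Creator"])
    print([keyword.strip() for keyword in record["Subject"].split(",")])
```

Because the element names come from a shared vocabulary, the same function works for any document that uses that namespace, which is the practical benefit of adopting DC rather than inventing tag names.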
For this chapter we will continue using the Dublin Core Element Set and explore some of the ways you can arrange the metadata.
The XML document can be formatted as shown above; however, the following is also acceptable:

    <?xml version="1.0"?>
    <RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:DC="http://purl.oclc.org/DC/">
        <Description about="http://uri-of-some-document">
            <DC:Title>Some sample Document</DC:Title>
            <DC:Creator>John Smith</DC:Creator>
            <DC:Subject>some, keywords, separated, by, commas</DC:Subject>
        </Description>
    </RDF>

and you could also format it like:

    <?xml version="1.0"?>
    <RDF:RDF xmlns:RDF="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:DC="http://purl.oclc.org/DC/">
        <RDF:Description about="http://uri-of-some-document"
                         DC:Title="Some sample Document"
                         DC:Creator="John Smith"
                         DC:Subject="some, keywords" />
    </RDF:RDF>

which is especially useful for embedding the RDF in XHTML or another markup language that displays the content between > and <, since all the values are held in attributes and nothing is rendered on the page.
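The element form and the attribute form are meant to carry the same information. The Python sketch below checks this for a cut-down version of the example by reducing both documents to a set of (subject, property, value) tuples. It is an illustration under assumptions, not a conforming RDF parser: it uses the namespace-qualified `RDF:about` attribute, and the `properties` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.oclc.org/DC/"  # namespace URI as written in the tutorial's example
NS_DECL = f'xmlns:RDF="{RDF_NS}" xmlns:DC="{DC_NS}"'

ELEMENT_FORM = f"""<RDF:RDF {NS_DECL}>
  <RDF:Description RDF:about="http://uri-of-some-document">
    <DC:Title>Some sample Document</DC:Title>
    <DC:Creator>John Smith</DC:Creator>
  </RDF:Description>
</RDF:RDF>"""

ATTRIBUTE_FORM = f"""<RDF:RDF {NS_DECL}>
  <RDF:Description RDF:about="http://uri-of-some-document"
      DC:Title="Some sample Document" DC:Creator="John Smith" />
</RDF:RDF>"""


def properties(xml_text):
    """Collect (subject, property, value) from child elements and from attributes."""
    root = ET.fromstring(xml_text)
    rdf_prefix = f"{{{RDF_NS}}}"
    found = set()
    for desc in root.iter(f"{rdf_prefix}Description"):
        subject = desc.get(f"{rdf_prefix}about")
        # Attributes outside the RDF namespace are treated as property values.
        for name, value in desc.attrib.items():
            if not name.startswith(rdf_prefix):
                found.add((subject, name, value))
        # Child elements are property values as well.
        for child in desc:
            found.add((subject, child.tag, (child.text or "").strip()))
    return found


if __name__ == "__main__":
    print(properties(ELEMENT_FORM) == properties(ATTRIBUTE_FORM))
```

Only plain string values can be abbreviated into attributes this way, so the sketch deliberately ignores nested properties and resource references.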
You can create complex properties by nesting properties. For example:

    <DC:Creator parseType="Resource">
        <x:name>Pupius Daniel</x:name>
        <x:bday>1979-12-12</x:bday>
        <x:geo>Sheffield, England</x:geo>
    </DC:Creator>

where x is the prefix of a namespace vocabulary such as vCard.
Using the parseType attribute with the value "Literal", you can allow XML markup as the value, for example:

    <DC:AddressLabel parseType="Literal">
        <b>Some person</b>
        <center>169 some street</center>
        <i>City</i>
        <ucase>england</ucase>
    </DC:AddressLabel>

RDF allows a property to have multiple values. An unordered list of values is called a bag. For example, the 13thParallel creators could be represented using:

    <DC:Creator>
        <Bag>
            <li>Pupius Daniel</li>
            <li>Ouwerkerk Michael</li>
        </Bag>
    </DC:Creator>

Like <Bag>, <Seq> allows multiple values, but their order is significant. So, if we wanted an alphabetical list of creators, we could use <Seq> to ensure they are kept in that order:

    <DC:Creator>
        <Seq>
            <li>Ouwerkerk Michael</li>
            <li>Pupius Daniel</li>
        </Seq>
    </DC:Creator>

<Alt> holds a list of alternative values for a single property. A good example of its use is providing a value in multiple languages, in conjunction with the xml:lang attribute. For example:

    <DC:Title>
        <Alt>
            <li xml:lang="en">Introduction to RDF</li>
            <li xml:lang="fr">Introduction à RDF</li>
            <li xml:lang="de">Einführung in RDF</li>
        </Alt>
    </DC:Title>

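The difference between the three containers is in how an application should treat their members. A rough Python sketch of those semantics follows; the function names are hypothetical, and the fallback in `pick_alt` relies on the convention that the first member of an alternative list is the default choice.

```python
from collections import Counter


def same_bag(a, b):
    """Bags are unordered: equal when they hold the same members, in any order."""
    return Counter(a) == Counter(b)


def same_seq(a, b):
    """Sequences are ordered: equal only when members appear in the same order."""
    return list(a) == list(b)


def pick_alt(alternatives, lang):
    """Choose one value from (language, value) pairs; fall back to the first member."""
    for member_lang, value in alternatives:
        if member_lang == lang:
            return value
    return alternatives[0][1]


creators = ["Pupius Daniel", "Ouwerkerk Michael"]
alphabetical = ["Ouwerkerk Michael", "Pupius Daniel"]

titles = [
    ("en", "Introduction to RDF"),
    ("fr", "Introduction à RDF"),
    ("de", "Einführung in RDF"),
]

if __name__ == "__main__":
    print(same_bag(creators, alphabetical))   # same members, order ignored
    print(same_seq(creators, alphabetical))   # order differs
    print(pick_alt(titles, "fr"))
```

In other words, reordering the members of a bag changes nothing, reordering a sequence changes its meaning, and an alternative list expects the consumer to pick exactly one member.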
As seen in one of the earlier examples, it is possible to specify a resource for a property:
<DC:Creator rdf:resource="http://pupius.co.uk/info.rdf" />
The best way to link to your RDF from an HTML page is to put it in an external file and reference it with the <link> tag as follows:
<link rel="meta" href="index.php.rdf" />
(If you are describing a single page, a good convention is to name the RDF file after that page, as with index.php.rdf for index.php.)
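A tool that wants to find the metadata for a page has to look for that link. The sketch below does this with Python's standard `html.parser` module; the `MetaLinkFinder` class, the `find_rdf_links` helper and the `example.org` URL are hypothetical, and the only convention assumed is the `rel="meta"` value shown above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaLinkFinder(HTMLParser):
    """Collect the targets of <link rel="meta" href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "meta" in rels and attributes.get("href"):
            # Resolve relative file names against the address of the page.
            self.rdf_urls.append(urljoin(self.base_url, attributes["href"]))


def find_rdf_links(html_text, base_url):
    finder = MetaLinkFinder(base_url)
    finder.feed(html_text)
    return finder.rdf_urls


PAGE = """<html><head>
<title>Example page</title>
<link rel="meta" href="index.php.rdf" />
</head><body></body></html>"""

if __name__ == "__main__":
    # The page address is a made-up example.
    print(find_rdf_links(PAGE, "http://example.org/tutorial/index.php"))
```

Resolving the `href` against the page address means the relative file name from the convention above turns into a full URL that can be fetched.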
You may have noticed lots of sites offering news feeds recently, which allow you to automatically include links to other people's articles and news items on your site. On the surface this may not seem too appealing! Why would you want to post someone else's news on your site? However, there are many useful ways that RSS can be used.
One example would be a network of sites, with each site having its own RSS feed for recent articles. A central site could then broadcast the article lists for the entire network. There are obviously many more uses, which I'm sure you can think of, but my point is that RSS is an easy way to share information.
While RSS 1.0 is available, many people still use RSS 0.91, as it is easier. Version 1.0 is similar but follows RDF more strictly. From the examples you should be able to create your own RSS file to share your content.
This example shows a simple RSS 0.91 file with 3 items. While this is an older version of RSS it has a simpler syntax and many of the scripts for processing RSS feeds still use this version. View
The RSS 1.0 file again has 3 items, but as you can see it has a different syntax: View
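For the simpler RSS 0.91 case, a script that consumes a feed needs very little code. The Python sketch below embeds a small hypothetical 0.91 feed with 3 items (the site name, URLs and article titles are invented for illustration) and lists each item's title and link using the standard library.

```python
import xml.etree.ElementTree as ET

# A hypothetical RSS 0.91 feed; the channel and item contents are made up.
FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example site</title>
    <link>http://example.org/</link>
    <description>Recent articles</description>
    <language>en</language>
    <item>
      <title>First article</title>
      <link>http://example.org/articles/1</link>
      <description>Summary of the first article.</description>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.org/articles/2</link>
      <description>Summary of the second article.</description>
    </item>
    <item>
      <title>Third article</title>
      <link>http://example.org/articles/3</link>
      <description>Summary of the third article.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_text):
    """Return a (title, link) pair for every item in an RSS 0.91 channel."""
    channel = ET.fromstring(feed_text).find("channel")
    return [(item.findtext("title"), item.findtext("link"))
            for item in channel.findall("item")]


if __name__ == "__main__":
    for title, link in read_items(FEED):
        print(f"{title}: {link}")
```

RSS 1.0 feeds use namespaces and place the items outside the channel element, so this sketch would need adjusting before it could read them.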
- RSS Monkey - Perl script for processing RSS files and outputting HTML.
- JERSS - Java Servlet for displaying RSS feeds.
- RSS to ASP - ASP tools for converting RSS feeds to WML, HTML. Works with 0.92 or 1.0.
- phpRSS - a PHP class for RSS handling.
There is a lot more to RDF than is described in this tutorial; it merely serves as an introduction and enables you to create simple RDF documents that describe your sites and resources. RDF is very powerful, and if more people adopt it as a means of describing their sites then it should make the web a much easier place in which to navigate and share information.
Enterprise XML by Robert Standefer (ISBN 0-12-663355-X)
What is RDF?
W3C Resource Description Framework (RDF) activity page
W3C RDF FAQ
Dublin Core Element Set
RDF Site Summary 1.0
RSS Dev Center