August 9, 2003
@ 12:58 AM
The first thing to remember is that XML is syntax, not semantics. Repeat it to yourself often. Confusing the two is the biggest mistake made by people who work with XML, even the supposedly experienced old hands. It leads to statements like "XML is a self-describing format" when in truth it is no such thing. "Self-describing" implies that the semantics of an XML document are self-contained and inherent in the document, a claim that is obviously bogus to anyone who has spent five minutes working with XML.
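To make that concrete, here is a minimal Python sketch. The two sample documents and the `describe` helper are made up for illustration. A generic parser reports exactly the same structure for both; nothing in the markup says that one `title` is the name of a book and the other is a person's honorific.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents with identical syntax but unrelated meanings.
BOOK = "<item><title>Moby Dick</title></item>"
PERSON = "<item><title>Dr.</title></item>"

def describe(xml_text):
    """Return the (root tag, child tags) shape that a generic parser can see."""
    root = ET.fromstring(xml_text)
    return root.tag, [child.tag for child in root]

# The parser sees the same shape for both documents. What "title" means
# lives in the heads of the people who wrote them, not in the XML.
print(describe(BOOK))
print(describe(PERSON))
print(describe(BOOK) == describe(PERSON))
```

Whatever distinguishes the two documents has to come from an application that already knows which vocabulary it is reading.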

One thing to point out is that there is currently no machine-readable way to specify what an XML vocabulary actually does. What do I mean? Well, there is no way for a random XML-aware processor to take a random XML document and automatically know what to do with it. There's no way for Joe Blow's XML processor to read in a W3C XML Schema document, an XSLT stylesheet or an RSS 2.0 feed and automatically become an XML validator, an XML transform engine or a news aggregator. Without this fundamental piece, which quite frankly I don't see being built anytime soon, the rest of the discussion that Jon Udell points to is a waste of time. What points do I mean?
Arguably I should get a life :-), but for me this remark was an epiphany. I've long suspected that we won't really understand what it means to mix XML namespaces until we do some large-scale experimentation. What I hadn't fully appreciated, until just now, is the deep connection between RDF and namespace-mixing. Dan's original hard-line position, he now explains, was that there is no sane way to mix namespaces without some higher-order model, and that RDF is that model. That he is now modulating that position, and saying that none of us yet knows whether or not that is true, strikes me as both intellectually honest and potentially a logjam-breaker.
RDF is definitely not that model. Here's my pair of Turing tests for when we have that model.
  1. I can take a vanilla XSLT processor and pass it a stylesheet with EXSLT extension elements which my XSLT processor automatically learns how to process as valid stylesheet instructions.

  2. I can take a vanilla W3C XML Schema processor and pass it a schema with embedded Schematron assertions which it automatically learns how to use to validate an input document in addition to using the W3C XML Schema rules.
Any model that cannot provide a solution to both problems listed above is NOT the model for mixing XML namespaced vocabularies. Given that, on the face of it, these problems seem more difficult to solve than a number of the problems the Artificial Intelligence folks claimed they could solve in the years preceding the AI Winter, I don't see why anyone who is just concerned with syndicating news feeds should be trying to solve such a significant problem as well.
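The first test can be pictured with a rough Python sketch. It is a toy and nothing like a real XSLT engine: the `HANDLERS` table, the `process` function and the sample stylesheet are all hypothetical. The point is only that a processor acts on an element because somebody wrote code for that element, so an EXSLT extension element it was never taught is just an unfamiliar name in an unfamiliar namespace.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"
EXSLT_NS = "http://exslt.org/common"

# Hypothetical handler table: each entry exists only because a programmer
# read the relevant spec and wrote code for that instruction.
HANDLERS = {
    f"{{{XSLT_NS}}}value-of": lambda el: f"evaluate {el.get('select')}",
}

# Illustrative stylesheet fragment mixing XSLT with an EXSLT extension element.
STYLESHEET = f"""
<xsl:template xmlns:xsl="{XSLT_NS}" xmlns:exsl="{EXSLT_NS}" match="/">
  <xsl:value-of select="title"/>
  <exsl:document href="out.xml"/>
</xsl:template>
""".strip()

def process(xml_text):
    """Run each child instruction through whichever handler was hand-coded."""
    results = []
    for el in ET.fromstring(xml_text):
        handler = HANDLERS.get(el.tag)
        if handler is None:
            # The namespace URI identifies the vocabulary; it does not
            # teach the processor what the vocabulary means.
            results.append(f"no idea what to do with {el.tag}")
        else:
            results.append(handler(el))
    return results

for line in process(STYLESHEET):
    print(line)
```

The only way to get the second instruction handled in this sketch is for a person to add another entry to `HANDLERS`, which is exactly the step the Turing test asks the processor to perform on its own.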

Discussing this problem in the context of syndicating website news is not only a waste of time but actually derails the process.

Disclaimer: The above comments do not represent the thoughts, intentions, plans or strategies of my employer. They are solely my opinion.