Daniel Steinberg has an article entitled Bosworth's Web of Data where he discusses some of the ideas Adam Bosworth evangelized in his keynote at the MySQL Users Conference 2005. Daniel writes,

Bosworth explained that the key factors that enabled the web began with simplicity. HTTP was simple enough that any "P" language or JavaScript programmer could build applications. On the consumption side, web browsers such as Internet Explorer 4 were committed to rendering whatever they got. This meant that people could be sloppy and they didn't need to be high priests of syntax. Because it was a sloppy standard, people who otherwise couldn't have authored content did. The fact that it was a standard allowed this single, simple, sloppy, open wire format to run on every platform.
...
The challenge is to take a database and do for the web what was done for content. Bosworth explained that you "need a model that allows for massively linear scalability and federation of information that can spread effortlessly across a federated web."

Solutions that were suggested were to use XML and XQuery. The problem with XML is that unlike HTML, there is not a single grammar. This removed the simple and sloppy aspects of the web. The problem with XQuery is the time it took to finish the specification. Bosworth noted that it took more than four years and that "anything that takes four years is not worth doing. It is over-designed. Instead, take six months and learn from customers."
...
The next solution used web services, which began as an easy idea: you send an XML request and you get XML back. Instead, the collection of WS-* specs were huge and again, overly complicated. Bosworth said that this was a deliberate effort on the part of the companies that control the specs, like IBM and Microsoft, which deliberately made the specification hard, because then only they could deliver technology to do it.
...
Bosworth predicts that RSS 2.0 and Atom will be the lingua franca that will be used to consume all data from everywhere. These are simple formats that are sloppily extensible. Anyone who wants to can use these formats to consume content or to author content. Contrast this with the Semantic Web, which requires that you get a large group of people to agree on the schema of everything.

There are lots of interesting ideas here. I won't dwell on the criticisms of XQuery & WS-*, mainly because I tend to agree that they are both overdesigned and complicated. I also won't dwell on the apparent contradiction inherent in claiming that the Semantic Web is doomed because it requires people to agree on the same schema for everything, then proposing that everyone agree on using RSS as the schema for all data on the Web. I have a suspicion of what he sees as the difference but I'll wait for a blog post from him clarifying that.

What I find very interesting is using RSS as the data access format for the Web. RSS gained popularity as a way to syndicate blog posts and news sites but it's turned out to be a lot more versatile than that. Sites like Feedster and Amazon's OpenSearch technology show you can use RSS as a mechanism for providing search results and integrating search engines respectively. Podcasting shows you can use RSS to syndicate digital media content instead of just plain old text or HTML. With Amazon's syndicated feeds one can keep abreast of when new CDs, books and more are released.

Over the weekend I wrote the MSN Spaces photo album browser page which displays slideshows of all the photos in the various albums on a particular user's MSN Spaces space. This page also can display the photos on a randomly selected space. This webpage is entirely powered by RSS. The photos are obtained from the RSS feed for the Space and the list of random spaces is obtained by querying MSN search with the query "site:spaces.msn.com photo album" and requesting the results as RSS. In fact, the information from the MSN Spaces RSS feeds is enough to build something like the Flickr related tags browser, where instead of showing related tags one could show spaces related to the user from the information in their blog roll which happens to also be provided in the RSS feed. Pretty nifty and all without requiring building a REST, SOAP or XML-RPC API.
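As a rough illustration of how a page like this could pull photos out of a feed, here is a minimal Python sketch. It assumes photos are attached to items as RSS 2.0 `<enclosure>` elements with an image MIME type. The actual layout of the MSN Spaces feed isn't shown in this post, so the sample feed, the example.com URLs and the `photo_urls` helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. A real feed would be fetched over HTTP;
# it is inlined here so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example space</title>
    <link>http://example.com/space</link>
    <description>Sample photo album feed</description>
    <item>
      <title>Beach</title>
      <enclosure url="http://example.com/photos/beach.jpg"
                 length="1000" type="image/jpeg"/>
    </item>
    <item>
      <title>Audio note</title>
      <enclosure url="http://example.com/audio/note.mp3"
                 length="2000" type="audio/mpeg"/>
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def photo_urls(feed_xml):
    """Return (item title, image URL) pairs for items with an image enclosure."""
    root = ET.fromstring(feed_xml)
    photos = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # plain text post, nothing attached
        # Keep only enclosures whose declared MIME type is an image.
        if enclosure.get("type", "").startswith("image/"):
            title = item.findtext("title", default="")
            photos.append((title, enclosure.get("url")))
    return photos


for title, url in photo_urls(SAMPLE_FEED):
    print(title, url)
```

The same loop would work on search results returned as RSS, since each result is just another `<item>`; only the elements you read off each item would change.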

In situations where one simply wants to expose read-only data via a service on the Web, it's looking like RSS is the technology to beat. As more and more information is exposed as RSS feeds, there will be even more interesting things people will be able to do with this technology. At Microsoft we definitely are gung ho about exposing as much data as possible via RSS and I've been amazed at how much enthusiasm there is around the opportunities in this area.
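To make the "expose read-only data as RSS" idea concrete, here is a minimal Python sketch that turns a list of records into an RSS 2.0 document using only the standard library. The `build_feed` function, the record fields (`title`, `link`, `summary`) and the example.com URLs are assumptions invented for the example, not any particular site's implementation.

```python
import xml.etree.ElementTree as ET


def build_feed(title, link, description, records):
    """Serialize records as an RSS 2.0 feed and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for record in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "link").text = record["link"]
        ET.SubElement(item, "description").text = record["summary"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical read-only data, e.g. rows from a "new releases" query.
NEW_RELEASES = [
    {"title": "First album", "link": "http://example.com/releases/1",
     "summary": "Released this week."},
    {"title": "Second album", "link": "http://example.com/releases/2",
     "summary": "Released last week."},
]

feed_xml = build_feed(
    "New releases",
    "http://example.com/releases",
    "Recently released items",
    NEW_RELEASES,
)
print(feed_xml)
```

Any aggregator or script that already understands RSS could then consume this data without a custom REST, SOAP or XML-RPC client.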

Side Note: Yesterday while at the Microsoft Research Social Computing Symposium I was chatting with Randy Farmer, who's one of the guys behind Yahoo! 360° and Yahoo's purchase of Flickr, and I mentioned that it seemed like 2003 was the year that RSS really started to take off. This was also the year that Dave Winer froze the RSS 2.0 spec and Sam Ruby gathered all the malcontents in the XML syndication space and gave them a shiny new toy to play with in Atom. Coincidence?


 

Wednesday, 27 April 2005 14:11:39 (GMT Daylight Time, UTC+01:00)
I've actually been thinking that RSS is emerging as the standard "noun" of the web for a while. The way I see it:

HTTP methods are the verbs. (REST)
RSS data are the nouns.
Proxies and other intermediaries (SOAP) are the adverbs.

Winter
Wednesday, 27 April 2005 22:11:36 (GMT Daylight Time, UTC+01:00)
But, what kind of typing/structure does RSS allow? Or is there a future in RSS whereby XML structure will be embedded in an RSS 'document'?
Anupam
Thursday, 28 April 2005 20:54:25 (GMT Daylight Time, UTC+01:00)
I didn't understand a word that Anupam just said.
pb
Friday, 29 April 2005 02:40:16 (GMT Daylight Time, UTC+01:00)
I think it's absolutely no coincidence that the Atom effort started, and then the RSS spec got cleaned up, frozen, transferred to a neutral party, and a number of clarifications were made to the Metaweblog API. Could be the best thing that ever happened to both formats.
Friday, 29 April 2005 04:44:30 (GMT Daylight Time, UTC+01:00)
I'm writing an app that does exactly this, except I'm using Atom as my format of choice.
James
Friday, 29 April 2005 15:16:36 (GMT Daylight Time, UTC+01:00)
pb:

My comment was whether RSS will go beyond the current structure of Title, Description etc. For example, can I serialize a language (C++/Java/C#) object and send it over as an RSS feed? Or can I take an Excel spreadsheet and send it over an RSS feed?

Will the description contain an XML representation/schema of the Excel spreadsheet or C++ object?

Anupam
Friday, 29 April 2005 15:50:33 (GMT Daylight Time, UTC+01:00)
Anupam,
RSS is extensible, which is why some of the unorthodox uses of it have arisen already. Secondly, exchanging C++/Java/C# objects over the Web is a bad idea since these are implementation details of the server; instead you should concern yourself with what kind of data you want to pass around.

Try it, it's a very liberating way of thinking about building distributed applications.
Friday, 29 April 2005 19:10:00 (GMT Daylight Time, UTC+01:00)
I agree C++/JAVA/C# objects was a bad example. But what about things like excel spreadsheets which can be exported as an XML document?

A better example would be craigslist postings? Currently, folks are able to hack applications (http://www.paulrademacher.com/housing/) by looking at RSS feeds from Craigslist. I bet this guy had to do some parsing of the RSS feed to extract the address, price etc.

But, is there a future where craigslist or other websites will make their feeds available with tags that tell me the address, price rather than letting others hack it out of the feed?

Anupam
Friday, 29 April 2005 21:05:15 (GMT Daylight Time, UTC+01:00)
"But what about things like excel spreadsheets which can be exported as an XML document?"

Anupam, what goes for programming languages goes for Excel - implementation detail. XML documents of Excel spreadsheets are no more a panacea than XML documents of Java objects are.

"But, is there a future where craigslist or other websites will make their feeds available with tags that tell me the address, price rather than letting others hack it out of the feed? "

Yes, but it's a while away. All the progress in the last few years has been in transportation and packaging, not what's inside the packages, what you or I might call the 'content'. Chipping away at that content is the next step.
Friday, 29 April 2005 21:16:40 (GMT Daylight Time, UTC+01:00)
"I mentioned that it seemed like 2003 was the year that RSS really started to take off. This was also the year that Dave Winer froze the RSS 2.0 spec and Sam Ruby gathered all the malcontents in the XML syndication space and gave them a shiny new toy to play with in Atom. Coincidence?"

Hmm, what was that you were saying about great power and great responsibility a while back?

No, methinks it's a coincidence. If anything, the rise of RSS is related to the descent of SOAP.

Comments are closed.