Thursday, 05 May 2005 - Dare Obasanjo's weblog

May 5, 2005

@ 06:44 PM

On Replacing WSDL with Something Simpler

In a recent post entitled Replacing WSDL, Twice, Tim Bray writes:

Let's make three assumptions: First, that Web Services are important. Second, that to make Web Services useful, you need some sort of declaration mechanism. Third, that WSDL and WSDL 2, despite being the work of really smart people, are so complex and abstract that they have unacceptably poor ease-of-use. What then? Naturally, the mind turns to a smaller, simpler successor, sacrificing generality and eschewing abstraction; in exactly the same way that XML was a successor for SGML. Well anyhow, that's the direction my mind turned. So did Norm Walsh's; his proposal for NSDL also includes a helpful explanation of why Web-Service description is important. My sketch is called SMEX-D.

Although the XML geek in me wants this blog post to be a critical analysis of NSDL and SMEX-D, I think a more valuable discussion is questioning the premise behind Tim Bray's post. I agree with his first assumption; web services are important. I also agree with his third assumption that the various flavors of WSDL are too abstract and complex to be easily useful. More importantly, most of the interoperability problems in the XML Web Service space are the fault of WSDL and the XSD type system on which it depends.

However, I disagree with his second premise, that for Web Services to be useful you need some sort of declaration mechanism. To contradict by example, I can point to a large number of web services such as the Yahoo! Search web services, Flickr web services, del.icio.us web services, 43 Things web services, Bloglines web services and every web site that provides an RSS feed as examples of useful web services that don't use any declarative mechanism to describe themselves.

Before deciding to reinvent WSDL, people like Tim Bray and Norm Walsh should ask themselves what purpose a description language like WSDL serves. This is exactly what Mark Nottingham does in his post Questions Leading to a Web Description Format, where he writes:

A while back, I published a series of entries (1, 2, 3, 4) about would-be Web Description Formats, with the intent of figuring out which (if any) is suitable, or whether a new one is required.
...

As I said, I've talked about specific use cases for a Web description format before, but to recap, the big ones are:

Code Generation: It's very useful to start with a description of the site, and then code by filling in the blanks. In the Web services world, this is referred to as "coding to the contract" or "contract-first development," and it makes a lot of sense (although I think "contract" is needlessly legalistic, and misleading too; it implies a closed world, when in reality a description is very open, in that it's always subject to additional information being discovered).

A couple of ways that this might manifest is through stub and skeleton generation, and auto-completion and other whizz-bang hinting in tools. Wouldn't it be nice for Eclipse to give you a drop-down of the valid URIs that will give you a certain type when you're coding?

...

Dynamic Configuration: I've complained about the poor state of Web server configuration before, so I'll spare you a repeat of the full polemic. A proper description format would be one mechanism to allow more transparent configuration of servers, and better use of the Web (and HTTP).

...

Application Modeling and Visualisation: Finally, there's a considerable amount of value in having a standard representation (that's intentional, folks) of a site's layout and configuration; you can discuss it with peers, evolve it over time in a manner that's independent to the implementation, develop tools to manipulate and visualise it, and so forth.

Looking at this list of benefits of having a description language for web services, I don't see anything that leaps out to me as being a must-have. Being able to generate skeleton code from an XML description of a web service is nice but isn't a showstopper. And in the past, the assumptions baked into such toolkits have led to interoperability problems, so I'm not enthusiastic about code generation being a good justification for anything in this area.

The dynamic configuration bit is interesting, but it is unlikely that we will come up with a description language generic enough to unify formats as diverse as RDDL, RSD, P3P, WSDL, etc., that won't be either too complex or too simple to be useful.

As for application modelling and visualization, I'm not sure why we need an XML format for that. If someone decides to come up with a dialect of UML for describing web services, there is no reason for it to have an XML serialization format except for code generation, which, as previously stated, isn't such a great idea anyway.

For me the question isn't whether we should replace WSDL but rather whether we even needed it in the first place.

Categories: XML Web Services

May 5, 2005

@ 01:05 PM

Comments [12]

Top 5 Sites That Need To Get RSS Feeds

Every once in a while I find myself still having to check a few websites directly instead of reading them in my favorite RSS reader. The top 5 sites I still check by hand which I'd love to see get RSS feeds are

What is your list of sites that should have RSS feeds but don't? Maybe we can get together and sign a petition. :)

Categories: Mindless Link Propagation | Ramblings

May 3, 2005

@ 04:54 PM

Comments [12]

Is Applying Tagging to the File System a Good Idea or a Bad One?

Russell Beattie has a post entitled Spotlight Comments are the Perfect Spot for Tags! where he writes:

I read just about every sentence of the Ars Technica overview of OSX Tiger and learned a lot, especially the parts where the author drones on about OSX's support for meta-data in the filesystem. I originally thought the ability to add arbitrary meta data to any file or folder was an interesting capability, albeit not particularly useful in day-to-day activities. But then I was just playing around and saw the Spotlight Comments field that's now included at the very top of a file or folder's Info box and I grokked it! Now that there's actually an easy way to both add and to search for meta-data on files and folders, then there's actually a reason to put it in! But not just any meta-data... What's the newest and coolest type of meta-data out there? Yep, tags! And the comments fields is perfect for this!

Obviously nothing has changed in terms of the UI or search functionality, just the way I think about meta data. Before I may have ignored an arbitrary field like "comments" even if I could search on it (haven't I been able to do something similar in Windows?). But now that I "get" tagging, I know that this isn't the place for long-winded description of the file or folder, just keywords that I can use to refer to it later. Or if those files are shared on the network, others can use these tags to find the files as well. Fantastic!

This sounds like a classic example of "When you have a hammer, everything looks like a nail". One of the interesting things about the rush to embrace tagging by many folks is the refusal to look at the success of tagging in context. Specifically, how did successful systems like del.icio.us get around the Metacrap problem which plagues all attempts to create metadata systems? I see two aspects of the way del.icio.us applied tagging which I believe were key to it becoming a successful system.

Tagging is the only organizational mechanism: In del.icio.us, the only way to organize your data is to apply tags to it. This basically forces users to tag their data if they want to use the service; otherwise it quickly becomes difficult to find anything.
It's about the folksonomy: What really distinguishes services like del.icio.us from various bookmarks sites that have existed since I was in college is that users can browse each other's data based on their metadata. The fact that del.icio.us is about sharing data encourages users to bookmark sites more than they typically do and to apply tags to the data so that others may find the links.

Neither of the above applies when applying tags to files on your hard drive. My personal opinion is that applying tagging to the file system is taking an idea that works in one context and applying it to another without understanding why it worked in the first place.

Categories: Technology

May 3, 2005

@ 04:02 PM

Comments [4]

Explaining Why Old Posts on MSN Spaces are Marked as New in Bloglines

Over the past few weeks there have been a bunch of reports on internal mailing lists about problems with MSN Spaces RSS feeds and Bloglines. The specific problem is that every once in a while old posts containing photos are marked as being new in Bloglines. There have also been some complaints that indicate this problem manifests itself in Newsgator as well.

After some investigation we discovered that this problem seemed to only occur in RSS items containing links to photos hosted on our storage servers, such as blog posts with photo attachments or photo albums. This led to a hunch that this problem only affected RSS readers that mark old posts as new if any content in the <description> element changes. Once this was confirmed, we had our answer. For certain reasons, the permalink URL to an image stored on our storage servers changes over time*. Whenever one of these changes to the URLs of images takes place, RSS readers that detect changes to the content of the <description> element of a feed will indicate that the post has been altered.

A brief discussion with the folks behind Bloglines indicates that there isn't a straightforward solution to this problem. It is unlikely that they will change their RSS parsing code to deal with the idiosyncrasies of RSS feeds provided by MSN Spaces. Being the author of an RSS reader as well, I can understand not wanting to litter the code with special cases. Similarly, it is unlikely that we will be changing the behavior that causes URLs to images hosted on our servers to change in the short term.

After chatting with Mike and Jason about this, one of the solutions we came up with was to use the dcterms:modified element in our RSS feeds. The element would contain the date of the last user-directed change to the item, which in this case would be a blog post or photo album. This means that RSS readers can simply test the value of the dcterms:modified element to determine if a post was changed by the user instead of performing inefficient textual comparisons of the contents of the post. In fact, the main reason I don't provide support for detecting changes in RSS items in RSS Bandit is the high rate of false positives as well as slowdowns caused by performing lots of text comparisons. Having this element in RSS feeds would make it a lot easier for me to support detecting changes to the contents of items in an RSS feed without degrading the user experience in the general case.
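To make the difference between the two strategies concrete, here is a rough Python sketch of how a reader might prefer dcterms:modified and only fall back to comparing the description text when the element is missing. This is not how Bloglines, Newsgator or RSS Bandit actually work; the function names, the sample item and the example.com URLs are all made up for illustration. The only thing taken from a spec is the Dublin Core terms namespace URI.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace URI for Dublin Core terms (dcterms:modified lives here).
DCTERMS = "http://purl.org/dc/terms/"


def item_fingerprints(item_xml: str) -> dict:
    """Extract the values a reader could use to decide whether an item changed.

    Hypothetical helper: takes the XML of a single <item> element.
    """
    item = ET.fromstring(item_xml)
    description = item.findtext("description", default="")
    return {
        "guid": item.findtext("guid"),
        "modified": item.findtext(f"{{{DCTERMS}}}modified"),
        "description_hash": hashlib.sha1(description.encode("utf-8")).hexdigest(),
    }


def has_changed(old: dict, new: dict) -> bool:
    """Prefer the user-directed modification date; fall back to the text hash."""
    if old["modified"] is not None and new["modified"] is not None:
        return old["modified"] != new["modified"]
    return old["description_hash"] != new["description_hash"]


# A made-up item whose image URL changes while the post itself does not.
OLD_ITEM = """<item xmlns:dcterms="http://purl.org/dc/terms/">
  <guid>http://example.com/blog/1</guid>
  <description>&lt;img src="http://storage.example.com/a1/photo.jpg"/&gt;</description>
  <dcterms:modified>2005-04-01T10:00:00Z</dcterms:modified>
</item>"""
NEW_ITEM = OLD_ITEM.replace("/a1/", "/b2/")

old, new = item_fingerprints(OLD_ITEM), item_fingerprints(NEW_ITEM)
print("description differs:", old["description_hash"] != new["description_hash"])
print("reported as changed:", has_changed(old, new))
```

In this sketch the rewritten image URL changes the description hash, but because the modification date is unchanged the item is not reported as new.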

Of course, without RSS readers deciding to support the use of the dcterms:modified element in RSS feeds this will continue to be a problem. I need to send some mail to Mark Fletcher and the RSS-AggDev mailing list to see what people think about supporting this element as a way to get around the "bogus new items" problem.

* Note that this doesn't break links that reference that image with the old URL.

Categories: MSN | Syndication Technology

May 2, 2005

@ 07:33 PM

Comments [3]

Microsoft Careers -- Meet Our People: Dare

The Microsoft Careers website has a section entitled Meet Our People where profiles of employees from various divisions at Microsoft are highlighted. For whatever reason, I was one of the people picked this year and you can find my employee profile here.

The process was quite lightweight. I spoke to an interviewer on the phone for several minutes and then she sent me a transcript of key parts of our conversation which I got to edit or veto. After that there was the photo shoot and a couple of weeks later, voila.

Reading the profile, there is some stuff that stands out to me. I've already been here 3 years so I'm no longer a new guy, yet I still don't feel like part of the B0rg. There is also the point about management being interested in your career and you as a person.

A while ago Mike Torres and I gave a presentation to Brian Arbogast, which was my first presentation for anyone with Vice President in their job title. While I was setting up my laptop, Brian chatted with Mike about a post he'd read in Mike's blog about wanting to upgrade his home sound system. Brian also mentioned that he'd remembered seeing my name come up in a lengthy internal email thread where I was in a debate with Vic Gundotra and didn't back down; he commended me for sticking to my guns. That was cool.
After a recent all-hands meeting I disagreed with parts of the presentations made by David Cole and Steve Liffick, so I sent them some critical feedback. They not only received it well but after exchanging some mail, Steve suggested we meet in person to discuss things further. When I eventually met with Steve, he readily accepted my feedback so we spent most of the time talking about Social Software as the Platform of the Future. That was fun.
Our dev manager, Farookh Mohammed, is just as interested as I am in making sure we have something akin to an MSN Developer Network when we start shipping our APIs for accessing MSN Spaces, and we'll be working together to evangelize some ideas around this topic to upper management.
My boss's boss knows I'm fairly good at communicating my ideas in writing and that I'm passionate about social software so he's been encouraging me to produce a whitepaper that can be shared with various folks around work.
My boss, Mike Pacholec, has told me I need to attend more conferences and has suggested that I should plan to attend at least one more conference before the end of the year.

At the end of the profile it states that I have my dream job which is true. I get to work on software that I like using and which directly affects millions of users. All of this in a ship cycle that is measured in months instead of years. What's not to love?

Categories: Life in the B0rg Cube

May 2, 2005

@ 06:40 PM

Comments [1]

Happy Birthday, Dave

Happy 50th birthday to Dave Winer. He is one of the few people in our industry who can say that he has changed the world with the work he has done. From being one of the co-authors of the SOAP 1.1 specification and the author of the XML-RPC specification to being a key evangelist of the power of weblogging and RSS, Dave's impact has been felt across the world.

Dave and I have had our differences in the past (and probably still do now) but there are few people who I can point to in the software industry whose existence has been as much of a net positive to us all.

Categories: Mindless Link Propagation

May 2, 2005

@ 05:59 PM

Comments [0]

Classifying Users of Social Networking Applications

A bunch of folks from the Spaces team were in Asia recently talking to customers about how they used social software applications like instant messaging, blogging tools, email, social networking services, chat and dating sites. Recently Moz Hussain, who's one of the product managers for the Spaces team, provided some insights on current thinking about classifying users of social networking applications.

Below are excerpts from his blog post:

In a people centric world, I see two major dimensions of people interaction: who I want to know about and who I want to share information with. This leads to four distinct segments as shown below.

1) The Content Consumer

This group values their privacy but is voyeuristic in its desire to learn about others. One focus group participant explained how they like to compare thoughts and lifestyles of people in their social status and age bracket. They primarily use the Internet to search for information but rely on traditional communication methods for keeping in touch with close friends.

This group needs easier ways to find information, including user generated content, on a particular topic. A company that does this well in Japan is Livedoor with rich categorization and editorials on user generated content.

This group is often slightly older, but not exclusively so.

2) The Relationship Builder

This group is interested ONLY in their close circle of friends. They neither care about nor want to share information with strangers. We have seen much higher prevalence of this group in Europe than elsewhere.

Relationship Builders use a variety of online and offline communications tools to share private thoughts and memories with those close to them. This can include photos, opinions, what's going on in their lives. The reason can be to keep in touch, or just for fun with friends. In China, MSN Messenger is seen as a great product for this group. In Japan, MIXI is the leading web based service for this group.

As many social networking tools are new to this group, they would benefit from greater education on the scenarios that are applicable to them.

This group is often slightly older, but not exclusively so.

3) The Social Networker

This group enjoys meeting people, even strangers, online and interacting to kill time. They enjoy chat rooms, dating services and generally having multiple superficial relationships. It is not uncommon for this group to have more than 200 contacts in their Messenger contact list.

This group is often younger and accesses the Internet from net cafes or mobile phones, i.e. away from prying eyes of parents and room mates. They often use paid for content to enrich their entertainment experience.

Many early web based social networking products such as Friendster and Orkut effectively targeted this segment.

4) The Content Creator

This group is the classic "Maven". They consider themselves experts in a field (ranging from shopping to high tech) or have a desire to express their creativity publicly. They want to get their opinions heard. They use the Internet to research topics of interest, and then create blogs to write their take on the situation.

This group is also interested in rewards for their content and opinions. There is an opportunity to align this group to service provider interests with appropriate reward mechanisms.

This group is also slightly older and has a narrower range of feature interests.

These groups are present in every geography, though their relative size varies. The challenge now lies in addressing their needs and figuring out how to use each group to create a synergistic ecosystem of viewers and authors.

All great fun and why I love this job so much!

I think this classification of users of social networking services hits the mark. I also think it is quite cool that we are actually sharing this kind of information with the community of social software enthusiasts as opposed to keeping this as private market research. I wonder what the various folks on the Corante Many2Many blog would have to say about the above classification scheme.

I like the fact that this classification takes into account online social butterflies like Robert Scoble as well as people who simply want to use social software to enhance their existing real world relationships.

Putting the above data together with the Degrees of Kevin Bacon post by Mike Torres seems to imply that we are very interested in how people use social networking applications. I wonder why?

;)

Categories: MSN

April 28, 2005

@ 08:26 AM

Comments [1]

MSN Messenger and Hotmail blogs

I tend to talk about MSN Spaces a lot, which makes people think I work on that team when in truth I work with them as well as with the MSN Messenger and Hotmail folks. Although it's easy to find folks who work on Spaces by simply starting from Mike Torres's blog, it isn't so straightforward for Hotmail or Messenger. Below is a short list of the blogs by MSN communications services folks that I am aware of:

HOTMAIL BLOGS

MSN MESSENGER BLOGS

There are a bunch of other blogs by the folks on the various client and server teams but these four are the ones that talk most about the products they work on. In fact, if you browse the various blog rolls on their spaces and mine you'll notice that most of the MSN folks use their spaces for personal stuff not work blogging.

Categories: MSN

April 28, 2005

@ 08:06 AM

Comments [1]

Random MSN Spaces Stuff

Below is a grab bag of mildly interesting stuff that I happened to come across while browsing various blogs in the MSN Spaces universe

Today is Bloggers 101. I've already seen a bunch of Spaces blogs where folks have posted 101 facts about themselves. So far I've liked Deadites: My 101.
A couple of folks are trying to have an MSN Spaces block party. I don't know how successful it's going to be but this is the kind of thing folks used to use Meetup.com for. If they can't have a party I hope they at least get enough people to have a blogger dinner.
According to the PubSub linkcounts page for yesterday, the most linked domains in the blogosphere were http://storage.msn.com (367,042 links from 13,577 sites) and http://spaces.msn.com (30,727 links from 7,539 sites). Not only are we growing to be the biggest blog hosting service on the Web but it looks like we are becoming the most active community as well.

Categories: Mindless Link Propagation | MSN

April 27, 2005

@ 12:52 PM

Comments [10]

Adam Bosworth's Web of Data: Is RSS the only API your Website Needs?

Daniel Steinberg has an article entitled Bosworth's Web of Data where he discusses some of the ideas Adam Bosworth evangelized in his keynote at the MySQL Users Conference 2005. Daniel writes,

Bosworth explained that the key factors that enabled the web began with simplicity. HTTP was simple enough that any "P" language or JavaScript programmer could build applications. On the consumption side, web browsers such as Internet Explorer 4 were committed to rendering whatever they got. This meant that people could be sloppy and they didn't need to be high priests of syntax. Because it was a sloppy standard, people who otherwise couldn't have authored content did. The fact that it was a standard allowed this single, simple, sloppy, open wire format to run on every platform.
...
The challenge is to take a database and do for the web what was done for content. Bosworth explained that you "need a model that allows for massively linear scalability and federation of information that can spread effortlessly across a federated web."

Solutions that were suggested were to use XML and XQuery. The problem with XML is that unlike HTML, there is not a single grammar. This removed the simple and sloppy aspects of the web. The problem with XQuery is the time it took to finish the specification. Bosworth noted that it took more than four years and that "anything that takes four years is not worth doing. It is over-designed. Instead, take six months and learn from customers."
...
The next solution used web services, which began as an easy idea: you send an XML request and you get XML back. Instead, the collection of WS-* specs were huge and again, overly complicated. Bosworth said that this was a deliberate effort on the part of the companies that control the specs, like IBM and Microsoft, which deliberately made the specification hard, because then only they could deliver technology to do it.
...
Bosworth predicts that RSS 2.0 and Atom will be the lingua franca that will be used to consume all data from everywhere. These are simple formats that are sloppily extensible. Anyone who wants to can use these formats to consume content or to author content. Contrast this with the Semantic Web, which requires that you get a large group of people to agree on the schema of everything.

There are lots of interesting ideas here. I won't dwell on the criticisms of XQuery & WS-* mainly because I tend to agree that they are both overdesigned and complicated. I also won't dwell on the apparent contradiction inherent in claiming that the Semantic Web is doomed because it requires people to agree on the same schema for everything, then proposing that everyone agree on using RSS as the schema for all data on the Web. I have a suspicion of what he sees as the difference but I'll wait for a blog post from him clarifying that.

What I find very interesting is using RSS as the data access format for the Web. RSS gained popularity as a way to syndicate blog posts and news sites but it's turned out to be a lot more versatile than that. Sites like Feedster and Amazon's OpenSearch technology show you can use RSS as a mechanism for providing search results and integrating search engines respectively. Podcasting shows you can use RSS to syndicate digital media content instead of just plain old text or HTML. With Amazon's syndicated feeds one can keep abreast of when new CDs, books and more are released.

Over the weekend I wrote the MSN Spaces photo album browser page which displays slideshows of all the photos in the various albums on a particular user's MSN Spaces space. This page also can display the photos on a randomly selected space. This webpage is entirely powered by RSS. The photos are obtained from the RSS feed for the Space and the list of random spaces is obtained by querying MSN search with the query "site:spaces.msn.com photo album" and requesting the results as RSS. In fact, the information from the MSN Spaces RSS feeds is enough to build something like the Flickr related tags browser, where instead of showing related tags one could show spaces related to the user from the information in their blog roll which happens to also be provided in the RSS feed. Pretty nifty and all without requiring building a REST, SOAP or XML-RPC API.
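For readers wondering how little code "entirely powered by RSS" requires, below is a minimal Python sketch of pulling photo URLs out of a feed. It is not the code behind the photo album browser page, and I am assuming for illustration that photos show up either as image enclosures or as img tags inside the item description; the actual layout of the MSN Spaces feed may differ. The sample feed, the example.com URLs and the photo_urls function are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def photo_urls(rss_xml: str) -> list:
    """Return image URLs found in an RSS 2.0 document, in feed order.

    Assumption: photos are exposed as image enclosures and/or as <img>
    tags embedded in each item's <description>.
    """
    channel = ET.fromstring(rss_xml).find("channel")
    urls = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("type", "").startswith("image/"):
            urls.append(enclosure.get("url"))
        collector = _ImgCollector()
        collector.feed(item.findtext("description", default=""))
        urls.extend(collector.sources)
    return urls


# A made-up feed standing in for a real one fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example space</title>
  <item>
    <title>Beach album</title>
    <description>&lt;img src="http://storage.example.com/beach1.jpg" /&gt;</description>
  </item>
  <item>
    <title>Hiking photo</title>
    <enclosure url="http://storage.example.com/hike.jpg" type="image/jpeg" length="1024" />
  </item>
</channel></rss>"""

for url in photo_urls(SAMPLE_FEED):
    print(url)
```

A slideshow page would then just cycle through the returned URLs; fetching the feed itself is left out so the sketch runs without a network connection.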

In situations where one simply wants to expose read-only data via a service on the Web, it's looking like RSS is the technology to beat. As more and more information is exposed as RSS feeds, there will be even more interesting things people will be able to do with this technology. At Microsoft we definitely are gung ho about exposing as much data as possible via RSS and I've been amazed at how much enthusiasm there is around the opportunities in this area.

Side Note: Yesterday while at the Microsoft Research Social Computing Symposium I was chatting with Randy Farmer, who's one of the guys behind Yahoo! 360° and Yahoo's purchase of Flickr, and I mentioned that it seemed like 2003 was the year that RSS really started to take off. This was also the year that Dave Winer froze the RSS 2.0 spec and Sam Ruby gathered all the malcontents in the XML syndication space and gave them a shiny new toy to play with in Atom. Coincidence?

Categories: Syndication Technology | XML

Dare Obasanjo's weblog

"You can buy cars but you can't buy respect in the hood" - Curtis Jackson
