I'm slowly working towards the goal of making RSS Bandit a desktop RSS client for Google Reader, NewsGator Online and Exchange (via the Windows RSS platform). Today I made some progress integrating with the Windows RSS platform but, as with any integration story, there is some good news and some bad news. The good news can be seen in the screenshot below.

RSS Bandit and Internet Explorer sharing the same feed list

The good news is that for the most part the core application has been refactored to be able to transparently support loading feeds from sources such as the Windows RSS platform or from the RSS Bandit feed cache. It should take one or two more weekends and I can move on to adding similar support for synchronizing feeds from Google Reader.

The bad news is that using the Windows RSS platform has been a painful exercise. My main problem was that, for some reason I couldn't fathom, I couldn't receive events from the Windows RSS platform. I could write the same code and receive events from a standalone program, but the event handlers weren't triggered when the exact same code was running in RSS Bandit. That turned out to have been due to a stupid oversight. With that figured out, we're about 80% done with integration with the Windows RSS platform. There are lots of smaller issues too, such as the fact that there is no event that indicates an enclosure has finished downloading, although the documentation seems to imply that FeedDownloadCompleted does double duty. Then there are the various exceptions that can occur when accessing properties of a feed, including a BadImageFormatException for accessing IFeed.Title if the underlying feed file has been corrupted somehow, or a COMException complaining "Element not found" if you access IFeed.DownloadUrl before you've attempted to download the feed.

I've used up my budget of free time for coding this weekend so I'll start up again next weekend. In the meantime, if you have any tips on working with the Windows RSS platform from C#, don't hesitate to share.

 Now Playing: Bone Thugs 'N Harmony - No Surrender


 

Categories: RSS Bandit

Sam Ruby has an insightful response to Joe Gregorio in his post APP Level Patch where he writes

Joe Gregorio: At Google we are considering using PATCH. One of the big open questions surrounding that decision is XML patch formats. What have you found for patch formats and associated libraries?

I believe that looking for an XML patch format is looking for a solution at the wrong meta level.  Two examples, using AtomPub:

  • In Atom, the order of elements in an entry is not significant.  AtomPub servers often do not store their data in XML serialized form, or even in DOM form.  If you PUT an entry, and then send a PATCH based on the original serialization, it may not be understood.
  • A lot of data in this world is either not in XML, or if it is in XML, is simply there via tunneling.  Atom elements are often merely thin wrappers around HTML.  HTML has a DOM, and can be flattened into a sequence of SAX like events, just like XML can be.

I totally agree with Sam. A generic “XML patch format” is totally the wrong solution. At Microsoft we had several different XML patch formats produced by the same organization because each targeted a different scenario:

  • Diffgram: Represent a relational database table and changes to it as XML.
  • UpdateGram: Represent changes to an XML view of one or more relational database tables optionally including a mapping from relational <-> XML data
  • Patchgram: Represent infoset level differences between two XML documents

Of course, these are one-line summaries but you get the point. Depending on your constraints, you’ll end up with a different set of requirements. Quick test: tell me why one would choose Patchgrams over XUpdate and vice versa?

Given the broad set of constraints that will exist in different server implementations of the Atom Publishing Protocol, a generic XML patch format will have lots of features which just don’t make sense (e.g. XUpdate can create processing instructions, Patchgrams use document ordered positions of nodes for matching).
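To make the point about position-based matching concrete, here is a minimal Python sketch. The two sample entries and the helpers `as_field_map` and `child_at` are made up for illustration; they are not taken from Patchgrams, XUpdate or any AtomPub server. It shows the same entry serialized with its children in two different orders, which Atom allows, and what happens to a patch that addresses nodes by document position.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

# What the client originally PUT.
client_copy = f"""<entry xmlns="{ATOM}">
  <title>First post</title>
  <id>urn:example:1</id>
  <content>Some text.</content>
</entry>"""

# What a server backed by, say, a database might hand back: same data,
# different child order (legal, since element order is not significant in Atom).
server_copy = f"""<entry xmlns="{ATOM}">
  <id>urn:example:1</id>
  <content>Some text.</content>
  <title>First post</title>
</entry>"""


def as_field_map(xml_text):
    """Hypothetical helper: view an entry as {element name: text}."""
    return {child.tag: child.text for child in ET.fromstring(xml_text)}


def child_at(xml_text, position):
    """Positional lookup, the way a document-order patch would address a node."""
    return ET.fromstring(xml_text)[position]


# As far as the data goes, the two copies are the same entry...
same_data = as_field_map(client_copy) == as_field_map(server_copy)

# ...but "patch child #0" addresses the title in one copy and the id in the other.
client_target = child_at(client_copy, 0).tag
server_target = child_at(server_copy, 0).tag

print(same_data)
print(client_target == server_target)
```

A patch computed against the client's serialization would land on the wrong element when applied to the server's copy, which is the kind of feature mismatch a generic format drags along.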

If you decide you really need a patch format for Atom documents, your best bet is working with the community to define one or more which are specific to the unique constraints of the Atom syndication format instead of hoping that there is a generic XML patch format out there you can shoehorn into a solution. In the words of Joe Gregorio’s former co-worker, “I make it fit!”.

Personally, I think you’ll still end up with so many different requirements (Atom stores backed by actual text documents will have different concerns from those backed by relational databases) and spottiness in supporting the capability that you are best off just walking away from this problem by fixing your data model. As I said before, if you have sub-resources which you think should be individually editable then give them a URI and make them resources as well complete with their own atom:entry element.  

Now playing: Oomp Camp - Time To Throw A Chair


 

Categories: XML Web Services

February 23, 2008
@ 04:00 AM

About five years ago, I was pretty active on the XML-DEV mailing list. One of the discussions that cropped up every couple of weeks (aka permathreads) was whether markup languages could be successful if they were not simple enough that a relatively inexperienced developer could “View Source” and figure out how to author documents in that format. HTML (and to a lesser extent RSS) are examples of the success of the “View Source” principle. Danny Ayers had a classic post on the subject titled The Legend of View ‘Source’ which is excerpted below

Q: How do people learn markup?
A: 'View Source'.

This notion is one of the big guns that gets wheeled out in many permathreads - 'binary XML', 'RDF, bad' perhaps even 'XML Schema, too complicated'. To a lot of people it's the show stopper, the argument that can never be defeated. Not being able to view source is the reason format X died; being able to view source is the reason for format Y's success.

But I'm beginning to wonder if this argument really holds water any more. Don't get me wrong, I'm sure it certainly used to be the case, that many people here got their initial momentum into XML by looking at that there text. I'm also sure that being able to view existing source can be a great aid in learning a markup language. What I'm questioning is whether the actual practice of 'View Source' really is so widespread these days, and more importantly whether it offers such benefits for it to be a major factor in language decisions. I'd be happy with the answer to : are people really using 'View Source' that much? I hear it a lot, yet see little evidence.

One last point, I think we should be clear about what is and what isn't 'View Source'. If I need an XSLT stylesheet the first thing I'll do is open an existing stylesheet and copy and paste half of it. Then I'll get Michael's reference off the shelf.  I bet a fair few folks here have the bare-bones HTML 3.2 document etched into their lower cortex. But I'd argue that nothing is actually gained from 'View Source' in this, all it is is templating, the fact that it's a text format isn't of immediate relevance.

The mistake Danny made in his post was taking the arguments in favor of “View Source” literally. In hindsight, I think the key point of the “View Source” clan was that it is clear that there is a lot of cargo cult programming that goes on in the world of Web development. Whether it is directly via using the View Source feature of popular Web browsers or simply cutting and pasting code they find at places like QuirksMode, A List Apart and W3Schools, the fact is that lots of people building Web pages and syndication feeds are using technology and techniques they barely understand on a daily basis.

Back in the days when this debate came up, the existence of these markup cargo cults was celebrated because it meant that the ability to author content on the Web was available to the masses, which is still the case today (Yaaay, MySpace ;) ). However, there have been a number of downsides to the wide adoption of [X]HTML, CSS and other Web authoring technologies by large numbers of semi-knowledgeable developers and technologically challenged content authors.

One of these negative side effects has been discussed to death in a number of places including the article Beyond DOCTYPE: Web Standards, Forward Compatibility, and IE8 by Aaron Gustafson which is excerpted below

The DOCTYPE switch is broken

Back in 1998, Todd Fahrner came up with a toggle that would allow a browser to offer two rendering modes: one for developers wishing to follow standards, and another for everyone else. The concept was brilliantly simple. When the user agent encountered a document with a well-formed DOCTYPE declaration of a current HTML standard (i.e. HTML 2.0 wouldn’t cut it), it would assume that the author knew what she was doing and render the page in “standards” mode (laying out elements using the W3C’s box model). But when no DOCTYPE or a malformed DOCTYPE was encountered, the document would be rendered in “quirks” mode, i.e., laying out elements using the non-standard box model of IE5.x/Windows.

Unfortunately, two key factors, working in concert, have made the DOCTYPE unsustainable as a switch for standards mode:

  1. egged on by A List Apart and The Web Standards Project, well-intentioned developers of authoring tools began inserting valid, complete DOCTYPEs into the markup their tools generated; and
  2. IE6’s rendering behavior was not updated for five years, leading many developers to assume its rendering was both accurate and unlikely to change.

Together, these two circumstances have undermined the DOCTYPE switch because it had one fatal flaw: it assumed that the use of a valid DOCTYPE meant that you knew what you were doing when it came to web standards, and that you wanted the most accurate rendering possible. How do we know that it failed? When IE 7 hit the streets, sites broke.

Sure, as Roger pointed out, some of those sites were using IE-6-specific CSS hacks (often begrudgingly, and with no choice). But most suffered because their developers only checked their pages in IE6 —or only needed to concern themselves with how the site looked in IE6, because they were deploying sites within a homogeneous browserscape (e.g. a company intranet). Now sure, you could just shrug it off and say that since IE6’s inaccuracies were well-documented, these developers should have known better, but you would be ignoring the fact that many developers never explicitly opted into “standards mode,” or even knew that such a mode existed.

This seems like an intractable problem to me. If you ship a version of your software that is more standards compliant than previous versions, you run the risk of breaking applications or content that worked in previous versions. This reminds me of Windows Vista getting the blame because Facebook had a broken IPv6 record. The fact is that the application can claim it is more standards compliant but that is meaningless if users can no longer access their data or visit their favorite sites. In addition, putting the onus on Web developers and content authors to always write standards compliant code is impossible given the acknowledged low level of expertise of said Web content authors. It would seem that this actually causes a lot of pressure to always be backwards (or is that bugwards) compatible. I definitely wouldn’t want to be in the Internet Explorer team’s shoes these days.

It puts an interesting wrinkle on the exhortations to make markup languages friendly to “View Source”, doesn’t it?

Now playing: Green Day - Welcome To Paradise


 

Categories: Web Development

From the press release entitled Microsoft Makes Strategic Changes in Technology and Business Practices to Expand Interoperability we learn

REDMOND, Wash. — Feb. 21, 2008 — Microsoft Corp. today announced a set of broad-reaching changes to its technology and business practices to increase the openness of its products and drive greater interoperability, opportunity and choice for developers, partners, customers and competitors.

Specifically, Microsoft is implementing four new interoperability principles and corresponding actions across its high-volume business products: (1) ensuring open connections; (2) promoting data portability; (3) enhancing support for industry standards; and (4) fostering more open engagement with customers and the industry, including open source communities.
...
The interoperability principles and actions announced today apply to the following high-volume Microsoft products: Windows Vista (including the .NET Framework), Windows Server 2008, SQL Server 2008, Office 2007, Exchange Server 2007, and Office SharePoint Server 2007, and future versions of all these products. Highlights of the specific actions Microsoft is taking to implement its new interoperability principles are described below.

  • Ensuring open connections to Microsoft’s high-volume products.
  • Documenting how Microsoft supports industry standards and extensions.
  • Enhancing Office 2007 to provide greater flexibility of document formats.
  • Launching the Open Source Interoperability Initiative.
  • Expanding industry outreach and dialogue.

More information can be found on the Microsoft Interoperability page. Nice job, ROzzie and SteveB.

Now playing: Timbaland - Apologize (Feat. One Republic)


 

Categories: Life in the B0rg Cube

One of the biggest problems with the Facebook user experience today is the amount of spam from applications that are trying to leverage its social networks to "grow virally". For this reason, it is unsurprising to read the blog post from Paul Jeffries on the Facebook blog entitled Application Spam where he writes

We've been working on several improvements to prevent this and other abuses by applications. We'll continue to make changes, but wanted to share some of what's new:

  • When you get a request from an application, you now have the ability to "Block Application" directly from the request. If you block an application, it will not be able to send you any more requests.
  • A few weeks ago, we added the ability to "Clear All" requests from your requests page when you have a lot of requests and invitations that you haven't responded to yet.
  • Your feedback now determines how many communications an application can send. When invitations and notifications are ignored, blocked, or marked as spam, Facebook reduces that application's ability to send more. Applications forcing their users to send spammy invitations can wind up with no invitations at all. The power is in your hands; block applications that are bothering you, and report spammy or abusive communications, and we'll restrict the application.
  • We've explicitly told developers they cannot dead-end you in an "Invite your Friends" loop. If you are trapped by an application, look for a link to report that "This application is forcing me to invite friends". Your reports will help us stop this behavior.
  • We've added an option to the Edit Applications page that allows you to opt-out of emails sent from applications you've already added. When you add a new application, you can uncheck this option right away.

A lot of these are fairly obvious restrictions that put users back in control of their experience. I'm quite surprised that it took so long to add a "Block Application" feature. I can understand that Facebook didn't want to piss off developers on their platform but app spam has become a huge negative aspect of using Facebook. About two months ago, I wrote a blog post entitled Facebook: Placing Needs of Developers Over Needs of Users where I pointed out the Facebook group This has got to stop (POINTLESS FACEBOOK APPLICATIONS ARE RUINING FACEBOOK). At the time of posting that entry, the group had 167,186 members.

This morning, the group has 480,176 members. That's almost half a million people who have indicated that app spam on the site is something they despise. It is amazing that Facebook has let this problem fester for so long given how important keeping their user base engaged and happy with the site is to their bottom line.

Now Playing: Lil' Scrappy feat. Paul Wall - Hustle Man


 

Categories: Social Software

Via Sam Ruby's post Embrace, Extend then Innovate I found a link to Joe Gregorio's post entitled How to do RESTful Partial Updates. Joe's post is a recommendation of how to extend the Atom Publishing Protocol (RFC 5023) to support updating the properties of an entry without having to replace the entire entry. Given that Joe works for Google on GData, I have assumed that Joe's post is Google's attempt to float a trial balloon before extending AtomPub in this way. This is a more community-centric approach than the company has previously taken with GData, OpenSocial, etc., where these protocols simply appeared out of nowhere with proprietary extensions to AtomPub and an FYI to the community after the fact.

The Problem Statement

In the Atom Publishing Protocol, an atom:entry represents an editable resource. When editing that resource, it is intended that an AtomPub client should download the entire entry, edit the fields it needs to change and then use a conditional PUT request to upload the changed entry.

So what's the problem? Below is an example of the results one could get from invoking the users.getInfo method in the Facebook REST API.


   <user>
     <uid>8055</uid>
     <about_me>This field perpetuates the glorification of the ego.  Also, it has a character limit.</about_me>
     <activities>Here: facebook, etc. There: Glee Club, a capella, teaching.</activities>
     <birthday>November 3</birthday>
     <books>The Brothers K, GEB, Ken Wilber, Zen and the Art, Fitzgerald, The Emporer's New Mind, The Wonderful Story of Henry Sugar</books>
     <current_location>
       <city>Palo Alto</city>
       <state>CA</state>
       <country>United States</country>
       <zip>94303</zip>
     </current_location>
     <first_name>Dave</first_name>
     <interests>coffee, computers, the funny, architecture, code breaking,snowboarding, philosophy, soccer, talking to strangers</interests>
     <last_name>Fetterman</last_name>
     <movies>Tommy Boy, Billy Madison, Fight Club, Dirty Work, Meet the Parents, My Blue Heaven, Office Space </movies>
     <music>New Found Glory, Daft Punk, Weezer, The Crystal Method, Rage, the KLF, Green Day, Live, Coldplay, Panic at the Disco, Family Force 5</music>
     <name>Dave Fetterman</name>
     <profile_update_time>1170414620</profile_update_time>
     <relationship_status>In a Relationship</relationship_status>
     <religion/>
     <sex>male</sex>
     <significant_other_id xsi:nil="true"/>
     <status>
       <message>Pirates of the Carribean was an awful movie!!!</message>
     </status>
   </user>

If this user was represented as an atom:entry then each time an application wants to edit the user's status message it needs to download the entire data for the user with its over two dozen fields, change the status message in an in-memory representation of the XML document and then upload the entire user atom:entry back to the server.  This is a fairly expensive way to change a status message compared to how this is approached in other RESTful protocols (e.g. PROPPATCH in WebDAV).
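To put a rough number on that, here is a small Python sketch of the download, edit and upload cycle. It uses a trimmed-down copy of the record above, and `update_status` is a hypothetical helper of mine, not part of any Facebook or AtomPub library.

```python
import xml.etree.ElementTree as ET

# Trimmed-down copy of the user record above (most fields omitted for brevity).
USER_XML = """<user>
  <uid>8055</uid>
  <first_name>Dave</first_name>
  <last_name>Fetterman</last_name>
  <music>New Found Glory, Daft Punk, Weezer, The Crystal Method</music>
  <status>
    <message>Pirates of the Carribean was an awful movie!!!</message>
  </status>
</user>"""


def update_status(full_document, new_message):
    """Whole-document edit: parse everything, change one field, serialize everything."""
    user = ET.fromstring(full_document)
    user.find("status/message").text = new_message
    return ET.tostring(user, encoding="unicode")


new_message = "Writing a blog post"
upload_body = update_status(USER_XML, new_message)

# Bytes that cross the wire (GET response plus PUT body) versus bytes that changed.
bytes_transferred = len(USER_XML.encode("utf-8")) + len(upload_body.encode("utf-8"))
bytes_changed = len(new_message.encode("utf-8"))

print(f"transferred {bytes_transferred} bytes to change {bytes_changed}")
```

Even with most of the fields stripped out, the bytes moved dwarf the bytes that changed, and every untouched field has to survive the round trip intact.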

Previous Discussions on this Topic: When the Shoe is on the Other Foot

A few months ago I brought up this issue as one of the problems encountered when using the Atom Publishing Protocol outside of blog editing contexts in my post Why GData/APP Fails as a General Purpose Editing Protocol for the Web. In that post I wrote

Lack of support for granular updates to fields of an item: As mentioned in the previous section editing an entry requires replacing the old entry with a new one. The expected client interaction with the server is described in section 5.4 of the current APP draft and is excerpted below.

Retrieving a Resource

Client                                     Server
  |                                           |
  |  1.) GET to Member URI                    |
  |------------------------------------------>|
  |                                           |
  |  2.) 200 Ok                               |
  |      Member Representation                |
  |<------------------------------------------|
  |                                           |

  1. The client sends a GET request to the URI of a Member Resource to retrieve its representation.
  2. The server responds with the representation of the Member Resource.

Editing a Resource

Client                                     Server
  |                                           |
  |  1.) PUT to Member URI                    |
  |      Member Representation                |
  |------------------------------------------>|
  |                                           |
  |  2.) 200 OK                               |
  |<------------------------------------------|

  1. The client sends a PUT request to store a representation of a Member Resource.
  2. If the request is successful, the server responds with a status code of 200.

Can anyone spot what's wrong with this interaction? The first problem is a minor one that may prove problematic in certain cases. The problem is pointed out in the note in the documentation on Updating posts on Google Blogger via GData which states

IMPORTANT! To ensure forward compatibility, be sure that when you POST an updated entry you preserve all the XML that was present when you retrieved the entry from Blogger. Otherwise, when we implement new stuff and include <new-awesome-feature> elements in the feed, your client won't return them and your users will miss out! The Google data API client libraries all handle this correctly, so if you're using one of the libraries you're all set.

Thus each client is responsible for ensuring that it doesn't lose any XML that was in the original atom:entry element it downloaded. The second problem is more serious and should be of concern to anyone who's read Editing the Web: Detecting the Lost Update Problem Using Unreserved Checkout. The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.
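Here is a minimal Python sketch of that lost update, and of how a conditional PUT guards against it. The `EntryStore` class, its string "entries" and its hash-based entity tag are stand-ins I made up for illustration; a real server would speak HTTP and use `ETag` and `If-Match` headers.

```python
import hashlib


class EntryStore:
    """Toy in-memory stand-in for an AtomPub server holding a single entry."""

    def __init__(self, body):
        self.body = body

    def etag(self):
        # Assumption: the entity tag is simply a hash of the current body.
        return hashlib.sha256(self.body.encode("utf-8")).hexdigest()

    def get(self):
        return self.etag(), self.body

    def put(self, body, if_match=None):
        """Return an HTTP-style status code; 412 means the precondition failed."""
        if if_match is not None and if_match != self.etag():
            return 412
        self.body = body
        return 200


# Unconditional PUT: the second writer silently wipes out the first writer's change.
store = EntryStore("title=Hello;content=v1")
_, copy_a = store.get()
_, copy_b = store.get()
store.put(copy_a.replace("Hello", "Hello, world"))
store.put(copy_b.replace("v1", "v2"))
lost_update_body = store.body  # the title edit is gone

# Conditional PUT: the stale writer is rejected and has to re-fetch and retry.
store = EntryStore("title=Hello;content=v1")
tag_a, copy_a = store.get()
tag_b, copy_b = store.get()
first_status = store.put(copy_a.replace("Hello", "Hello, world"), if_match=tag_a)
second_status = store.put(copy_b.replace("v1", "v2"), if_match=tag_b)

print(lost_update_body)
print(first_status, second_status)
```

Note that the conditional PUT only detects the conflict. The second client still has to download the whole entry again and redo its edit.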

That post was negatively received by many members of the AtomPub community including Joe Gregorio. Joe wrote a scathing response to my post entitled In which we narrowly save Dare from inventing his own publishing protocol  where he addressed that particular issue as follows

The second complaint is one of data loss:

The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.

Fortunately, the only real problem is that Dare seems to have only skimmed the specification. From Section 9.3:

To avoid unintentional loss of data when editing Member Entries or Media Link Entries, Atom Protocol clients SHOULD preserve all metadata that has not been intentionally modified, including unknown foreign markup as defined in Section 6 of [RFC4287].

And further, from Section 9.5:

Implementers are advised to pay attention to cache controls, and to make use of the mechanisms available in HTTP when editing Resources, in particular entity-tags as outlined in [NOTE-detect-lost-update]. Clients are not assured to receive the most recent representations of Collection Members using GET if the server is authorizing intermediaries to cache them.

Hey look, we actually reference the lost update paper that specifies how to solve this problem, right there in the spec! And Section 9.5.1 even shows an example of just such a conditional PUT failing. Who knew? And just to make this crystal clear, you can build a server that is compliant to the APP that accepts only conditional PUTs. I did, and it performed quite well at the last APP Interop.

The bottom line of Joe's response is that he didn't think it was a real problem. My assumption is that his perspective on the problem has broadened now that he has a responsibility to the wide breadth of AtomPub implementations at Google as opposed to when his design decisions were being influenced by a home grown blogging server he wrote in his free time.

The Google Solution: Embrace, Extend then Innovate

Now that Joe thinks supporting granular updates of a resource is a valid scenario, he and the folks at Google have proposed the following solution to the problem. Joe writes

Now if I wanted to update part of this entry, say the title, using the mechanisms in RFC 5023 then I would change the value of the title element and PUT the whole modified entry back to the URI http://example.org/edit/first-post.atom. Now this document isn't large, but we'll use it to demonstrate the concepts. The first thing we want to do is add a URI Template that allows us to construct a URI to PUT changes back to:

<?xml version="1.0"?>
<entry         
        xmlns="http://www.w3.org/2005/Atom"
        xmlns:t="http://blah...">
<t:link_template ref="sub" 
        href="http://example.org/edit/first-post/{-listjoin|;|id}"/>
    <title>Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <author><name>John Doe</name></author>
    <content>Some text.</content>
    <link rel="edit"
        href="http://example.org/edit/first-post.atom"/>
</entry>

Then we need to add id's to each of the pieces of the document we wish to be able to individually update. For this we'll use the W3C xml:id specification:

<?xml version="1.0"?>
<entry         
        xmlns="http://www.w3.org/2005/Atom"
        xmlns:t="http://blah...">   
    <t:link_template ref="sub" href="http://example.org/edit/first-post/{-listjoin|;|id}"/>
    <title xml:id="X1">Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <author xml:id="X2"><name>John Doe</name></author>
    <content xml:id="X3">Some text.</content>
    <link rel="edit"
        href="http://example.org/edit/first-post.atom"/>
</entry>

So if I wanted to update both the content and the title I would construct the partial update URI using the id's of the elements I want to update:

http://example.org/edit/first-post/X1;X3

And then I would PUT an entry to the URI with only those child elements:

PUT /edit/first-post/X1;X3
Host: example.org

<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom">
   <title xml:id="X1">False alarm on the Atom-Powered Robots things</title>
   <content xml:id="X3">Sorry about that.</content>
</entry>

The Problems with the Google Solution: Your Shipment of FAIL has Arrived

Ignoring the fact that this spec depends on specifications that are either experimental (URI Templates) or not widely supported (xml:id), there are still significant problems with how this approach (mis)uses the Atom Publishing Protocol. Sam Ruby eloquently points out the problems in his post Embrace, Extend then Innovate where he wrote

With HTTP PUT, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.  Having some servers interpret the removal of elements (such as content) as a modification, and others interpret the requests in such a way that elided elements are to be left alone is hardly uniform or self-descriptive.  In fact, depending on usage, it is positively stateful.

I’m fine with a server choosing to interpret the request anyway it sees fit.  As a black box, it could behave as if it updated the resource as requested and then one nanosecond later — and before it processes any other requests — fill in missing data with defaults, historical data, whatever.  My concern is with clients coding to the assumption as to how the server works.  That’s called coupling.

The main problem is that it changes the expected semantics of HTTP PUT in a way that not only conflicts with how PUT is typically used in other HTTP-based protocols but also how it is used in AtomPub. It's also weird that the existence of xml:id in an Atom document is now used to imply special semantics (i.e. this field supports direct editing). I especially don't like that after all is said and done, the server controls which fields can be partially updated or not which seems to imply a tight coupling between clients and servers (e.g. some servers will support partial updates on all fields, some may only support partial updates on atom:title + atom:category while others will support partial updates on a different set of fields). So the code for editing a title or category changes depending on which AtomPub service you are talking to.  

From where I stand, Joe has pretty much invented yet another diff + patch protocol for XML documents. When I worked on the XML team at Microsoft, there were quite a few floating around the company including Diffgram, UpdateGram, and Patchgrams to name three. So I've been around the block when it comes to diff + patch formats for XML and this one has its share of issues.  The most eyebrow-raising issue with the diff + patch protocol is that half the semantics of the update are in the XML document (which elements to add/edit) while the other half are in the URL (if an ID exists in the URL but is not in the document then it is a delete). This means the XML isn't very self-describing nor can it really be said that the URL is identifying a resource [more like it identifies an operation].
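A short Python sketch of that split, under my reading of the proposal. Both function names are hypothetical, and a real server would parse the xml:id attributes out of the PUT body rather than take a dictionary.

```python
def expand_partial_update_uri(base, ids):
    """Expand the {-listjoin|;|id} part of the template by joining ids with ';'."""
    return base + ";".join(ids)


def interpret_partial_put(uri_ids, submitted):
    """Split a partial PUT into updates and deletes.

    uri_ids   -- the xml:id values named in the request URI
    submitted -- {xml:id: new value} for the elements present in the body
    """
    updates = {i: submitted[i] for i in uri_ids if i in submitted}
    deletes = [i for i in uri_ids if i not in submitted]
    return updates, deletes


uri = expand_partial_update_uri("http://example.org/edit/first-post/", ["X1", "X3"])

# The body carries only the title (X1). Nothing in the body says what happens
# to X3; that half of the instruction lives in the URI alone.
updates, deletes = interpret_partial_put(["X1", "X3"], {"X1": "False alarm"})

print(uri)
print(updates, deletes)
```

Hand the same body to a different URI and it means something else, which is why it is hard to call the document self-describing.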

Actual Solution: Read the Spec

In Joe's original response to my post his suggestion was that the solution to the "problem" of lack of support for granular updates of entries in AtomPub is to read the spec. In retrospect, I agree. If a field is important enough that it needs to be identifiable and editable then it should be its own resource. If you want to make it part of another resource then use atom:link to link both resources.
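As a rough illustration, here is how the status message from the earlier user record might be split out into its own entry. This is a sketch, not a spec: the example.org URIs and the `STATUS_REL` link relation are hypothetical, and a real service would define and document its own.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)  # serialize Atom as the default namespace

# Hypothetical link relation; a real service would define and document its own.
STATUS_REL = "http://example.org/rels/status"


def atom(tag):
    """Qualify a tag name with the Atom namespace."""
    return f"{{{ATOM}}}{tag}"


def user_entry(user_uri, status_uri):
    """The user entry carries a link to the status instead of the status itself."""
    entry = ET.Element(atom("entry"))
    ET.SubElement(entry, atom("id")).text = user_uri
    ET.SubElement(entry, atom("title")).text = "Dave Fetterman"
    ET.SubElement(entry, atom("link"), {"rel": STATUS_REL, "href": status_uri})
    return entry


def status_entry(status_uri, message):
    """The status is a small entry of its own, editable at its own URI."""
    entry = ET.Element(atom("entry"))
    ET.SubElement(entry, atom("id")).text = status_uri
    ET.SubElement(entry, atom("title")).text = "Status"
    ET.SubElement(entry, atom("content")).text = message
    ET.SubElement(entry, atom("link"), {"rel": "edit", "href": status_uri})
    return entry


user = user_entry("http://example.org/users/8055",
                  "http://example.org/users/8055/status")
status = status_entry("http://example.org/users/8055/status", "Writing a blog post")

print(ET.tostring(user, encoding="unicode"))
print(ET.tostring(status, encoding="unicode"))
```

Changing the status is then an ordinary GET and conditional PUT on the small status entry, with no new PUT semantics required.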

Case closed. Problem solved.

Now Playing: Too Short - Couldn't Be a Better Player Than Me (feat. Lil Jon & The Eastside Boyz)
 

It's a testament to how busy I've been at work focusing on the Contacts platform that I missed an announcement by Angus Logan a few months ago that there had been an alpha release of a REST API for accessing photos on Windows Live Spaces.  The MSDN page for the API describes the API as

Welcome to the Alpha release of the Windows Live Spaces Photos API. The Windows Live Spaces Photo API allows Web sites to view and update Windows Live Spaces photo albums using the WebDAV protocol. Web sites can incorporate the following functionality:

  • Upload or download photos.
  • Create, edit, or delete photo albums.
  • Request a list of a user's albums, photos, or comments.
  • Edit or delete content for an existing entry.
  • Query the content in an existing entry.

This news is of particular interest to me since this API is the fruit of my labor that was first hinted at in my post A Flickr-like API for MSN Spaces? from a little over two years ago. At the time, I was responsible for the public APIs for Windows Live Spaces and had just finished working on the MetaWeblog API for Windows Live Spaces.

The biggest design problem we faced at the time was how to give applications the ability to access a user's personal data, which required the user to be authenticated, without having dozens of hastily written applications collecting people's usernames and passwords. In general, if we were just a blogging site it may not have been a big deal (e.g. the Twitter API requires that you give your username & password to random apps which may or may not be trustworthy).  However, we were part of Windows Live, which meant that we had to ensure that users' credentials were safeguarded and we didn't end up training users on how to be phished by entering their Windows Live ID credentials into random applications and Web sites.

To get around this problem with our implementation of the MetaWeblog API, I came up with a scheme where users had to use a special username and password when accessing their Windows Live Spaces blog via the API. This was a quick & dirty hack which had plenty of long term problems with it. For one, users had to go through the process of "enabling API access" before they could use blogging tools or other MetaWeblog API clients with the service. Another problem was that the problem still wasn't solved for other Windows Live services that wanted to enable APIs. Should each API have its own username and password? That would be quite confusing and overwhelming for users. Should they re-use our API specific username and password? In that case we would be back to square one by exposing an important set of user credentials to random applications.

The right solution eventually decided upon was to come up with a delegated authentication model where a user grants an application permission to act on his or her behalf without having to share credentials with the application. This is the model followed by the Windows Live Contacts API, the Facebook API, Google AuthSub, Yahoo! BBAuth, the Flickr API and a number of other services on the Web that provide APIs to access a user's private data.
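
To make the idea concrete, here is a minimal in-memory sketch of delegated authentication. It is not how Windows Live or any of the services above implement it; the class name, method names, application id and scope string are all made up for illustration. The point it shows is that the password is only ever presented to the service itself, while the application holds a revocable token limited to specific scopes.

```python
import secrets


class DelegationService:
    """Toy identity provider (hypothetical): issues per-application tokens."""

    def __init__(self):
        # Plain-text passwords keep the sketch short; a real service would not do this.
        self._passwords = {}   # user -> password, known only to the service
        self._grants = {}      # token -> (user, app_id, scopes)

    def register_user(self, user, password):
        self._passwords[user] = password

    def grant(self, user, password, app_id, scopes):
        """Runs on the service's own consent page, never inside the application."""
        expected = self._passwords.get(user, "").encode("utf-8")
        if not secrets.compare_digest(expected, password.encode("utf-8")):
            raise PermissionError("bad credentials")
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, app_id, frozenset(scopes))
        return token

    def authorize(self, token, app_id, scope):
        """Return the user the application may act for, or raise."""
        grant = self._grants.get(token)
        if grant is None or grant[1] != app_id or scope not in grant[2]:
            raise PermissionError("not authorized")
        return grant[0]

    def revoke(self, token):
        """The user can withdraw consent without changing their password."""
        self._grants.pop(token, None)


service = DelegationService()
service.register_user("alice", "s3cret")

# The user consents on the service's site; the application only receives the token.
token = service.grant("alice", "s3cret", app_id="photo-uploader",
                      scopes={"photos.write"})
acting_for = service.authorize(token, "photo-uploader", "photos.write")
```

Because the token is tied to one application and one set of scopes, revoking it cuts off that application alone, which is exactly what a shared "API password" could not do.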

Besides that decision, there was also the question of what form the API should take. Should we embrace & extend the MetaWeblog API with extensions for managing photos & media? Should we propose a proprietary API based on SOAP or REST? Adopt someone else's proprietary API (e.g. the Flickr API)? In the end, I pushed for an API that was completely RESTful and completely standards-based. Thus we built the API on WebDAV (RFC 2518).

WebDAV seemed like a great fit for a lot of reasons.

  • Photo albums map quite well to collections which are often modeled as folders by WebDAV clients. 
  • Support for WebDAV is already baked into a lot of client applications on numerous platforms
  • It is RESTful which is important when building a protocol for the Web
  • Proprietary metadata could easily be represented as WebDAV properties
  • Support for granular updates of properties via PROPPATCH

The last one turns out to be pretty important as it is an issue today with everyone's favorite REST protocol du jour. More on that topic in my following post. 
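
As a rough illustration of what a granular update looks like on the wire, here is a sketch that builds a PROPPATCH request body setting a single property. The `propertyupdate`, `set` and `prop` elements in the `DAV:` namespace come from the WebDAV spec; the `urn:example:photos` namespace and the `caption` property are placeholders I invented, not the actual Spaces property names.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"
# Hypothetical namespace for proprietary photo metadata (an assumption).
PHOTO_NS = "urn:example:photos"


def build_proppatch(properties):
    """Return a PROPPATCH body that sets only the properties passed in."""
    ET.register_namespace("D", DAV_NS)
    ET.register_namespace("P", PHOTO_NS)
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop_el = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for name, value in properties.items():
        # Each proprietary property becomes one child element of D:prop.
        ET.SubElement(prop_el, f"{{{PHOTO_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


# Only the caption is sent; every other property of the photo is left untouched.
body = build_proppatch({"caption": "Sunset over Lake Union"})
```

The client never has to fetch and re-send the whole resource just to change one field, which is the property of PROPPATCH that the last bullet is about.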

Now Playing: Lil Jon & The Eastside Boyz - Put Yo Hood Up (remix) (feat. Jadakiss, Petey Pablo & Chyna White)


 

Categories: Windows Live | XML Web Services

Pablo Castro has a blog post entitled AtomPub support in the ADO.NET Data Services Framework where he talks about the progress they've made in building a framework for using the Atom Publishing Protocol (RFC 5023) as a protocol for communicating with SQL Server and other relational databases. Pablo explains why they've chosen to build on AtomPub in his post which is excerpted below

Why are we looking at AtomPub?

Astoria data services can work with different payload formats and to some level different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kind of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.

If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that is hard to ignore. Of course, the devil is in the details and we'll get to that later on.

The interaction model in Astoria is just plain HTTP, using the usual methods for creating, updating, deleting and retrieving resources. Furthermore, we use other HTTP constructs such as "ETags" for concurrency checks,  "location" to know where a POSTed resource lives, and so on. All of these also map naturally to AtomPub.

From our (Microsoft) perspective, you could imagine a world where our own consumer and infrastructure services in Windows Live could speak AtomPub with the same idioms as Astoria services, and thus could both have a standards-based interface and also use the same development tools and runtime components that work with any Astoria-based server. This would mean less clients/development tools for us to create and more opportunity for our partners in the libraries and tools ecosystem out there.
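
To see why the mapping Pablo describes is so tempting, here is a small sketch that turns a resource with named links into an Atom entry. It is my own illustration and not Astoria's actual output: the URLs and the link relation IRIs under `example.org` are invented, and the entry leaves out elements such as the author that a complete standalone entry would carry.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def resource_to_entry(url, title, updated, related):
    """Map an addressable resource and its named links to an atom:entry."""
    ET.register_namespace("", ATOM_NS)
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = url
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    # The resource's own URL doubles as the AtomPub edit link.
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", rel="edit", href=url)
    # Each link to another resource becomes an atom:link with its own rel.
    for rel, href in related.items():
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", rel=rel, href=href)
    return ET.tostring(entry, encoding="unicode")


entry_xml = resource_to_entry(
    url="http://example.org/data/Customers(1)",          # hypothetical URL
    title="Customer 1",
    updated="2008-02-13T00:00:00Z",
    related={"http://example.org/rels/orders":            # hypothetical rel
             "http://example.org/data/Customers(1)/Orders"},
)
```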

Although I'm not responsible for any public APIs at Microsoft these days, I've found myself drawn into the various internal discussions on RESTful protocols and AtomPub due to the fact that I'm a busybody. :)

Early on in the Atom effort, I felt that the real value wasn't in defining yet another XML syndication format but instead in the editing protocol. Still, I underestimated how much mind share and traction AtomPub would eventually end up getting in the industry. I'm glad to see Microsoft making a huge bet on standards-based, RESTful protocols, especially given our recent history where we foisted Snakes On A Plane on the industry.

However since AtomPub is intended to be an extensible protocol, Astoria has added certain extensions to make the service work for their scenarios while staying within the letter and spirit of the spec. Pablo talks about some of their design decisions when he writes

We are simply mapping whatever we can to regular AtomPub elements. Sometimes that is trivial, sometimes we need to use extensions and sometimes we leave AtomPub alone and build an application-level feature on top. Here is an initial list of aspects we are dealing with in one way or the other. We’ll also post elaborations of each one of these to the appropriate Atom syntax|protocol mailing lists.
...
c) Using AtomPub constructs and extensibility mechanisms to enable Astoria features:

  • Inline expansion of links (“GET a given entry and all the entries related through this named link”, how we represent a request and the answer to such a request in Atom?).
  • Properties for entries that are media link entries and thus cannot carry any more structured data in the <content> element
  • HTTP methods acting on bindings between resources (links) in addition to resources themselves
  • Optimistic concurrency over HTTP, use of ETags and in general guaranteeing consistency when required
  • Request batching (e.g. how does a client send a set of PUT/POST/DELETE operations to the server in a single go?)

d) Astoria design patterns that are not AtomPub format/protocol concepts or extensions:

  • Astoria gives semantics to URLs and has a specific syntax to construct them
  • How metadata that describes the structure of a service end points is exposed. This goes from being to find out entry points (e.g. collections in service documents) to having a way of discovering the structure of entries that contain structured data
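
The optimistic concurrency item in the list above is the easiest one to sketch. What follows is a toy in-memory store, not Astoria code: the class, the exception and the way the ETag is derived from a hash of the body are all assumptions for illustration. A write succeeds only if the caller presents the ETag of the version it last read, which is what an HTTP If-Match header expresses.

```python
import hashlib


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 Precondition Failed response."""


class ResourceStore:
    """Hypothetical server-side store keyed by URL."""

    def __init__(self):
        self._bodies = {}

    @staticmethod
    def _etag(body):
        # Derive the ETag from the content so any change produces a new value.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
        return '"' + digest + '"'

    def post(self, url, body):
        """Create a resource and return its ETag."""
        self._bodies[url] = body
        return self._etag(body)

    def get(self, url):
        body = self._bodies[url]
        return body, self._etag(body)

    def put(self, url, body, if_match):
        """Replace the resource only if the caller saw the current version."""
        if self._etag(self._bodies[url]) != if_match:
            raise PreconditionFailed(url)
        self._bodies[url] = body
        return self._etag(body)


store = ResourceStore()
etag_v1 = store.post("/albums/1", "v1")

# Two clients read the same version.
_, seen_by_a = store.get("/albums/1")
_, seen_by_b = store.get("/albums/1")

# Client A writes first; client B's ETag is now stale, so its write is rejected.
etag_v2 = store.put("/albums/1", "v2 from A", if_match=seen_by_a)
try:
    store.put("/albums/1", "v2 from B", if_match=seen_by_b)
    conflict = False
except PreconditionFailed:
    conflict = True
```

Client B would then have to GET the resource again, reapply its change and retry with the fresh ETag.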

Pablo will be posting more about the Astoria design decisions on atom-syntax and atom-protocol in the coming weeks. It'll be interesting to see the feedback on the approaches they've taken with regard to following the protocol guidelines and extending it where necessary.

It looks like I'll have to renew my subscription to both mailing lists.

Now Playing: Lil Jon & The Eastside Boyz - Grand Finale (feat Nas, Jadakiss, T.I., Bun B & Ice Cube)


 

Categories: Platforms | XML Web Services

News of the layoffs at Yahoo! has now hit the presses. A couple of the folks who've been indicated as laid off are people I know to be great employees, either via professional interaction or by reputation. The list of people who fit this bill so far is Susan Mernitt, Bradley Horowitz, Salim Ismail and Randy Farmer. Salim used to run Yahoo's "incubation" wing so this is a pretty big loss. It is particularly interesting that he volunteered to leave the company, which may be a coincidence or could imply that some of the other news about Yahoo! has motivated some employees to seek employment elsewhere. It will be interesting to see how this plays out in the coming months.

Randy Farmer is also a surprise given that he pretty much confirmed that he was working on Jerry Yang's secret plan for a Yahoo comeback which included

  • Rethinking the Yahoo homepage
  • Consolidating Yahoo's plethora of social networks
  • Opening up Yahoo to third parties with a consistent platform similar to Facebook's
  • Revamping Yahoo's network infrastructure

If Yahoo! is reducing headcount by letting go of folks working on various next generation projects, this can't bode well for the future of the company given that success on the Web depends on constant innovation.

PS: To any ex-Yahoos out there, if the kind of problems described in this post sound interesting to you, we're always hiring. Give me a holler. :)