Via Sam Ruby's post Embrace, Extend then Innovate I found a link to Joe Gregorio's post entitled How to do RESTful Partial Updates. Joe's post is a recommendation of how to extend the Atom Publishing Protocol (RFC 5023) to support updating the properties of an entry without having to replace the entire entry. Given that Joe works for Google on GData, I have assumed that Joe's post is Google's attempt to float a trial balloon before extending AtomPub in this way. This is a more community centric approach than the company has previously taken with GData, OpenSocial, etc where these protocols simply appeared out of nowhere with proprietary extensions to AtomPub with an FYI to the community after the fact.

The Problem Statement

In the Atom Publishing Protocol, an atom:entry represents an editable resource. When editing that resource, it is intended that an AtomPub client should download the entire entry, edit the fields it needs to change and then use a conditional PUT request to upload the changed entry.

So what's the problem? Below is an example of the results one could get from invoking the users.getInfo method in the Facebook REST API.

    <about_me>This field perpetuates the glorification of the ego.  Also, it has a character limit.</about_me>
    <activities>Here: facebook, etc. There: Glee Club, a capella, teaching.</activities>   
    <birthday>November 3</birthday>
    <books>The Brothers K, GEB, Ken Wilber, Zen and the Art, Fitzgerald, The Emporer's New Mind, The Wonderful Story of Henry Sugar</books>
      <city>Palo Alto</city>
      <country>United States</country>
     <interests>coffee, computers, the funny, architecture, code breaking,snowboarding, philosophy, soccer, talking to strangers</interests>
     <movies>Tommy Boy, Billy Madison, Fight Club, Dirty Work, Meet the Parents, My Blue Heaven, Office Space </movies>
     <music>New Found Glory, Daft Punk, Weezer, The Crystal Method, Rage, the KLF, Green Day, Live, Coldplay, Panic at the Disco, Family Force 5</music>
     <name>Dave Fetterman</name> 
     <relationship_status>In a Relationship</relationship_status>
     <significant_other_id xsi:nil="true"/>
       <message>Pirates of the Carribean was an awful movie!!!</message>

If this user was represented as an atom:entry then each time an application wants to edit the user's status message it needs to download the entire data for the user with its over two dozen fields, change the status message in an in-memory representation of the XML document and then upload the entire user atom:entry back to the server.  This is a fairly expensive way to change a status message compared to how this is approached in other RESTful protocols (e.g. PROPPATCH in WebDAV).

Previous Discussions on this Topic: When the Shoe is on the Other Foot

A few months ago I brought up this issue as one of the problems encountered when using the Atom Publishing Protocol outside of blog editing contexts in my post Why GData/APP Fails as a General Purpose Editing Protocol for the Web. In that post I wrote

Lack of support for granular updates to fields of an item: As mentioned in the previous section editing an entry requires replacing the old entry with a new one. The expected client interaction with the server is described in section 5.4 of the current APP draft and is excerpted below.

Retrieving a Resource
Client                                     Server
| |
| 1.) GET to Member URI |
| |
| 2.) 200 Ok |
| Member Representation |
| |
  1. The client sends a GET request to the URI of a Member Resource to retrieve its representation.
  2. The server responds with the representation of the Member Resource.
Editing a Resource
Client                                     Server
| |
| 1.) PUT to Member URI |
| Member Representation |
| |
| 2.) 200 OK |
  1. The client sends a PUT request to store a representation of a Member Resource.
  2. If the request is successful, the server responds with a status code of 200.

Can anyone spot what's wrong with this interaction? The first problem is a minor one that may prove problematic in certain cases. The problem is pointed out in the note in the documentation on Updating posts on Google Blogger via GData which states

IMPORTANT! To ensure forward compatibility, be sure that when you POST an updated entry you preserve all the XML that was present when you retrieved the entry from Blogger. Otherwise, when we implement new stuff and include <new-awesome-feature> elements in the feed, your client won't return them and your users will miss out! The Google data API client libraries all handle this correctly, so if you're using one of the libraries you're all set.

Thus each client is responsible for ensuring that it doesn't lose any XML that was in the original atom:entry element it downloaded. The second problem is more serious and should be of concern to anyone who's read Editing the Web: Detecting the Lost Update Problem Using Unreserved Checkout. The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.

That post was negatively received by many members of the AtomPub community including Joe Gregorio. Joe wrote a scathing response to my post entitled In which we narrowly save Dare from inventing his own publishing protocol  where he addressed that particular issue as follows

The second complaint is one of data loss:

The problem is that there is data loss if the entry has changed between the time the client downloaded it and when it tries to PUT its changes.

Fortunately, the only real problem is that Dare seems to have only skimmed the specification. From Section 9.3:

To avoid unintentional loss of data when editing Member Entries or Media Link Entries, Atom Protocol clients SHOULD preserve all metadata that has not been intentionally modified, including unknown foreign markup as defined in Section 6 of [RFC4287].

And further, from Section 9.5:

Implementers are advised to pay attention to cache controls, and to make use of the mechanisms available in HTTP when editing Resources, in particular entity-tags as outlined in [NOTE-detect-lost-update]. Clients are not assured to receive the most recent representations of Collection Members using GET if the server is authorizing intermediaries to cache them.

Hey look, we actually reference the lost update paper that specifies how to solve this problem, right there in the spec! And Section 9.5.1 even shows an example of just such a conditional PUT failing. Who knew? And just to make this crystal clear, you can build a server that is compliant to the APP that accepts only conditional PUTs. I did, and it performed quite well at the last APP Interop.

The bottom line of Joe's response is that he didn't think it was a real problem. My assumption is that his perspective on the problem has broadened now that he has a responsibility to the wide breadth of AtomPub implementations at Google as opposed to when his design decisions were being influenced by a home grown blogging server he wrote in his free time.

The Google Solution: Embrace, Extend then Innovate

Now that Joe thinks supporting granular updates of a resource is a valid scenario, he and the folks at Google have proposed the following solution to the problem. Joe writes

Now if I wanted to update part of this entry, say the title, using the mechanisms in RFC 5023 then I would change the value of the title element and PUT the whole modified entry back to the the URI Now this document isn't large, but we'll use it to demonstrate the concepts. The first thing we want to do is add a URI Template that allows us to construct a URI to PUT changes back to:

<?xml version="1.0"?>
<t:link_template ref="sub" 
    <title>Atom-Powered Robots Run Amok</title>
    <author><name>John Doe</name></author>
    <content>Some text.</content>
    <link rel="edit"

Then we need to add id's to each of the pieces of the document we wish to be able to individually update. For this we'll use the W3C xml:id specification:

<?xml version="1.0"?>
    <t:link_template ref="sub" href="{-listjoin|;|id}"/>
    <title xml:id="X1">Atom-Powered Robots Run Amok</title>
    <author xml:id="X2"><name>John Doe</name></author>
    <content xml:id="X3">Some text.</content>
    <link rel="edit"

So if I wanted to update both the content and the title I would construct the partial update URI using the id's of the elements I want to update:;X3

And then I would PUT an entry to the URI with only those child elements:

PUT /edit/first-post/X1;X3

<?xml version="1.0"?>
<entry xmlns="">
   <title xml:id="X1">False alarm on the Atom-Powered Robots things</title>
   <content xml:id="X3">Sorry about that.</content>

The Problems with the Google Solution: Your Shipment of FAIL has Arrived

Ignoring the fact that this spec depends on specifications that are either experimental (URI Templates) or not widely supported (xml:id), there are still significant problems with how this approach (mis)uses the Atom Publishing Protocol. Sam Ruby eloquently points out the problems in his post Embrace, Extend then Innovate where he wrote

With HTTP PUT, the the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server.  Having some servers interpret the removal of elements (such as content) as a modification, and others interpret the requests in such a way that elided elements are to be left alone is hardly uniform or self-descriptive.  In fact, depending on usage, it is positively stateful.

I’m fine with a server choosing to interpret the request anyway it sees fit.  As a black box, it could behave as if it updated the resource as requested and then one nanosecond later — and before it processes any other requests — fill in missing data with defaults, historical data, whatever.  My concern is with clients coding with to the assumption as to how the server works.  That’s called coupling.

The main problem is that it changes the expected semantics of HTTP PUT in a way that not only conflicts with how PUT is typically used in other HTTP-based protocols but also how it is used in AtomPub. It's also weird that the existence of xml:id in an Atom document is now used to imply special semantics (i.e. this field supports direct editing). I especially don't like that after all is said and done, the server controls which fields can be partially updated or not which seems to imply a tight coupling between clients and servers (e.g. some servers will support partial updates on all fields, some may only support partial updates on atom:title + atom:category while others will support partial updates on a different set of fields). So the code for editing a title or category changes depending on which AtomPub service you are talking to.  

From where I stand Joe has pretty much invented yet another diff + patch protocol for XML documents. When I worked on the XML team at Microsoft, there were quite a few floating around the company including Diffgram, UpdateGram, and Patchgrams to name three. So I've been around the block when it comes to diff + patch formats for XML and this one has its share of issues.  The most eye brow raising issue with the diff + patch protocol is that half the semantics of the update are in the XML document (which elements to add/edit) while the other half are in the URL (if an ID exists in the URL but is not in the document then it is a delete). This means the XML isn't very self describing nor can it really be said that the URL is identifying a resource [more like it identifies an operation].

Actual Solution: Read the Spec

In Joe's original response to my post his suggestion was that the solution to the "problem" of lack of support for granular updates of entries in AtomPUb is to read the spec. In retrospect, I agree. If a field is important enough that it needs to be identifiable and editable then it should be its own resource. If you want to make it part of another resource then use atom:link to link both resources.

Case closed. Problem solved.

Now Playing: Too Short - Couldn't Be a Better Player Than Me (feat. Lil Jon & The Eastside Boyz)

It's a testament to how busy I've been at work focusing on the Contacts platform that I missed an announcement by Angus Logan a few months ago that there had been an alpha release of a REST API for accessing photos on Windows Live Spaces.  The MSDN page for the API describes the API as

Welcome to the Alpha release of the Windows Live Spaces Photos API. The Windows Live Spaces Photo API allows Web sites to view and update Windows Live Spaces photo albums using the WebDAV protocol. Web sites can incorporate the following functionality:

  • Upload or download photos.
  • Create, edit, or delete photo albums.
  • Request a list of a user's albums, photos, or comments.
  • Edit or delete content for an existing entry.
  • Query the content in an existing entry.

This news is of particular interest to me since this API is the fruits of my labor that was first hinted at in my post A Flickr-like API for MSN Spaces? from a little over two years ago. At the time, I was responsible for the public APIs for MSN Windows Live Spaces and had just finished working on the the MetaWeblog API for Windows Live Spaces.

The biggest design problem we faced at the time was how to give applications the ability to access a user's personal data which required the user to be authenticated without having dozens of hastily written applications collecting people's usernames and passwords. In general, if we were just a blogging site it may not have been a big deal (e.g. the Twitter API requires that you give your username & password to random apps which may or may not be trustworthy).  However we were part of MSN Windows Live which meant that we had to ensure that users credentials were safeguarded and we didn't end up training users on how to be phished by entering their Passport Windows Live ID credentials into random applications and Web sites.

To get around this problem with our implementation of the MetaWeblog API, I came up with a scheme where users had to use a special username and password when accessing their Windows Live Spaces blog via the API. This was a quick & dirty hack which had plenty of long term problems with it. For one, users had to go through the process of "enabling API access" before they could use blogging tools or other Metaweblog API clients with the service. Another problem was that the problem still wasn't solved for other Windows Live services that wanted to enable APIs. Should each API have its own username and password? That would be quite confusing and overwhelming for users. Should they re-use our API specific username and password? In that case we would be back to square one by exposing an important set of user credentials to random applications.

The right solution eventually decided upon was to come up with a delegated authentication model where a user grants application permission to act on his or her behalf without having to share credentials with the application. This is the model followed by the Windows Live Contacts API, the Facebook API, Google AuthSub, Yahoo! BBAuth, the Flickr API and a number of other services on the Web that provide APIs to access a user's private data.

Besides that decision, there was also the question of what form the API should take. Should we embrace & extend the MetaWeblog API with extensions for managing photos & media? Should we propose a proprietary API based on SOAP or REST? Adopt someone else's proprietary API (e.g. the Flickr API)? At the end, I pushed for completely RESTful and completely standards based. Thus we built the API on WebDAV (RFC 2518).

WebDAV seemed like a great fit for a lot of reasons.

  • Photo albums map quite well to collections which are often modeled as folders by WebDAV clients. 
  • Support for WebDAV already baked into a lot of client applications on numerous platforms
  • It is RESTful which is important when building a protocol for the Web
  • Proprietary metadata could easily be represented as WebDAV properties
  • Support for granular updates of properties via PROPPATCH

The last one turns out to be pretty important as it is an issue today with everyone's favorite REST protocol du jour. More on that topic in my following post. 

Now Playing: Lil Jon & The Eastside Boyz - Put Yo Hood Up (remix) (feat. Jadakiss, Petey Pablo & Chyna White)


Categories: Windows Live | XML Web Services

Pablo Castro has a blog post entitled AtomPub support in the ADO.NET Data Services Framework where he talks about the progress they've made in building a framework for using the Atom Publishing Protocol (RFC 5023) as a protocol for communicating with SQL Server and other relational databases. Pablo explains why they've chosen to build on AtomPub in his post which is excerpted below

Why are we looking at AtomPub?

Astoria data services can work with different payload formats and to some level different user-level details of the protocol on top of HTTP. For example, we support a JSON payload format that should make the life of folks writing AJAX applications a bit easier. While we have a couple of these kind of ad-hoc formats, we wanted to support a pre-established format and protocol as our primary interface.

If you look at the underlying data model for Astoria, it boils down to two constructs: resources (addressable using URLs) and links between those resources. The resources are grouped into containers that are also addressable. The mapping to Atom entries, links and feeds is so straightforward that is hard to ignore. Of course, the devil is in the details and we'll get to that later on.

The interaction model in Astoria is just plain HTTP, using the usual methods for creating, updating, deleting and retrieving resources. Furthermore, we use other HTTP constructs such as "ETags" for concurrency checks,  "location" to know where a POSTed resource lives, and so on. All of these also map naturally to AtomPub.

From our (Microsoft) perspective, you could imagine a world where our own consumer and infrastructure services in Windows Live could speak AtomPub with the same idioms as Astoria services, and thus could both have a standards-based interface and also use the same development tools and runtime components that work with any Astoria-based server. This would mean less clients/development tools for us to create and more opportunity for our partners in the libraries and tools ecosystem out there.

Although I'm not responsible for any public APIs at Microsoft these days, I've found myself drawn into the various internal discussions on RESTful protocols and AtomPub due to the fact that I'm a busy body. :)

Early on in the Atom effort, I felt that the real value wasn't in defining yet another XML syndication format but instead in the editing protocol. Still I underestimated how much mind share and traction AtomPub would eventually end up getting in the industry. I'm glad to see Microsoft making a huge bet on standards based, RESTful protocols especially given our recent history where we foisted Snakes On A Plane on the industry.

However since AtomPub is intended to be an extensible protocol, Astoria has added certain extensions to make the service work for their scenarios while staying within the letter and spirit of the spec. Pablo talks about some of their design decisions when he writes

We are simply mapping whatever we can to regular AtomPub elements. Sometimes that is trivial, sometimes we need to use extensions and sometimes we leave AtomPub alone and build an application-level feature on top. Here is an initial list of aspects we are dealing with in one way or the other. We’ll also post elaborations of each one of these to the appropriate Atom syntax|protocol mailing lists.
c) Using AtomPub constructs and extensibility mechanisms to enable Astoria features:

  • Inline expansion of links (“GET a given entry and all the entries related through this named link”, how we represent a request and the answer to such a request in Atom?).
  • Properties for entries that are media link entries and thus cannot carry any more structured data in the <content> element
  • HTTP methods acting on bindings between resources (links) in addition to resources themselves
  • Optimistic concurrency over HTTP, use of ETags and in general guaranteeing consistency when required
  • Request batching (e.g. how does a client send a set of PUT/POST/DELETE operations to the server in a single go?)

d) Astoria design patterns that are not AtomPub format/protocol concepts or extensions:

  • Astoria gives semantics to URLs and has a specific syntax to construct them
  • How metadata that describes the structure of a service end points is exposed. This goes from being to find out entry points (e.g. collections in service documents) to having a way of discovering the structure of entries that contain structured data

Pablo will be posting more about the Astoria design decisions on atom-syntax and atom-protocol in the coming weeks. It'll be interesting to see the feedback on the approaches they've taken with regards to following the protocol guidelines and extending it where necessary.

It looks like I'll have to renew my subscription to both mailing lists.

Now Playing: Lil Jon & The Eastside Boyz - Grand Finale (feat Nas, Jadakiss, T.I., Bun B & Ice Cube)


Categories: Platforms | XML Web Services

News of the layoffs at Yahoo! has now hit the presses. A couple of the folks who've been indicated as laid off are people I know are great employees either via professional interaction or by reputation. The list of people who fit this bill so far are Susan Mernitt, Bradley HorowitzSalim Ismail and Randy Farmer.  Salim used to run Yahoo's "incubation" wing so this is a pretty big loss. It is particularly interesting that he volunteered to leave the company which may be a coincidence or could imply that some of the other news about Yahoo! has motivated some employees to seek employment elsewhere. It will be interesting to see how this plays out in the coming months.

Randy Farmer is also a surprise given that he pretty much confirmed that he was working on Jerry Yang's secret plan for a Yahoo comeback which included

  • Rethinking the Yahoo homepage
  • Consolidating Yahoo's plethora of social networks
  • Opening up Yahoo to third parties with a consistent platform similar to Facebook's
  • Revamping Yahoo's network infrastructure

If Yahoo! is reducing headcount by letting go of folks working on various next generation projects, this can't bode well for the future of the company given that success on the Web depends on constant innovation.

PS: To any ex-Yahoo's out there, if the kind of problems described in this post sound interesting to you, we're always hiring. Give me a holler. :) 


Mini-Microsoft has a blog post entitled  Microsoft's Yahoo! Acquisition is Bold. And Dumb. which contains the following excerpt

To tell you the truth, if you had pulled me aside when I was in school, holding court in the computer science lab, and whispered in my ear ala The Graduate: "online ads..." I would have laughed my geek butt off.

So Google gets to have the joke on me, but for us to bet the company and build Microsoft's future foundation on ads revenue? WTF? As someone who considers themselves a citizen, not a consumer, I want to create software experiences that make people's lives delightful and better, not that sells them crap they don't need while putting them deeper into debt. I'm going to be in purgatory long enough as is.

I find this sentiment somewhat ironic coming from Mini-Microsoft. Microsoft’s bread and butter comes from selling software that people have to use not software that they want to use. In fact, you can argue that the fundamental problems the company has had in making traction in certain consumer-centric markets is that our culture is still influenced by selling to IT departments and developers (i.e. where features and checklists are important) as opposed to selling to consumers (i.e. where user experience is the most important thing).

Specifically, it is hard for me to imagine that there are more people in the world that think that whatever Microsoft product Mini works on has given them more delight or improved their lives better than Facebook, Flickr, Google, MySpace or Windows Live Messenger which happen to all be ad supported software. Thus it amusing to see him imply that ad-supported software is the antithesis of software that delights and improves peoples quality of life. 

The way I see it, Jerry Yang is right that from the perspective of a user “You Always Have Other Options” when it comes free (ad supported), Web-based software which encourages applications to innovate in the user experience to differentiate themselves. It is no small wonder that we’ve seen more innovations in social applications and user interfaces in the world of free, Web-based applications than we’ve seen in the world of proprietary, commercial software.  Something to think about the next time you decide to crap on ad supported Web apps because you think building commercial software is some sort of noble cause that results in perfect, customer delighting software, Mini.

Now playing: Snoop Doggy Dogg - Downtown Assassins


Categories: Life in the B0rg Cube

A couple of weeks ago I got a bug filed against me in RSS Bandit with the title Weak duplicate feed detection that had the following description

I already subscribed to "" feed. Then while browsing the feed's homepage the feed autodetection gets refreshed with a "new feed found", url: "".  It is not detected as a duplicate (ends with backslash) there and also not detected in the subscription wizard.

Even though I argued that there were lots of URLs that seem equivalent to end users that aren't according to the specs I decided to go ahead and fix the two common types of equivalence that trip up end users

Within a few hours of shipping this in version of RSS Bandit, the bug reports have started coming in hard. There's a thread in our forums with the title URL Corruption After Adding a Feed with the following complaints

After I add the feed at:
I get an error in the Error Log and when I restart RSS Bandit, the URL has been truncated to

Possibly related problem:
When I add "" then RSS Bandit slices and dices it into "".
Sorry about the period outside the quotation mark. I wanted to make sure the URL was clear.

Same problem--have a feed, I've used it in the past with RSSBandit and was trying to enter it (tried the usual way first, then turned off autodiscover, then tried to change it via properties) but no matter what I do the www. in front disappears, and the feed doesn't work.

Each of these is an example where the URL works when the domain is starts with "www" but doesn't if you take it out. This is definitely a case of from bad to worse. We went from the minor irritation of duplicate feeds not being detected when you subscribe to the same feed twice to users being unable to access certain feeds at all.

My apologies to everyone affected by this problem. I will be dialing back the canonicalization process to only treat trailing slashes as equivalent. Expect the installer to be refreshed within the next hour or so.

Now Playing: Young Jeezy feat. Swizz Beatz - Money In Da Bank (Remix)


Categories: RSS Bandit

I just realized that the current released version of RSS Bandit doesn’t have a working code name based on a character from the X-Men comic book. The previous release was codenamed ShadowCat while the next release is codenamed Phoenix. Since the v1.6.0.x releases have been an interim releases on the road to Phoenix, I’ve decided to give them the codename Jean Grey retroactively. Now, on to the updates.

Jean Grey (v1.6.0.x) Update

The last bug fix release of RSS Bandit fixed a few bugs but introduced a couple of even worse bugs [depending on your perspective]. We’ve shipped version that addresses the following issues

  • Application crashes with AccessViolationException on startup on Windows XP.  
  • Application crashes and red 'X' shows in feed subscriptions window on Windows XP.  
  • User's credentials are not used when accessing feeds via a proxy server leading to proxy errors when fetching feeds.  
  • Duplicate feed URLs not detected if they differ by trailing slash or "www." in the host name 
  • Application crashes when displaying an error dialog when a certificate issue is detected with a secure feed.

The first three issues are regressions that were introduced as part of refactoring the code and making it work better on Windows Vista. Yet another data point that shows that you can never have too many unit tests and that beta testing isn’t a bad idea either.

You can download the new release from

Phoenix (v2.0)  Update

I’m continuing with my plan to make RSS Bandit a desktop client for Web based feed readers like NewsGator Online and Google Reader. I’ve been slightly sidetracked by the realization that it would be pretty poor form for a Microsoft employee to write an application that synchronized with Google’s RSS reader but not any of Microsoft’s, even if it is a side project. My current coding project is to integrate with the Windows RSS platform which would allow one to manipulate the same set of feeds in RSS Bandit, Internet Explorer 7 and Outlook 2007. The good news is that with Outlook 2007 integration, you also get Exchange synchronization for free.

The bad news has been having to use the RSS reading features of Internet Explorer 7 and Outlook 2007 on a regular basis as a way of eating my own dog food with regards to the integration features. It’s pretty stunning to see not one but two RSS reading applications that assume “mark all items as read” or “delete all feeds” are actions that users never have to take. When you have people writing shell scripts to perform basic tasks in your application then it is a clear sign that somewhere along the line, the user experience for that particular set of features got the shaft. 

I’m about half way through the integration after which I’ll continue with integrating with Google Reader and finally NewsGator Online using an Outlook + Exchange style model. While I’m working on this, both Oren and Torsten will be mapping out the rewrite of the graphical user interface using WPF. I’ll probably need to buy a book on XAML or something in the next few months so I can contribute to this effort. The only thing I’ve heard about any of the various books about the subject on the market is that they all seem to have had their forewords written by Don Box. Does anyone have recommendations on which book or website I should use to start learning XAML + WPF?

Now playing: Eminem - Sing For The Moment


This past weekend I attended the O’Reilly Social Graph FOO Camp and got to meet a bunch of folks who I’ve only “known” via their blogs or news stories about them. My favorite moment was talking to Mark Zuckerberg about stuff I think is wrong with Facebook and he stops for a second while I’m telling hin the story of naked pictures in my Facebook news feed then says “Dare? I read your blog”. Besides that my favorite part of the experience was learning new things from folks with different perspectives and technical backgrounds from me. Whether it was hearing different perspectives on the social graph problem from folks like Joseph Smarr and Blaine Cook, getting schooled on the various real-world issues around using OpenID/OAuth in practice from John Panzer and Eran Hammer-Lahav or getting to ask getting to Q&A Brad Fitzpatrick about the Google Social Graph API, it was a great learning experience all around.

There have been some ideas tumbling around in my head all week and I wanted to wait a few days before blogging to make sure I’d let the ideas fully marinate. Below are a few of the more important ideas I took away from the conference.

Social Network Discovery vs. Social Graph Portability

One of the most startling realizations I made during the conference is a lot of my assumptions about why developers of social applications are interested in what has been mistakenly called “social graph portability” were incorrect. I had assumed a lot of social networking sites that utilize the password anti-pattern to screen scrape a user’s Hotmail/Y! Mail/Gmail/Facebook address book were doing that as a way to get a list of the user’s friends to spam invite to join the service. However a lot of the folks I met at the SG FOO Camp made me realize how much of a bad idea this would be if they actually did that. Sending out a lot of spam would lead to negativity being associated with their service and brand (Plaxo is still dealing with a lot of the bad karma they generated from their spammy days).

Instead the way social applications often use the contacts from a person’s email address book is to satisfy the scenario in Brad Fitzpatrick’s blog post URLs are People, Too where he wrote

So you've just built a totally sweet new social app and you can't wait for people to start using it, but there's a problem: when people join they don't have any friends on your site. They're lonely, and the experience isn't good because they can't use the app with people they know.  

I then thought of my first time using Twitter and Facebook, and how I didn’t consider them of much use until I started interacting with people I already knew that used those services. More than once someone has told me, “I didn’t really get why people like Facebook until I got over a dozen friends on the site”.

So the issue isn’t really about “portability”. After all, my “social graph” of Hotmail or Gmail contacts isn’t very useful on Twitter if none of my friends use the service. Instead it is about “discovery”.

Why is this distinction important? Let’s go back to the complaint that Facebook doesn’t expose email addresses in it’s API. The site actually hides all contact information from their API which is understandable. However since email addresses are also the only global identifiers we can rely on for uniquely identifying users on the Web, they are useful as way of being able to figure out if Carnage4Life on Twitter is actually Dare Obasanjo on Facebook since you can just check if they are backed by the same email address.

I talked to both John Panzer and Brad Fitzpatrick about how we could bridge this gap and Brad pointed out something really obvious which he takes advantage of in the Google Social Graph API. We can just share email addresses using foaf:mbox_sha1sum (i.e. cryptographical one-way hashes of email addresses). That way we all have a shared globally unique identifier for a user but services don’t have to reveal their user’s email addresses.

I wonder how we can convince the folks working on the Facebook platform to consider adding this as one of the properties returned by Users.getInfo?

You Aren’t Really My Friend Even if Facebook Says We Are

In a post entitled A proposal: email to URL mapping Brad Fitzpatrick wrote

People have different identifiers, of different security, that they give out depending on how much they trust you. Examples might include:

  • Homepage URL (very public)
  • Email address (little bit more secret)
  • Mobile phone number (perhaps pretty secretive)

When I think back to Robert Scoble getting kicked off of Facebook for screen scraping his friends’s email addresses and dates of birth into Plaxo, I wonder how many of his Facebook friends are comfortable with their personal contact information including email addresses, cell phone numbers and home addresses being utilized by Robert in this manner. A lot of people argued at SG FOO Camp that “If you’ve already agreed to share your contact info with me, why should you care whether I write it down on paper or download it into some social networking site?”.

That’s an interesting question.

I realized that one of my answers is that I actually don’t even want to share this info with the majority of the people in my Facebook friends list in the first place [as Brad points out]. The problem is that Facebook makes this a somewhat binary decision. Either I’m your “friend” and you get all my private personal details or I’ve faceslammed you by ignoring your friend request or only giving you access to my Limited Profile. I once tried to friend Andrew ‘Boz’ Bosworth (a former Microsoft employee who works at Facebook) and he told me he doesn’t accept friend requests from people he didn’t know personally so he ignored the friend request. I thought it was fucking rude even though objectively I realize it makes sense since it would mean I could view all his personal wall posts as well as his contact info. Funny enough, I always thought that it was a flaw in the site’s design that we had to have such an awkward social interaction.

I think the underlying problem again points to Facebook’s poor handling of multiple social contexts. In the real world, I separate my interactions with co-workers from that with my close friends or my family. For an application that wants to be the operating system underlying my social interactions, Facebook doesn’t do a good job of handling this fundamental reality of adult life.

Now playing: D12 - Revelation


Categories: Social Software | Trip Report

Niall Kennedy has a blog post entitled Sniff browser history for improved user experience where he describes a common-sense technique to test URLs against a Web browser’s visited page history. He writes

I first blogged about this technique almost two years ago but I will now provide even more details and example implementations.
A web browser such as Firefox or Internet Explorer will load the current user's browser history into memory and compare each link (anchor) on the page against the user's previous history. Previously visited links receive a special CSS pseudo-class distinction of :visited and may receive special styling.
Any website can test a known set of links against the current visitor's browser history using standard JavaScript.

  1. Place your set of links on the page at load or dynamically using the DOM access methods.
  2. Attach a special color to each visited link in your test set using finely scoped CSS.
  3. Walk the evaluated DOM for each link in your test set, comparing the link's color style against your previously defined value.
  4. Record each link that matches the expected value.
  5. Customize content based on this new information (optional).

Each link needs to be explicitly specified and evaluated. The standard rules of URL structure still apply, which means we are evaluating a distinct combination of scheme, host, and path. We do not have access to wildcard or regex definitions of a linked resource.

Niall goes on to describe the common ways one can improve the user experience on a site using this technique. I’ve been considering using this approach to reduce the excess blog flair on my weblog. It doesn’t make much sense to show people a “submit to reddit” button  if they don’t use reddit. The approach suggested in Niall’s article makes it possible for me to detect what sites a user visits and then only display relevant flair on my blog posts. Unfortunately neither of Niall’s posts on the topic provide example code which is why I’m posting this follow up to Niall’s post. Below is an HTML page that uses Javascript function to return which social bookmarking sites a viewer of a Web page actually uses based on their browser history.

<html xmlns="" xml:lang="en" lang="en">  
    <title>What Social Bookmarking Sites Do You Use?</title>  
    <script type="text/javascript">
var bookmarking_sites = new Array("", "", "", "", "", "", "", "")

function DetectSites() {
  var testelem = document.getElementById("linktest");
  var visited_sites = document.getElementById("visited_sites");
  var linkColor; 
  var isVisited = false;

  for (var i = 0; i < bookmarking_sites.length; i++) {          
         var url2test = document.createElement("a");                
         url2test.href = bookmarking_sites[i];       
         url2test.innerHTML = "I'm invisible, you can't click me";        

	 if(document.defaultView){ //Mozilla
           linkColor = document.defaultView.getComputedStyle(url2test,null).getPropertyValue("color");

	   if(linkColor == "rgb(100, 149, 237)"){
	     isVisited = true;
	 }else if(url2test.currentStyle){ //IE
	   if(url2test.currentStyle.color == "cornflowerblue"){
	     isVisited = true;

	 if (isVisited) {           
	 visited_sites.innerHTML = visited_sites.innerHTML + 
	  "<a href='" + url2test.href + "'>" + url2test.href + "</a><br>"
	 isVisited = false; 
 <style type="text/css">
   p#linktest a:visited { color: CornflowerBlue }
  <body onload="DetectSites()">
    <b>Social Bookmarking Sites You've Visited</b>
    <p id="linktest" style="visibility:hidden" />
    <p id="visited_sites" />

Of course, after writing the aforementioned code it occured to me run a Web search and I found that there are bits of code for doing this all over the Web in places like Jermiah Grossman’s blog (Firefox only) and GNUCITIZEN

At least now I have it in a handy format; cut, paste and go.

Now all I need is some free time which in which to tweak my weblogt to start using the above function instead of showing people links to services they don’t use.

Now playing: Disturbed - Numb


Categories: Programming | Web Development

I was recently reading a blog post in response to one of my posts and noticed that the author used my wikipedia entry as the primary resource to figure out who I am. The only problem with that is my Wikipedia entry is pretty outdated and quite scanty. Since it is poor form to edit your own Wikipedia entry I am at a quandary.

My resume is slightly more up to date, it is primarily missing descriptions of the stuff I worked on at Microsoft last year. Thus I saw two choices. I could either change the “About Me” link on my blog to point to my resume or I could implore some kind soul in my readership to update my entry in Wikipeda. I decided to start with the latter and if that doesn’t work out, I’ll be updating the “About Me” link to point to my resume.

Thanks in advance to anyone who takes the time to update my Wikipedia entry (not vandalize it Smile ).

Now playing: Ice Cube - Ghetto Vet


Categories: Personal