Friday, 12 October 2007 - Dare Obasanjo's weblog

October 12, 2007

@ 08:30 AM

Windows Live Events and Updated "What's New" page on Windows Live Spaces

Two things I worked on over the past year or so shipped today. The first is Windows Live Events. You can learn more about it in the post entitled Introducing Windows Live Events and new Windows Live Spaces updates by Chris Keating and Jay Fluegel, which reads

Easily create a great-looking website for your next event
To offer you more ways to connect and share memories with the people you care about most, the team that brought you Spaces would love to hear your feedback on Windows Live Events, our new, free social event planning service. With Events you can easily:

Plan that next baby shower, birthday party, or family reunion
Create a great-looking event invitation and website using one of over 100 fantastic templates
Invite anyone with an e-mail address and track who’s coming
Make your event unique with familiar customization features - choosing a friendly web address (like http://kates1stbirthday.events.live.com), using custom colors, fonts, and background images, or adding modules and Windows Live web gadgets
Let guests and organizers share photos and stories before and after the event

Click on Events from the new navigation and then click Create event to get started!

My contribution to this release was working on modifying aspects of our contacts and storage platform to understand the concept of groups of users that can be treated as a single entity [especially with regards to joint ownership of objects, sharing and access control lists] instead of being centered on a single user. I expect that Windows Live Events will be just the first of many ways in which this capability will manifest itself across Windows Live.

Unfortunately, I didn't work on the final stage of getting the platform ready for the product to ship. Instead I went on to work on my next feature that shipped today while Ali took over working on the platform support for Windows Live Events including ~~cleaning up my design hacks~~ doing a better job of future proofing the design than I did. Mad props to Bob Bauman, Mike Newman, Jason Antonelli, John Kountz, Lalit Jina, Neel Malik, Mike Torres and everyone else who worked on this release across Windows Live. You guys rock.

The second thing I was a part of building which shipped today is the updated “What’s New” page in Windows Live Spaces, which is also described in detail in the aforementioned post by Chris Keating and Jay Fluegel. Before you say anything: yes, its redesign has been influenced by the Facebook News Feed feature. Below is a screen shot of the old design of the page from the previous release

and now contrast that with the new version of the page

I'm pretty jazzed about getting to work on this feature since it is something I've wanted to do for quite some time. A few years ago, I remember talking to Maya about building a “friends page” similar to the Live Journal friends page in MSN Spaces, and at the time the response was that I was requesting that we merge the functionality of an RSS reader with a blogging/social networking site, which was at cross purposes. In hindsight, I realize that although the idea was a good one, the implementation I was suggesting was kind of hokey. Then Facebook shipped the News Feed and it all clicked.

I worked with a lot of great folks on this feature. Paul Ming, Deepa Chandramouli, Rob Dolin, Vanesa Polo Dominguez, Jack Liu, Eric Melillo and a bunch of others who I may have failed to mention but still deserve lots of praise. This feature was the most fun I've had working in Windows Live. Not only did I get a deeper appreciation of designing for scalability but I also got to see what it is like to be responsible for components on the live site. All I need now is a pager and I'm good to go. :)

I'd be remiss in my duties if I didn't point out that in the second screen shot, the first item on the What's New page is less than 5 minutes old. If you use other systems that have similar features, you may have noticed a much longer delay than a few minutes from posting to showing up in your news feed. As the saying goes, a lot of effort went into making this look effortless.

I also noticed some initial feedback on this feature in the blog post by Jamie Thomson entitled new spaces home page where he writes

There's a lot of potential for this activity list given that it could capture any activity people commit using their Live ID. Every live property has the potential for being able to post activity on here so one day we may see notifications of:

change of messenger status
posting of photos on Live Space
addition of gadgets to Live Space
items placed for sale on Expo
questions asked or answered on QnA
collection shared from Live Maps
video posted on MSN video
changes to XBox gamer card
changes to Zune Social (after it launches)
items posted to the Live Gallery
an event being planned
purchased a song from Zune marketplace
posts in MSN groups (soon to be Live Groups)
posts to online forums (forums.microsoft.com)
downloads of public files from Skydrive

Its all pretty good but let's be honest, this is basically a clone of of what Facebook already have. Given Facebook's popularity though Microsoft didn't really have a choice but to copy them. If Microsoft really want to differentiate themselves in this arena then one option would be to provide avenues for interacting with other online services such as Flickr, Twitter, Jaiku, Pownce, etc... This list could then become an aggregator for all online activity and that's a pretty compelling scenario. One really quick win in this area would be to capture any blog entry that is posted from Live Writer, regardless of whether it is posted to Live Spaces or not.

Posting of photos already shows up on the "what's new" page. Downloads of files will likely never show up, for privacy reasons. If you think about it a little, I'm sure you can guess why it may not be wise to broadcast what files you were downloading from shared folders to all your IM buddies and the people on your friends list. As for the rest of the request, thanks for the feedback. We'll keep it in mind for future releases. ;)

PS: If you work for a Microsoft property that would like to show up on the "what's new" page, host it or just wanna plain chat about the feature then give me a shout if interested in the platform or holler at Rob if it's about the user experience.

Categories: Windows Live

October 10, 2007

@ 04:00 AM

Comments [7]

When Databases Lie: Consistency vs. Availability in Distributed Systems

I recently got an email from a developer about my post Thoughts on Amazon's Internal Storage System (Dynamo) which claimed that I seem to be romanticizing distributed systems that favor availability over consistency. He pointed out that although this sounds nice on paper, it places a significant burden on application developers. He is 100% right. This has been my experience in Windows Live and I’ve heard enough second hand to assume it is the experience at Amazon as well when it comes to Dynamo.

I thought an example of how this trade-off affects developers would be a useful exercise and may be of interest to my readers. The following example is hypothetical and should not be construed as describing the internal architectures of any production systems I am aware of.

Scenario: Torsten Rendelmann, a Facebook user in Germany, accepts a friend request from Dare Obasanjo who is a Facebook user in the United States.

The Distributed System: To improve the response times for users in Europe, imagine Facebook has a data center in London while American users are served from a data center in Palo Alto. To achieve this, the user database is broken up in a process commonly described as sharding. The question of if and how data is replicated across both data centers isn’t relevant to this example.

The application developer who owns the confirm_friend_request() method would ideally want to write code of the following form

public void confirm_friend_request(user1, user2){
  begin_transaction();
  update_friend_list(user1, user2, status.confirmed); //palo alto
  update_friend_list(user2, user1, status.confirmed); //london
  end_transaction();
}

Yay, distributed transactions. You have to love a feature that every vendor advises you not to use if you care about performance. So obviously this doesn’t work for a large scale distributed system where performance and availability are important.

Things get particularly ugly when you realize that either data center or the specific server a user’s data is stored on could be unreachable for a variety of reasons (e.g. DoS attack, high seasonal load, drunk sys admin tripped over a power cord, hard drive failure due to cheap components, etc).

There are a number of options one can consider when availability and high performance are considered to be more important than data consistency in the above example. Below are three potential implementations of the code above, each with its own set of trade-offs.

OPTION I: Application developer performs manual rollback on error

public void confirm_friend_request_A(user1, user2){
  try{
    update_friend_list(user1, user2, status.confirmed); //palo alto
  }catch(exception e){
    report_error(e);
    return;
  }
  try{
    update_friend_list(user2, user1, status.confirmed); //london
  }catch(exception e){
    revert_friend_list(user1, user2);
    report_error(e);
    return;
  }
}

The problem here is that we don’t handle the case where revert_friend_list() fails. This means that Dare (user1) may end up having Torsten (user2) on his friend list but Torsten won’t have Dare on his friend list. The database has lied.

OPTION II: Failed events are placed in a message queue to be retried until they succeed.

public void confirm_friend_request_B(user1, user2){
  try{
    update_friend_list(user1, user2, status.confirmed); //palo alto
  }catch(exception e){
    report_error(e);
    add_to_retry_queue(operation.updatefriendlist, user1, user2, current_time());
  }
  try{
    update_friend_list(user2, user1, status.confirmed); //london
  }catch(exception e){
    report_error(e);
    add_to_retry_queue(operation.updatefriendlist, user2, user1, current_time());
  }
}

Depending on how long the error persists and how long an item sits in the message queue, there will be times when Dare (user1) has Torsten (user2) on his friend list but Torsten won’t have Dare on his. The database has lied, again.
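To make that window of inconsistency concrete, here is a minimal Python sketch of the retry-queue approach. Everything in it is an assumption made up for illustration: the in-memory `friend_lists` store, the `shard_of` mapping, the `down_shards` set that simulates an outage, and the `drain_retry_queue` helper. It is not how any real system is built.

```python
from collections import deque

# Hypothetical in-memory stand-ins for the two shards and the retry queue.
friend_lists = {"dare": set(), "torsten": set()}
shard_of = {"dare": "palo_alto", "torsten": "london"}
down_shards = {"london"}  # simulate the London data center being unreachable
retry_queue = deque()


def update_friend_list(owner, friend):
    """Write one side of the friendship; fails while the owner's shard is down."""
    if shard_of[owner] in down_shards:
        raise ConnectionError(shard_of[owner] + " shard unreachable")
    friend_lists[owner].add(friend)


def confirm_friend_request(user1, user2):
    """Option II: attempt both writes and queue whichever one fails."""
    for owner, friend in ((user1, user2), (user2, user1)):
        try:
            update_friend_list(owner, friend)
        except ConnectionError:
            retry_queue.append((owner, friend))


def drain_retry_queue():
    """Replay queued writes once; anything that fails again is re-queued."""
    for _ in range(len(retry_queue)):
        owner, friend = retry_queue.popleft()
        try:
            update_friend_list(owner, friend)
        except ConnectionError:
            retry_queue.append((owner, friend))
```

Between the call to `confirm_friend_request` and a successful `drain_retry_queue`, a reader of the two friend lists sees a one-sided friendship. How long that lasts depends entirely on when the shard comes back and how often the queue is drained.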

OPTION III: System always accepts updates but application developers may have to resolve data conflicts later. (The Dynamo approach)

/* update_friend_list always succeeds but may enqueue an item in the message queue
   to try again later in the event of failure. This failure is not propagated to
   callers of the method. */
public void confirm_friend_request_C(user1, user2){
  update_friend_list(user1, user2, status.confirmed); //palo alto
  update_friend_list(user2, user1, status.confirmed); //london
}

/* get_friends(user1) has to reconcile the results returned by the underlying
   get_friends() call because there may be data inconsistency due to a conflict:
   a change that was applied from the message queue may contradict a subsequent
   change by the user. In this case, status is a bitflag where all conflicts are
   merged and it is up to the app developer to figure out what to do. */
public list get_friends(user1){
  list actual_friends = new list();
  list friends = get_friends(); //raw, unreconciled read
  foreach (friend in friends){
    if(friend.status == friendstatus.confirmed){ //no conflict
      actual_friends.add(friend);
    }else if((friend.status & friendstatus.confirmed) && !(friend.status & friendstatus.deleted)){
      //assume friend is confirmed as long as it wasn’t also deleted
      friend.status = friendstatus.confirmed;
      actual_friends.add(friend);
      update_friend_list(user1, friend, status.confirmed);
    }else{
      //assume deleted if there is a conflict with a delete
      update_friend_list(user1, friend, status.deleted);
    }
  }//foreach
  return actual_friends;
}
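The read-time reconciliation rule can also be written as a small runnable Python sketch. The `FriendStatus` flags (including the extra `PENDING` flag, which is needed to show a conflict that does not involve a delete) and the `reconcile` function are hypothetical names chosen for this illustration. The rule itself is the one described in the pseudocode: a conflict that includes a delete resolves to deleted, any other conflict that includes a confirm resolves to confirmed.

```python
from enum import IntFlag


class FriendStatus(IntFlag):
    # Hypothetical flags; conflicting writes are merged by OR-ing them together.
    CONFIRMED = 1
    DELETED = 2
    PENDING = 4


def reconcile(raw_friends):
    """Resolve merged status flags.

    raw_friends maps a friend's name to its (possibly conflicting) status.
    Returns (confirmed names, repairs), where repairs maps each conflicting
    name to the single status that should be written back to the store.
    """
    actual_friends, repairs = [], {}
    for name, status in raw_friends.items():
        if status == FriendStatus.CONFIRMED:
            actual_friends.append(name)  # no conflict
        elif status & FriendStatus.CONFIRMED and not status & FriendStatus.DELETED:
            # Conflict without a delete: treat the friend as confirmed.
            actual_friends.append(name)
            repairs[name] = FriendStatus.CONFIRMED
        else:
            # Any conflict involving a delete resolves to deleted.
            repairs[name] = FriendStatus.DELETED
    return actual_friends, repairs
```

Note that the choice "delete wins" is a policy decision made by the application developer, not something the storage layer can decide on its own, which is exactly the burden being discussed.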

These are just a few of the many approaches that can be implemented in such a distributed system to get around the performance and availability implications of using distributed transactions. The main problem with them is that in every single case, the application developer has an extra burden placed on his shoulders due to the inherent fragility of distributed systems. For a lot of developers, the shock of this realization is akin to the shock of programming in C++ after using C# or Ruby for a couple of years. Manual memory management? Actually having to perform bounds checking on arrays? Being unable to use decades-old technology like database transactions?

The challenge in building a distributed storage system like BigTable or Dynamo is in balancing the need for high availability and performance while not building a system that encourages all sorts of insidious bugs to exist in the system by design. Some might argue that Dynamo goes too far in the burden that it places on developers while there are others that would argue that it doesn’t go far enough.

In what camp do you fall?

Now playing: R. Kelly - Rock Star (feat. Ludacris & Kid Rock)

Categories: Platforms | Programming | Web Development

October 10, 2007

@ 04:00 AM

Comments [7]

One Protocol to Rule Them All and in the Darkness Bind Them

Yaron Goland has an entertaining post entitled Interoperability Wars - Episode 6 - Part 1 - Revenge of Babble about some of the philosophical discussions we’ve been having at work about the Atom Publishing Protocol (RFC 5023). The entire post is hilarious if you are an XML protocol geek and I recommend reading it. The following excerpt is a good starting point for another discussion about APP’s suitability as a general purpose editing protocol for the Web. Yaron writes

Emperor Babble: Excellent, Weasdel's death will serve us well in lulling the forces of interoperability into thinking they are making progress. Welcome Restafarian, it is time you understood your true place in my plans.

Luke Restafarian: Do what you want to me emperor, but the noble cause of interoperability will prevail!

The Emperor turns to the center of the chamber where a form, almost blinding in its clarity, takes shape:
GET /someuser/profile HTTP/1.1
host: example.com
content-type: application/xml

<profile xmlns="http://example.com">
   <professional>
      <workTitle>…</workTitle>
      …
   </professional>
   <personal>
      <spouseName>
         …
      </spouseName>
      …
   </personal>
   <clothingPreferences>
      <favoriteColors>
         <shirts>…</shirts>
         …
      </favoriteColors>
      …
   </clothingPreferences>
   …
</profile>
Darth Sudsy is momentarily taken aback from the appearance of the pure interoperable data while Luke Restafarian seems strengthened by it. The Emperor turns back to Luke and then speaks.

Emperor Babble: I see it strengthens you Restafarian, no matter. But look again at your blessed interoperability.

When the Emperor turns back to the form the form begins to morph, growing darker and more sinister:
GET /someuser/profileFeed HTTP/1.1
host: example.com
content-type: application/ATOM+xml

<feed xmlns="http://www.w3.org/2005/Atom">
   <title/>
   <updated>2000-01-01T00:00:00Z</updated>
   <author>
      <name/>
   </author>
   <id>http://www.example.com/someuser/profileFeed</id>
   <category scheme="http://example.com/categories" term="Profile"/>
   <entry>
      <title/>
      <id>http://www.example.com/someuser/profilesFeed/professional</id>
      <updated>2000-01-01T00:00:00Z</updated>
      <content type="Application/XML" xmlns:e="http://example.com">
         <workTitle>…</workTitle>
         …
      </content>
   </entry>
   <entry>
      <title/>
      <id>http://www.example.com/someuser/profilesFeed/personal</id>
      <updated>2000-01-01T00:00:00Z</updated>
      <content type="Application/XML" xmlns:e="http://example.com">
         <spouseName>
            …
         </spouseName>
         …
      </content>
   </entry>
   <entry>
      <title/>
      <id>http://www.example.com/someuser/profilesFeed/clothingPreferences</id>
      <updated>2000-01-01T00:00:00Z</updated>
      <content type="Application/XML" xmlns:e="http://example.com">
         <favoriteColors>
            <shirts>…</shirts>
            …
         </favoriteColors>
         …
      </content>
   </entry>
</feed>
Luke, having recognized the syntax, clearly expects to get another wave of strength but instead he feels sickly. The emperor, looking slightly stronger, turns from the form to look at Luke.

Luke Restafarian: What have you done? That structure is clearly taken from Princess Ape-Pea's system, she is a true follower of interoperability, I should be getting stronger but somehow it's making me ill.

Emperor Babble: You begin to understand young Restafarian. Used properly Princess Ape-Pea's system does indeed honor all the qualities of rich interoperability. But look more carefully at this particular example of her system. Is it not beautiful? Its needless complexity, its useless elements, its bloated form, they herald true incompatibility. No developer will be able to manually deal with such an incomprehensible monstrosity. We have taken your pure interoperable data and hidden it in a mud of useless scaffolding. Princess Ape-Pea and your other minions will accept this poison as adhering to your precious principles of interoperability but in point of fact by turning Princess Ape-Pea's system into a generic structured data manipulation protocol we have forced the data to contort into unnatural forms that are so hard to read, so difficult to understand that no programmer will ever deal with it directly. We will see no repeat of those damned Hit-Tip Knights building their own stacks in a matter of hours in order to enable basic interoperability. In this new world even the most trivial data visualizations and manipulations become a nightmare. Instead of simple transversals of hierarchical structures programmers will be forced into a morass of categories and artificial entry containers. Behold!

Yaron’s point is that since Atom is primarily designed for representing streams of microcontent, the only way to represent other types of data in Atom is to tunnel them as custom XML formats or proprietary extensions to Atom. At this point you’ve added an unnecessary layer of complexity.

The same thing applies to the actual Atom Publishing Protocol. The current design requires clients to use optimistic concurrency to handle conflicts on updates which seems like unnecessary complexity to push to clients as opposed to a “last update wins” scheme. Unfortunately, APP’s interaction model doesn’t support granular updates which means such a scheme isn’t supported by the current design. A number of APP experts have realized this deficiency as you can see from James Snell of IBM’s post entitled Beyond APP - Partial updates and John Panzer of Google’s post entitled RESTful partial updates: PATCH+Ranges.
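The difference between the two schemes, and why whole-document updates make “last update wins” lossy, can be sketched in a few lines of Python. The `Resource` class, its `version` counter (standing in for something like an HTTP ETag) and the `Conflict` exception are assumptions invented for this illustration. They are not part of APP or of any real library.

```python
class Conflict(Exception):
    """Raised when a conditional update is based on a stale version."""


class Resource:
    """Hypothetical server-side entry that is always replaced as a whole."""

    def __init__(self, body):
        self.body = body
        self.version = 1  # plays the role of an entity tag

    def put_if_match(self, body, expected_version):
        """Optimistic concurrency: reject the write if the version has moved on."""
        if expected_version != self.version:
            raise Conflict("stale version %d, current is %d" % (expected_version, self.version))
        self.body = body
        self.version += 1

    def put(self, body):
        """Last update wins: overwrite unconditionally."""
        self.body = body
        self.version += 1


entry = Resource({"title": "Draft", "content": "Hello"})

# Two clients fetch the same version and each edit a different field.
client_a, version_a = dict(entry.body), entry.version
client_b, version_b = dict(entry.body), entry.version
client_a["title"] = "Final"
client_b["content"] = "Hello, world"

entry.put_if_match(client_a, version_a)      # succeeds
try:
    entry.put_if_match(client_b, version_b)  # stale, so the client must re-fetch and merge
    b_rejected = False
except Conflict:
    b_rejected = True

entry.put(client_b)  # unconditional full replacement silently drops client A's title
```

With only full-document replacement available, the unconditional `put` throws away client A's title change even though the two clients touched different fields. A granular update that sent only the changed field would let both edits survive, which is the gap the partial-update proposals are trying to fill.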

A potential counter argument that can be presented when pointing out these deficiencies of the Atom Publishing Protocol when used outside its core scenarios of microcontent publishing is that Google exposes all their services using GData which is effectively APP. This is true, but there are a couple of points to consider

Google had to embrace and extend the Atom format with several proprietary extensions to represent data that was not simply microcontent.
APP experts do not recommend embracing and extending the Atom format the way Google has done since it obviously fragments interoperability. See Joe Gregorio’s post entitled In which we narrowly save Dare from inventing his own publishing protocol and James Snell’s post entitled Silly for more details.
Such practices lead to a world where we have applications that can only speak Google’s flavor of Atom. I remember when it used to be a bad thing when Microsoft did this but for some reason Google gets a pass.
Google has recognized the additional complexity they’ve created and now ship GData client libraries for several popular platforms which they recommend over talking directly to the protocol. I don’t know about you but I don’t need a vendor specific client library to process RSS feeds or access WebDAV resources, so why do I need one to talk to Google’s APP implementation?

It seems that while we weren’t looking, Google moved us a step away from a world of simple, protocol-based interoperability on the Web to one based on running the right platform with the right libraries.

Usually I wouldn’t care about whatever bad decisions the folks at Google are making with their API platform. However, the problem is that it sends out the wrong message to other Web companies that are building Web APIs. The message that it’s all about embracing and extending Internet standards, with interoperability being based on everyone running sanctioned client libraries instead of on simple, RESTful protocols, is harmful to the Internet. Unfortunately, this harkens back to the bad old days of Microsoft and I’d hate for us to begin a race to the bottom in this arena.

On the other hand, arguments about XML data formats and RESTful protocols are all variations of arguing about what color to paint the bike shed. At the end of the day, the important things are (i) building a compelling end user service and (ii) exposing compelling functionality via APIs to that service. The Facebook REST APIs are a clusterfuck of inconsistency while the Flickr APIs are impressive in how they push irrelevant details of the internals of the service into the developer API (NSIDs anyone?). However both of these APIs are massively popular.

From that perspective, what Google has done with GData is smart in that by standardizing on it even though it isn’t the right tool for the job, they’ve skipped the sorts of ridiculous what-color-to-paint-the-bike-shed discussions that prompted Yaron to write his blog post in the first place. ;)

With that out of the way they can focus on building compelling Web services and exposing interesting functionality via their APIs. By the way, am I the last person to find out about the YouTube GData API?

Now playing: DJ Khaled - I'm So Hood (feat. T-Pain, Trick Daddy, Rick Ross & Plies)

Categories: Platforms | XML Web Services

October 9, 2007

@ 02:47 PM

Comments [14]

Wedding Weekend and Honeymoon Pictures

Jenna has uploaded pictures from our wedding weekend in Las Vegas and our honeymoon in Puerto Vallarta to her Windows Live space. Below are a couple of entry points into the photo stream.

Signing the marriage licence

The blushing bride

On the way to the after party

One of the many amazing sunsets we saw in Mexico

We got to release baby turtles into the ocean

Nothing beats a pool with a beach view

These are the pictures that we took ourselves. The pics from the professionals capturing the wedding day and reception will show up in a couple of weeks.

Now playing: Jagged Edge - Let's Get Married (remix) (feat. Run DMC)

Categories: Personal

October 6, 2007

@ 04:00 AM

Comments [2]

Thoughts on Amazon's Internal Storage System (Dynamo)

Werner Vogels, CTO of Amazon, has a blog post entitled Amazon's Dynamo which contains the HTML version of an upcoming paper entitled Dynamo: Amazon’s Highly Available Key-value Store which describes a highly available, distributed storage system used internally at Amazon.

The paper is an interesting read and a welcome addition to the body of knowledge about building megascale distributed storage systems. I particularly like that it isn’t simply another GFS or BigTable, but is unique in its own right. Hopefully, this will convince folks that just because Google were first to publish papers about their internal infrastructure doesn’t mean that what they’ve done is the bible of building megascale distributed systems. Anyway, on to some of the juicy bits

Traditionally production systems store their state in relational databases. For many of the more common usage patterns of state persistence, however, a relational database is a solution that is far from ideal. Most of these services only store and retrieve data by primary key and do not require the complex querying and management functionality offered by an RDBMS. This excess functionality requires expensive hardware and highly skilled personnel for its operation, making it a very inefficient solution. In addition, the available replication technologies are limited and typically choose consistency over availability. Although many advances have been made in the recent years, it is still not easy to scale-out databases or use smart partitioning schemes for load balancing.

Although I work for a company that sells a relational database product, I think it is still fair to say that there is a certain level of scale where practically every feature traditionally associated with an RDBMS works against you.

Luckily, there are only a handful of companies and Web services in the world that need to operate at that scale.

2.1 System Assumptions and Requirements

The storage system for this class of services has the following requirements:

Query Model: simple read and write operations to a data item that is uniquely identified by a key. State is stored as binary objects (i.e., blobs) identified by unique keys. No operations span multiple data items and there is no need for relational schema. This requirement is based on the observation that a significant portion of Amazon’s services can work with this simple query model and do not need any relational schema. Dynamo targets applications that need to store objects that are relatively small (usually less than 1 MB).

ACID Properties: ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction. Experience at Amazon has shown that data stores that provide ACID guarantees tend to have poor availability. This has been widely acknowledged by both the industry and academia [5]. Dynamo targets applications that operate with weaker consistency (the “C” in ACID) if this results in high availability. Dynamo does not provide any isolation guarantees and permits only single key updates.

Other Assumptions: Dynamo is used only by Amazon’s internal services. Its operation environment is assumed to be non-hostile and there are no security related requirements such as authentication and authorization. Moreover, since each service uses its distinct instance of Dynamo, its initial design targets a scale of up to hundreds of storage hosts. We will discuss the scalability limitations of Dynamo and possible scalability related extensions in later sections.

Lots of worthy items to note here. The first is that you can get a lot of traction out of a simple data structure such as a hash table. Specifically, as noted by Sam Ruby in his post Key + Data, accessing data by key instead of using complex queries is becoming a common pattern in large scale distributed storage systems. Sam actually missed pointing out that Google’s Bigtable is another example of this trend given that data items within it are accessed using the tuple {row key, column key, timestamp} instead of being queried using data manipulation language.
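As a rough illustration of how little interface this access pattern needs, here is a Python sketch of the two addressing styles mentioned above. The `KeyValueStore` and `VersionedTable` classes and their method names are hypothetical. They only mimic the shape of the lookups (opaque blob by key, and cell by row key, column key and timestamp) and say nothing about how Dynamo or Bigtable are implemented.

```python
class KeyValueStore:
    """Dynamo-style shape: opaque blobs, read and written only by key."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, blob):
        self._blobs[key] = blob

    def get(self, key):
        return self._blobs.get(key)


class VersionedTable:
    """Bigtable-style shape: a cell is addressed by (row key, column key, timestamp)."""

    def __init__(self):
        self._cells = {}

    def write(self, row, column, timestamp, value):
        self._cells[(row, column, timestamp)] = value

    def read(self, row, column, timestamp):
        return self._cells.get((row, column, timestamp))

    def read_latest(self, row, column):
        """Return the value with the highest timestamp for this row and column."""
        stamps = [t for (r, c, t) in self._cells if r == row and c == column]
        return self._cells[(row, column, max(stamps))] if stamps else None
```

In both cases the caller must already know the key. There is no way to ask "which rows have a column equal to X" without building and maintaining a separate index, which is the functionality being traded away.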

Another interesting thing, from my perspective, is that they’ve gotten around the scaling limit of hundreds of storage hosts per instance by having different teams at Amazon run their own instances of Dynamo. Then again, there are 200 clusters of GFS running at Google, so this is probably common sense as well.

4.4 Data Versioning

Dynamo provides eventual consistency, which allows for updates to be propagated to all replicas asynchronously. A put() call may return to its caller before the update has been applied at all the replicas, which can result in scenarios where a subsequent get() operation may return an object that does not have the latest updates. If there are no failures then there is a bound on the update propagation times. However, under certain failure scenarios (e.g., server outages or network partitions), updates may not arrive at all replicas for an extended period of time.

There is a category of applications in Amazon’s platform that can tolerate such inconsistencies and can be constructed to operate under these conditions. For example, the shopping cart application requires that an “Add to Cart” operation can never be forgotten or rejected. If the most recent state of the cart is unavailable, and a user makes changes to an older version of the cart, that change is still meaningful and should be preserved. But at the same time it shouldn’t supersede the currently unavailable state of the cart, which itself may contain changes that should be preserved. Note that both “add to cart” and “delete item from cart” operations are translated into put requests to Dynamo. When a customer wants to add an item to (or remove from) a shopping cart and the latest version is not available, the item is added to (or removed from) the older version and the divergent versions are reconciled later.

In order to provide this kind of guarantee, Dynamo treats the result of each modification as a new and immutable version of the data. It allows for multiple versions of an object to be present in the system at the same time. Most of the time, new versions subsume the previous version(s), and the system itself can determine the authoritative version (syntactic reconciliation). However, version branching may happen, in the presence of failures combined with concurrent updates, resulting in conflicting versions of an object. In these cases, the system cannot reconcile the multiple versions of the same object and the client must perform the reconciliation in order to collapse multiple branches of data evolution back into one (semantic reconciliation). A typical example of a collapse operation is “merging” different versions of a customer’s shopping cart. Using this reconciliation mechanism, an “add to cart” operation is never lost. However, deleted items can resurface.

Fascinating. I can just imagine how scary this must sound to RDBMS heads: instead of the database enforcing the rules of consistency, it just keeps multiple versions of the “row” around and then asks the client to figure out which is which if there were multiple updates that couldn’t be reconciled.
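The two kinds of reconciliation can be sketched in Python. The paper uses vector clocks to tell whether one version subsumes another. The code below is a simplified assumption of how that could look, with made-up names (`descends`, `reconcile`, `merge_carts`) and node labels "A" and "B". It is not Amazon's implementation.

```python
def descends(a, b):
    """True if vector clock a has seen every update recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(versions, merge):
    """Collapse a list of (clock, value) versions into a single version.

    Versions subsumed by a newer one are dropped (syntactic reconciliation).
    If several survive, they conflict and the caller-supplied merge function
    decides the result (semantic reconciliation).
    """
    survivors = [
        (clock, value)
        for clock, value in versions
        if not any(other != clock and descends(other, clock) for other, _ in versions)
    ]
    if len(survivors) == 1:
        return survivors[0]
    merged_clock = {}
    for clock, _ in survivors:
        for node, count in clock.items():
            merged_clock[node] = max(merged_clock.get(node, 0), count)
    return merged_clock, merge([value for _, value in survivors])


def merge_carts(carts):
    """Shopping cart merge: take the union so no added item is ever lost."""
    return set().union(*carts)


original = ({"A": 1}, {"book", "dvd"})
after_delete = ({"A": 2}, {"book"})                # node A removes the dvd
stale_add = ({"A": 1, "B": 1}, {"book", "dvd", "cd"})  # node B adds a cd to the old cart

clock, cart = reconcile([original, after_delete, stale_add], merge_carts)
```

Here `original` is subsumed by both later versions and dropped, but `after_delete` and `stale_add` are concurrent, so the union merge runs. The added cd survives, and the deleted dvd comes back, which is the "deleted items can resurface" behavior the paper admits to.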

The folks at Amazon have taken acknowledgement of the CAP Conjecture to its logical extreme. Consistency, Availability, and Partition-tolerance. Pick two.

There’s lots of other interesting stuff in the paper but I’ll save some for you to read and end my excerpts here. This will make great bedtime reading this weekend.

Now playing: Geto Boys - My Mind's Playin Tricks On Me

Categories:

October 6, 2007

@ 04:00 AM

Comments [0]

OAuth 1.0 is Here - Delegated Authority Comes to Mashups

According to blog posts like A Flood of Mashups Coming? OAuth 1.0 Released and John Musser’s OAuth Spec 1.0 = More Personal Mashups?, it looks like the OAuth specification has reached its final draft.

This is good news because the need for a standardized mechanism for users to give applications permission to access their data or act on their behalf has been obvious for a while. The most obvious manifestation of this is all the applications that ask for your username and password so they can retrieve your contact list from your email service provider.

So what exactly is wrong with applications like the ones shown below?

meebo

spock

The problem with these applications [which OAuth solves] is that when I give them my username and password, I’m not only giving them access to my address book but also access to

my blog posts and all my photos (http://spaces.live.com)
my travel history (http://www.expedia.com)
my search history (http://www.google.com/psearch)
my personal email (http://www.hotmail.com)
my medical information (http://www.healthvault.com)
my business documents (http://www.officelive.com)
my personal documents (http://docs.google.com)
my purchase history (https://checkout.google.com)
and so on…

because all of those services use the same credentials. Sounds scary when put in those terms, doesn’t it?

OAuth allows a service provider (like Google or Yahoo!) to expose an interface that allows their users to give applications permission to access their data while not exposing their login credentials to these applications. As I’ve mentioned in the past, this standardizes the kind of user-centric API model that is utilized by Web services such as the Windows Live Contacts API, Google AuthSub and the Flickr API to authenticate and authorize applications to access a user’s data.

The usage flow end users can expect from OAuth enabled applications is as follows.

1. The application or Web site informs the user that it is about to direct the user to the service provider’s Web site to grant it permission.

2. The user is then directed to the service provider’s Web site with a special URL that contains information about the requesting application. The user is prompted to log in to the service provider’s Web site to verify their identity.

3. The user grants the application permission.

4. The application gets access to the user’s data and the user never had to hand over their username and password to some random application which they might not trust.
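The four steps above can be sketched as a toy, in-memory simulation in Python. This is not the OAuth wire protocol: the `ServiceProvider` class, its method names, the user "alice" and her data are all hypothetical, and the request signing with a consumer key and secret that the real spec requires is left out. The only point is to show that the password is checked by the provider and never passes through the application.

```python
import secrets


class ServiceProvider:
    """Toy stand-in for a provider such as an email service (hypothetical)."""

    def __init__(self):
        self._passwords = {"alice": "s3cret"}
        self._contacts = {"alice": ["bob@example.com", "carol@example.com"]}
        self._request_tokens = {}   # request token -> {"app": ..., "user": ...}
        self._access_tokens = {}    # access token -> username

    def issue_request_token(self, app_name):
        # Steps 1-2: the application obtains a token to embed in the redirect URL.
        token = secrets.token_hex(8)
        self._request_tokens[token] = {"app": app_name, "user": None}
        return token

    def user_grants(self, token, username, password):
        # Steps 2-3: the user logs in on the provider's own site and approves.
        if self._passwords.get(username) != password:
            raise PermissionError("bad credentials")
        self._request_tokens[token]["user"] = username

    def exchange(self, token):
        # Step 4: the application swaps the approved token for an access token.
        entry = self._request_tokens.get(token)
        if entry is None or entry["user"] is None:
            raise PermissionError("user has not granted permission")
        del self._request_tokens[token]
        access = secrets.token_hex(8)
        self._access_tokens[access] = entry["user"]
        return access

    def get_contacts(self, access_token):
        return list(self._contacts[self._access_tokens[access_token]])


provider = ServiceProvider()

# The application side only ever handles tokens.
request_token = provider.issue_request_token("contact-importer")

# This call represents the user typing the password into the provider's page.
provider.user_grants(request_token, "alice", "s3cret")

access_token = provider.exchange(request_token)
contacts = provider.get_contacts(access_token)
print(contacts)
```

Because the application holds a token rather than the password, the provider is in a position to limit what that token can reach, which is exactly what the username-and-password approach cannot do.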

I’ve read the final draft of the OAuth 1.0 spec and it seems to have done away with some of the worrisome complexity I’d seen in earlier drafts (i.e. single use and multi-use tokens). Great work by all those involved.

I never had time to participate in this effort but it looks like I wouldn’t have had anything to add. I can’t wait to see this begin to get deployed across the Web.

Now playing: Black Eyed Peas - Where is the Love (feat. Justin Timberlake)

Categories: Web Development | XML Web Services

October 5, 2007

@ 04:00 AM

Comments [9]

On the Release of the Source Code of the .NET Framework Libraries

Scott Guthrie has a blog post entitled Releasing the Source Code for the .NET Framework Libraries where he writes

One of the things my team has been working to enable has been the ability for .NET developers to download and browse the source code of the .NET Framework libraries, and to easily enable debugging support in them.

Today I'm excited to announce that we'll be providing this with the .NET 3.5 and VS 2008 release later this year.

We'll begin by offering the source code (with source file comments included) for the .NET Base Class Libraries (System, System.IO, System.Collections, System.Configuration, System.Threading, System.Net, System.Security, System.Runtime, System.Text, etc), ASP.NET (System.Web), Windows Forms (System.Windows.Forms), ADO.NET (System.Data), XML (System.Xml), and WPF (System.Windows). We'll then be adding more libraries in the months ahead (including WCF, Workflow, and LINQ). The source code will be released under the Microsoft Reference License (MS-RL).

This is one of those announcements I find hard to get excited about. Any developer who’s been frustrated by the weird behavior of a .NET Framework class and has wanted to look at its code should already know about Lutz Roeder’s Reflector, which is well known in the .NET developer community. So I’m not sure who this announcement is actually intended to benefit.

On the other hand, I’m sure Java developers are having a chuckle at our expense that it took this long for Microsoft to allow developers to see the source code for ArrayList.Count so we can determine if it is lazily or eagerly evaluated.

Oh well, better late than never.

PS: The ability to debug into .NET Framework classes will be nice. I’ve wanted this more than once while working on RSS Bandit and will definitely take advantage of it if I ever get around to installing VS 2008.

Now playing: TLC - Somethin Wicked This Way Comes (feat. Andre 3000)

Categories: Programming

October 4, 2007

@ 04:00 AM

Comments [6]

Joel Spolsky on Why the Facebook Platform is the Future of the Web

This is another post I was planning to write a few weeks ago which got interrupted by my wedding and honeymoon.

A few weeks ago, Joel Spolsky wrote a post entitled Strategy Letter VI which I initially dismissed as the ravings of a desktop developer who is trying to create an analogy when one doesn’t exist. The Web isn’t the desktop, or didn’t he read There is no Web Operating System (or WebOS)? By the second time I read it, I realized that if you ignore some of the desktop-centric thinking in Joel’s article, then not only is Joel’s article quite insightful but some of what he wrote is already coming to pass.

The relevant excerpt from Joel’s article is

Somebody is going to write a compelling SDK that you can use to make powerful Ajax applications with common user interface elements that work together. And whichever SDK wins the most developer mindshare will have the same kind of competitive stronghold as Microsoft had with their Windows API.

If you’re a web app developer, and you don’t want to support the SDK everybody else is supporting, you’ll increasingly find that people won’t use your web app, because it doesn’t, you know, cut and paste and support address book synchronization and whatever weird new interop features we’ll want in 2010.

Imagine, for example, that you’re Google with GMail, and you’re feeling rather smug. But then somebody you’ve never heard of, some bratty Y Combinator startup, maybe, is gaining ridiculous traction selling NewSDK,

…

And while you’re not paying attention, everybody starts writing NewSDK apps, and they’re really good, and suddenly businesses ONLY want NewSDK apps, and all those old-school Plain Ajax apps look pathetic and won’t cut and paste and mash and sync and play drums nicely with one another. And Gmail becomes a legacy. The WordPerfect of Email. And you’ll tell your children how excited you were to get 2GB to store email, and they’ll laugh at you. Their nail polish has more than 2GB.

Crazy story? Substitute “Google Gmail” with “Lotus 1-2-3”. The NewSDK will be the second coming of Microsoft Windows; this is exactly how Lotus lost control of the spreadsheet market. And it’s going to happen again on the web because all the same dynamics and forces are in place. The only thing we don’t know yet are the particulars, but it’ll happen

A lot of stuff Joel asserts seems pretty clueless on the face of it. Doesn’t he realize that there are umpteen billion AJAX toolkits (e.g. Dojo, Google Web Toolkit, Yahoo! User Interface Library, Script.aculo.us, etc) and rich internet application platforms (e.g. Flash, Silverlight, XUL, etc)? Doesn’t he realize that there isn’t a snowball’s chance in hell of the entire Web conforming to standard user interface guidelines let alone everyone agreeing on using the same programming language and SDK to build Web apps?

But wait…

What happens if you re-read the above excerpt and substitute NewSDK with Facebook platform?

I didn’t classify Facebook as a Social Operating System for no reason. GMail and other email services have become less interesting to me because I primarily communicate with friends and family on the Web via Facebook and its various platform applications. I’ve stopped playing casual games at Yahoo! Games and now use Scrabulous and Texas Hold ‘Em when I want to idle some time away on the weekend. All of these applications are part of a consistent user interface, are all accessible from my sidebar and each of them has access to my data within Facebook including my social graph. Kinda like how Windows or Mac OS X desktop applications on my machine have a consistent user interface, are all accessible from my applications menu and can all access the data on my hard drive.

Hmmm…

I suspect that Joel is right about NewSDK, he’s just wrong about which form it will take. “Social operating system” does have a nice ring to it, doesn’t it?

Now playing: Kanye West - Two Words (feat. Mos Def, Freeway & The Harlem Boys Choir)

Categories: Platforms | Web Development

October 4, 2007

@ 04:00 AM

Comments [1]

Facebook Hates Fakesters Too

Mini-Microsoft has a blog post up to let us know that his Facebook account was cancelled. In the comments he clarifies he wasn’t specifically targeted and this is just part of the Facebook terms of service. He writes

For those who probably will never see this Facebook help-topic, this is what I've been directed to:

http://www.facebook.com/help.php?page=45

The only relevant text that I can find:

"Facebook does not allow users to register with fake names, to impersonate any person or entity, or to falsely state or otherwise misrepresent themselves or their affiliations."

I imagine they only do something when someone complains vs. being constantly policing things. And someone out there (scanning the crowd of exceptionally good looking people who visit here) must have taken it upon themselves to complain.

I didn’t realize that if I don’t provide 100% accurate data about myself (thus making identity theft easier) I could get my account banned from Facebook.

I can understand why they want to encourage people to use real names since they want to be the kind of place that has users like “Dare Obasanjo” and “Robert Scoble” not ‘carnage4life’ and ‘scobleizer’ since the former implies a more personal experience.

However it seems dumb to be trying to replicate Friendster’s mistake by killing off every account that didn’t conform to their standards. There are ways to encourage such behavior without being jerks as they’ve clearly been in this case.

Now playing: Dem Franchize Boyz - Oh I Think They Like Me (remix) (feat. Jermaine Dupri, Da Brat & Lil Bow Wow)

Categories: Competitors/Web Companies | Social Software

October 4, 2007

@ 04:00 AM

Comments [2]

"Office is Dead" and Other Obvious Trends

Yesterday morning, I tossed out a hastily written post entitled It Must Be a Fun Time to Work on Microsoft Office which seems to have been misread by some folks based on some of the comments I’ve seen on my blog and in other places. So further exposition of some of the points in that post seems necessary.

First of all, there’s the question of who I was calling stupid when talking about the following announcements

Google announcing the launch of Presently, their Web-based PowerPoint clone. Interestingly enough, one would have expected presentation software to be the most obvious application to move to the Web first instead of the last.
Yahoo! announcing the purchase of Zimbra, a developer of a Web-based office productivity and collaboration suite.
Microsoft announcing that it would integrate Web-based storage and collaboration into its desktop office productivity suite.
IBM announcing that it would ship its own branded version of an Open Source clone of Microsoft’s desktop productivity suite.

Given that three of these announcements are about embracing the Web and the last one is about building disconnected desktop software, I assumed it was obvious who was jumping on a dying paradigm while the rest of the industry has already moved towards the next generation. To put this another way, James Robertson’s readers were right that I was talking about IBM.

There is something I did want to call out about James Robertson’s post. He wrote

People have moved on to the 80% solution that is the web UI, because the other advantages outweigh that loss of "richness".

I don’t believe that statement when it comes to office productivity software. I believe that the advantages of leveraging the Web are clear. From my perspective

universal access to my data from any device or platform
enabling collaboration with “zero install” requirements on collaborators

are clear advantages that Web-based office productivity software has over disconnected desktop software.

It should be noted that neither of these advantages requires that the user interface is Web-based or that it is rich (i.e. AJAX or Flash if it is Web-based). Both of these things help but they aren’t a hard requirement.

What is important is universal access to my data via the Web. I don’t have an iPhone because I’m hooked on my Windows Mobile device, thanks to the rich integration it has with my work email, calendar and tasks list. The applications on my phone aren’t Web-based, they are the equivalent of “desktop applications” for my phone. Secondly, I didn’t have to install them because they were already on my phone [actually I did have to install Oxios ToDo List but that’s only because the out-of-the-box task list synchronization in Windows Mobile 5 was less than optimal for my needs].

I used to think that having a Web-based interface was also inevitable but that position softened once I realized that you’ll need offline support which means building support for local storage + synchronization into the application (e.g. Google Reader's offline mode) to truly hit the 80/20 point for most people given how popular laptops are these days. However once you’ve built that platform, the same storage and synchronization engine could be used by a desktop application as well.

In that case, either way I get what I want. So desktop vs. Web-based UI doesn’t matter since they both have to stretch themselves to meet my needs. But it is probably a shorter jump to Web-enable the desktop applications than it is to offline-enable the Web applications.
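The shared storage and synchronization engine described above could be sketched roughly as follows. This is a deliberately naive Python illustration: the `SyncEngine` class is hypothetical, the "server" is just a dictionary, and conflicts are resolved by letting the last writer win, which a real product would need to handle far more carefully.

```python
class SyncEngine:
    """Hypothetical local store that queues edits while offline."""

    def __init__(self):
        self.local = {}      # data available with or without a connection
        self.pending = []    # edits not yet sent to the server

    def write(self, key, value):
        # Edits always land locally first, so the UI never blocks on the network.
        self.local[key] = value
        self.pending.append((key, value))

    def sync(self, server):
        # Replay queued edits against the server copy, then pull its state.
        for key, value in self.pending:
            server[key] = value
        self.pending.clear()
        self.local.update(server)


server = {"agenda": "draft"}   # stand-in for Web-based storage

engine = SyncEngine()
# A desktop UI and a Web UI would both call the same two methods.
engine.write("notes", "written on a plane")   # edit made while offline
engine.sync(server)                            # connection is back
print(server["notes"], "|", engine.local["agenda"])
```

Nothing in the engine cares which kind of front end is calling it, which is why the choice between a desktop and a Web-based UI stops mattering once this layer exists.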

Now playing: Playa Fly - Feel Me

Categories: Competitors/Web Companies | Technology

