November 5, 2008
@ 06:10 PM

It feels good to realize that I can tell him that he can grow up to be president with a straight face. Congratulations to Team Obama.

Now Playing: Young Jeezy - My President (Feat. Nas)


 

Categories: Personal

Disclaimer: What follows are my personal impressions from using the beta version of Windows Azure. It is not meant to be an official description of the project from Microsoft; you can find that here.

Earlier this week I scored an invite to try out the beta version of Windows Azure, a new hosted services (aka cloud computing) platform from Microsoft. Since there's been a ridiculous amount of press about the project, I wanted to actually try it out by developing and deploying some code on the platform and then share my experiences with others.

What is it?

Before talking about a cloud computing platform, it is useful to agree on definitions of the term cloud computing. Tim O'Reilly has an excellent post entitled Web 2.0 and Cloud Computing where he breaks the technologies typically described as cloud computing into three broad categories

  1. Utility Computing: In this approach, a vendor provides access to virtual server instances where each instance runs a traditional server operating system such as Linux or Windows Server. Computation and storage resources are metered and the customer can "scale infinitely" by simply creating new server instances. The most popular example of this approach is Amazon EC2.
  2. Platform as a Service: In this approach, a vendor abstracts away the notion of accessing traditional LAMP or WISC stacks from their customers and instead provides an environment for running programs written using a particular platform. In addition, data storage is provided via a custom storage layer and API instead of traditional relational database access. The most popular example of this approach is Google App Engine.
  3. Cloud-based end user applications: This typically refers to Web-based applications that have previously been provided as desktop or server based applications. Examples include Google Docs, Salesforce and Hotmail. Technically every Web application falls under this category, however the term often isn't used that inclusively.

With these definitions clearly stated, it is easier to talk about what Windows Azure is and is not. Windows Azure is currently #2, a Platform as a Service offering. Although there have been numerous references to Amazon's offerings both by Microsoft and by bloggers covering the Azure announcements, Windows Azure is not a utility computing offering [as defined above].

There has definitely been some confusion about this as evidenced by Dave Winer's post Microsoft's cloud strategy? and commentary from other sources.

Getting Started

To try out Azure you need to be running Windows Server 2008 or Windows Vista with a bunch of prerequisites you can get from running the Microsoft Web Platform installer. Once you have the various prerequisites installed (SQL Server, IIS 7, .NET Framework 3.5, etc) you should then grab the Windows Azure SDK. Users of Visual Studio will also benefit from grabbing the Windows Azure Tools for Visual Studio.

After this process, you should be able to fire up Visual Studio and see the option to create a Cloud Service if you go to File->New->Project.

Building Cloud-based Applications with Azure

The diagram below taken from the Windows Azure SDK shows the key participants in a typical Windows Azure service

The work units that make up a Windows Azure hosted service can have one of two roles. A Web role is an application that listens for and responds to Web requests, while a Worker role is a background processing task which acts autonomously but cannot be accessed over the Web. A Windows Azure application can have multiple instances of Web and Worker roles that make up the service. For example, if I were developing a Web-based RSS reader I would need a Worker role for polling feeds and a Web role for displaying the UI that the user interacts with. Both Web and Worker roles are .NET applications that can be developed locally and then deployed on Microsoft's servers when they are ready to go.

Azure applications have access to a storage layer that provides the following three storage services

  • Blob Storage: This is used for storing binary data. A user account can have one or more containers, which in turn can contain one or more blobs of binary data. Containers cannot be nested, so one cannot create hierarchical folder structures. However, Azure lets applications work around this by (i) supporting queries on a container that match blob names by prefix and (ii) accepting delimiters such as '\' and other path characters in blob names. So I can create blobs with names like 'mypics\wife.jpg' and 'mypics\son.jpg' in the media container and then query for blobs beginning with 'mypics\', thus simulating a folder hierarchy somewhat.

  • Queue Service: This is a straightforward message queuing service. A user account can have one or more queues; clients add items to the end of a queue and remove items from the front. Items have a maximum time-to-live of 7 days within the queue. When an item is retrieved from the queue, an associated 'pop receipt' is provided. The item is then hidden from other client applications until some interval (by default 30 seconds) has passed, after which the item becomes visible again. The item can be deleted from the queue during that interval if the pop receipt from when it was retrieved is provided as part of the DELETE operation. The queue service is valuable as a way for Web roles to talk to Worker roles and vice versa.

  • Table Storage: This exposes a subset of the capabilities of the ADO.NET Data Services Framework (aka Astoria). In general, this is a schema-less table based model similar to Google's BigTable and Amazon's SimpleDB. The data model consists of tables and entities (aka rows). Each entity has a primary key made of two parts {PartitionKey, RowKey}, a last modified timestamp and an arbitrary number of user-defined properties. Properties can be one of several primitive types including integers, strings, doubles, long integers, GUIDs, booleans and binary. Like Astoria, the Table Storage service supports performing LINQ queries on rows but only supports the FROM, WHERE and TAKE operators. Other differences from Astoria are that it doesn't support batch operations and that it isn't possible to retrieve individual properties from an entity without retrieving the entire entity.
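To make the blob naming trick above concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the Windows Azure SDK or its REST API: the BlobContainer class, its put and list methods, and the 'videos\birthday.wmv' blob are all made up for illustration. The only thing it tries to show is that a flat set of names plus prefix matching is enough to fake folders.

```python
class BlobContainer:
    """Toy in-memory stand-in for a flat blob container (not the Azure API)."""

    def __init__(self, name):
        self.name = name
        self._blobs = {}  # blob name -> bytes; names are flat, nothing is nested

    def put(self, blob_name, data):
        # Path characters such as '\' are just ordinary characters in the name.
        self._blobs[blob_name] = data

    def list(self, prefix=""):
        # Prefix matching on the flat name is what simulates a folder listing.
        return sorted(n for n in self._blobs if n.startswith(prefix))


media = BlobContainer("media")
media.put("mypics\\wife.jpg", b"...")
media.put("mypics\\son.jpg", b"...")
media.put("videos\\birthday.wmv", b"...")  # hypothetical extra blob

mypics = media.list(prefix="mypics\\")
print(mypics)
```

Listing with the 'mypics\' prefix returns only the two picture blobs, even though the container itself has no notion of a mypics folder.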

These storage services are accessible to any HTTP client and not just Azure applications. 
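The retrieve, hide and delete cycle of the queue service is easier to follow with a small model. The sketch below is plain Python and only mimics the behavior described above; ToyQueue and its put, get and delete methods are hypothetical names, not the real service API. Time is passed in explicitly so the example is deterministic. Two simplifications are my own assumptions: the 7 day time-to-live is not modeled, and retrieving an item again issues a fresh pop receipt that replaces the old one.

```python
import uuid


class ToyQueue:
    """In-memory model of the dequeue/hide/delete cycle (not the Azure API)."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout  # seconds an item stays hidden
        self._items = []  # oldest first; each item tracks when it is visible again

    def put(self, body, now):
        self._items.append({"body": body, "visible_at": now, "receipt": None})

    def get(self, now):
        # Hand out the first visible item, then hide it for the timeout interval.
        for item in self._items:
            if item["visible_at"] <= now:
                item["receipt"] = uuid.uuid4().hex
                item["visible_at"] = now + self.visibility_timeout
                return item["body"], item["receipt"]
        return None

    def delete(self, receipt, now):
        # Deleting needs the matching pop receipt and must happen while hidden.
        for item in self._items:
            if item["receipt"] == receipt and now < item["visible_at"]:
                self._items.remove(item)
                return True
        return False


q = ToyQueue()
q.put("poll feed A", now=0)
body, receipt = q.get(now=1)            # item is now hidden until t=31
nothing = q.get(now=10)                 # still hidden, so nothing comes back
body_again, receipt2 = q.get(now=40)    # timeout passed, item reappeared
stale = q.delete(receipt, now=41)       # old receipt no longer matches
done = q.delete(receipt2, now=45)       # current receipt, inside the interval
print(nothing, stale, done)
```

The point of the design is that a Worker role that crashes after retrieving an item never deletes it, so the item simply reappears for another worker once the interval has passed.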

Deploying Cloud-based Applications with Azure

The following diagram taken from the Windows Azure SDK shows the development lifecycle of a Windows Azure application

 

The SDK ships with a development fabric, which enables you to deploy an Azure application locally via IIS 7.0, and development storage, which uses SQL Server Express as a storage layer that mimics the Windows Azure storage services.

As the diagram above shows, once the application is tested locally it can be deployed entirely or in part on Microsoft's storage and cloud computation services.

The Azure Services Platform: Windows Azure + Microsoft's Family of REST Web Services

In addition to Windows Azure, Microsoft also announced the Azure Services Platform which is a variety of Web APIs and Web Services that can be used in combination with Windows Azure (or by themselves) to build cloud-based applications. Each of these Web services is worthy of its own post (or whitepaper and O'Reilly animal book) but I'll limit myself to one sentence descriptions for now.

  • Live Services: A set of REST APIs for consumer-centric data types (e.g. calendar, profile, etc) and scenarios (communication, presence, sync, etc). You can see the set of APIs in the Live Framework poster and keep up with the goings on by following the Live Services blogs.

  • Microsoft SQL Services: Relational database in the cloud accessible via REST APIs. You can learn more from the SSDS developer center and keep up with the goings on by following the SQL Server Data Services team blog.

  • Microsoft .NET Services: Three fairly different services for now; hosted access control, hosted workflow engine and a service bus in the cloud. Boring enterprise stuff. :) 

  • Microsoft SharePoint Services: I couldn't figure out if anything concrete was announced here or whether stuff was pre-announced (i.e. actual announcement to come at a later date).

  • Microsoft Dynamics CRM Services: Ditto.

From the above list, I find the Live Services piece (access to user data in a uniform way) and the SQL Services (hosted storage) most interesting. I will likely revisit them in more depth at a later date.

The Bottom Line

From my perspective, Windows Azure is easiest viewed as a competitor to Google App Engine. As comparisons go, Azure already brings a number of features to the table that aren't even on the Google App Engine road map. The key feature is the ability to run background tasks instead of just being limited to writing applications that respond to Web requests. This limitation of App Engine means you can't write any application that does serious background computation, like a search engine, email service, or RSS reader, on Google App Engine. So Azure can run an entire class of applications that are simply not possible on Google App Engine.

The second key feature is that by supporting the .NET Framework, developers theoretically get a plethora of languages to choose from including Ruby (IronRuby), Python (IronPython), F#, VB.NET and C#. In practice, the Azure SDK only supports creating cloud applications using C# and VB.NET out of the box. However I can't think of any reason why it shouldn't be able to support development with other .NET enabled languages like IronPython. On the flipside, App Engine only supports Python and the timeline for it supporting other languages [and exactly which other languages] is still To Be Determined.

Finally, App Engine has a number of scalability limitations both from a data storage and a query performance perspective. Azure definitely does better than App Engine on a number of these axes. For example, App Engine has a 1MB limit per file while Azure has a 64MB limit on individual blobs and also allows you to split a blob into blocks of 4MB each. Similarly, I've been watching SQL Server Data Services (SSDS) for a while and I haven't seen or heard complaints about query performance.
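As a quick illustration of the block arithmetic, here is a short Python sketch that chops a payload into 4MB pieces. It assumes the 4MB figure above is the largest size of a single block, and split_into_blocks is a hypothetical helper, not an SDK call. It only shows the chunking step; actually uploading the blocks is not shown.

```python
MB = 1024 * 1024
BLOCK_SIZE = 4 * MB  # block size quoted above, assumed to be a per-block maximum


def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Return the payload as a list of chunks no larger than block_size."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


payload = bytes(10 * MB)  # 10MB of zero bytes as stand-in data
blocks = split_into_blocks(payload)
sizes = [len(b) // MB for b in blocks]
print(len(blocks), sizes)
```

A 10MB payload ends up as two full 4MB blocks plus a 2MB remainder.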

Azure makes it possible for me to reuse my existing skills as a .NET developer who is savvy with using RESTful APIs to build cloud based applications without having to worry about scalability concerns (e.g. database sharding, replication strategies, server failover, etc). In addition, it puts pressure on competitors to step up to the plate and deliver. However you look at it, this is a massive WIN for Web developers.

The two small things I'd love to see addressed are first class support for IronPython and some clarity on the difference between SSDS and Windows Azure Storage services. Hopefully we can avoid a LINQ to Entities vs. LINQ to SQL-style situation in the future.

Postscript: Food for Thought

It would be interesting to read [or write] further thoughts on the pros and cons of Platform as a Service offerings when compared to Utility Computing offerings. In a previous discussion on my blog there was some consensus that utility computing approaches are more resistant to vendor lock-in than platform as a service approaches, since it is easier to find multiple vendors who are providing virtual servers with LAMP/WISC hosting than it will be to find multiple vendors providing the exact same proprietary cloud APIs as Google, Amazon or Microsoft. However, it would be informative to look at the topic from more angles. For instance, what is the cost/benefit tradeoff of using SimpleDB/BigTable/SSDS for data access instead of MySQL running on multiple virtual hosts? With my paternity leave ending today, I doubt I'll have time to go over these topics in depth but I'd appreciate reading any such analysis.

Now Playing: The Game - Money


 

November 2, 2008
@ 02:59 PM

UPDATE: This release contained a bug which caused users to lose their feed lists on upgrading from previous versions. Please download v1.8.0.862 from here instead.

I've been talking about it for almost a year and now we've finally shipped it. To celebrate the new release, we've created a new website to capture the awesomeness of the latest version.

Download the installer from here

NEW FEATURES 
- Download manager for viewing pending podcasts/enclosures
- Ability to synchronize feeds with Google Reader, NewsGator Online and the Windows Common Feed List.
- Option to increase or decrease font size in reading pane

BUG FIXES 
- 2153584 FTP storage ignores path, always stores in root directory
- 1987359 System crashes while updating feeds
- 1974485 NewsItem copy content to clipboard only copy the link
- 1971987 Error on start: docs out of order (0 < 6 )
- 1967898 Application hangs on shut down due to search indexing
- 1965832 Exception when refreshing feeds
- 2014927 New feeds not downloaded
- 1933253 Continuously Downloading Enclosures
- 1883614 Javascript in feed content causes exceptions
- 1899299 Oldest unread message shown on alert window on new items
- 1915756 Program crash during close operation
- 2096903 Toolbar crashes and replaced with Red X
- 1960767 Crashes on opening due to FIPS policy set on Windows Vista.

Now Playing: Young Jeezy - Vacation


 

Categories: RSS Bandit

Over the past few days I've been mulling over the recent news from Yahoo! that they are building a Facebook-like platform based on OpenSocial. I find this interesting given that a number of people have come to the conclusion that Facebook is slowly killing its widget platform in order to replace it with Facebook Connect.

The key reason developers believe Facebook is killing off its platform is captured in Nick O'Neill's post entitled Scott Rafer: The Facebook Platform is Dead which states

When speaking at the Facebook developer conference today in Berlin, Scott Rafer declared that Facebook platform dead. He posted statistics including one that I posted that suggests Facebook widgets are dead. Lookery’s own statistics from Quantcast suggest that their publisher traffic has been almost halved since the new site design was released. Ultimately, I think we may see an increase in traffic as users become educated on the new design but there is no doubt that developers were impacted significantly.

So what is Scott’s solution for developers looking to thrive following the shift to the new design? Leave the platform and jump on the Facebook Connect opportunity.

The bottom line is that by moving applications off of the profile page in their recent redesign, Facebook has reduced the visibility of these applications, thus reducing their page views and their ability to spread virally. Some may think that the impact of these changes was unforeseen; however, the Facebook team is obsessive about testing the impact of their UX changes, so it is extremely unlikely that they weren't aware that the redesign would negatively impact Facebook applications.

The question to ask then is why Facebook would knowingly damage a platform which has been uniformly praised across the industry and has had established Web players like Google and Yahoo! scrambling to deploy copycat efforts. Alex Iskold over at ReadWriteWeb believes he has the answer in his post Why Platforms Are Letting Us Down - And What They Should Do About It which contains the following excerpt

When the Facebook platform was unveiled in 2007, it was called genius. Never before had a company in a single stroke enabled others to tap into millions of its users completely free. The platform was hailed as a game changer under the famous mantra "we built it and they will come". And they did come, hundreds of companies rushing to write Facebook applications. Companies and VC funds focused specifically on Facebook apps.

It really did look like a revolution, but it didn't last. The first reason was that Facebook apps quickly arranged themselves on a power law curve. A handful of apps (think Vampires, Byte Me and Sell My Friends) landed millions of users, but those in the pack had hardly any. The second problem was, ironically, the bloat. Users polluted their profiles with Facebook apps and no one could find anything in their profiles. Facebook used to be simple - pictures, wall, friends. Now each profile features a zoo of heterogenous apps, each one trying to grab the user's attention to take advantage of the network effect. Users are confused.

Worst of all, the platform had no infrastructure to monetize the applications. When Sheryl Sandberg arrived on the scene and looked at the balance sheet, she spotted the hefty expense that was the Facebook platform. Trying to live up to a huge valuation isn't easy, and in the absense of big revenues people rush to cut costs. Since it was both an expense and users were confused less than a year after its glorious launch, Facebook decided to revamp its platform.

The latest release of Facebook, which was released in July, makes it nearly impossible for new applications to take advantage of the network effect. Now users must first install the application, then find it under the application menu or one of the tabs, then check a bunch of boxes to add it to their profile (old applications are grand-daddied in). Facebook has sent a clear message to developers - the platform is no longer a priority.

Alex's assertion is that after Facebook looked at the pros and cons of their widget platform, the company came to the conclusion that the platform was turning into a cost center instead of being a way to improve the value of Facebook to its users. There is evidence that applications built on Facebook's platform did cause negative reactions from its users. For example, there was the creation of the "This has got to stop…pointless applications are ruining Facebook" group which at its height had half a million Facebook users protesting the behavior of Facebook apps. In addition, the creation of Facebook's Great Apps program along with the guiding principles for building Facebook applications implies that the Facebook team realized that applications being built on their platform typically don't have their users' best interests at heart.

This brings up the interesting point that although there has been a lot of discussion on how Facebook apps make money, there haven't been similar conversations on how the application platform improves Facebook's bottom line. There is definitely a win-win equation when so-called "great apps" like iLike and Causes, which positively increase user engagement, are built on Facebook's platform. However, there is also a long tail of applications that try their best to spread virally at the cost of decreasing user satisfaction in the Facebook experience. These dissatisfied users likely end up reducing their usage of Facebook, thus actually costing Facebook users and page views. It is quite possible that the few "great apps" built on the Facebook platform do not outweigh the number of not-so-great apps built on the platform which have caused users to protest in the past. This would confirm Alex Iskold's suspicions about why Facebook has started sabotaging the popularity of applications built on its platform and has started emphasizing partnerships via Facebook Connect.


A similar situation has occurred with regards to the Java platform and Sun Microsystems. The sentiment is captured in a JavaWorld article by Josh Fruhlinger entitled Sun melting down, and where's Java? which contains the following excerpt

one of the most interesting things about the coverage of the company's problems is how Java figures into the conversation, which is exactly not at all. In most of the articles, the word only appears as Sun's stock ticker; the closest I could find to a mention is in this AP story, which notes that "Sun's strategy of developing free, 'open-source' software and giving it away to spur sales of its high-end computer servers and support services hasn't paid off as investors would like." Even longtime tech journalist Ashlee Vance, when gamely badgering Jon Schwartz for the New York Time about whether Sun would sell its hardware division and focus on software, only mentions Solaris and MySQL in discussing the latter.

Those in the Java community no doubt believe that Java is too big to fail, that Sun can't abandon it because it's too important, even if it can't tangibly be tied to anything profitable. But if Sun's investors eventually dismember the company to try to extract what value is left in it, I'm not sure where Java will fit into that plan.

It is interesting to note that after a decade of investment in the Java platform, it is hard to point to what concrete benefits Sun has gotten from being the originator and steward of the Java platform and programming language. Definitely another example of a platform that may have benefited applications built on it yet which didn't really benefit the platform vendor as expected.

The lesson here is that building a platform isn't just about making the developers who use the platform successful but also about making sure that the platform itself furthers the goals of the company that built it in the first place.

Now Playing: Kardinal Offishall - Dangerous (Feat. Akon)


 

Categories: Platforms

A couple of months ago, Russell Beattie wrote a post about the end of his startup entitled The end of Mowser which is excerpted below

The argument up to now has been simply that there are roughly 3 billion phones out there, and that when these phones get on the Internet, their vast numbers will outweigh PCs and tilt the market towards mobile as the primary web device. The problem is that these billions of users *haven't* gotten on the Internet, and they won't until the experience is better and access to the web is barrier-free - and that means better devices and "full browsers". Let's face it, you really aren't going to spend any real time or effort browsing the web on your mobile phone unless you're using Opera Mini, or have a smart phone with a decent browser - as any other option is a waste of time, effort and money. Users recognize this, and have made it very clear they won't be using the "Mobile Web" as a substitute for better browsers, rather they'll just stay away completely.

In fact, if you look at the number of page views of even the most popular mobile-only websites out there, they don't compare to the traffic of popular blogs, let alone major portals or social networks.

Let me say that again clearly, the mobile traffic just isn't there. It's not there now, and it won't be.

What's going to drive that traffic eventually? Better devices and full-browsers. M-Metrics recently spelled it out very clearly - in the US 85% of iPhone owners browsed the web vs. 58% of smartphone users, and only 13% of the overall mobile market. Those numbers *may* be higher in other parts of the world, but it's pretty clear where the trend line is now. (What a difference a year makes.) It would be easy to say that the iPhone "disrupted" the mobile web market, but in fact I think all it did is point out that there never was one to begin with.

I filed away Russell's post as interesting at the time but hadn't really experienced it first hand until recently. I recently switched to using Skyfire as my primary browser on my mobile phone and it has made a world of difference in how I use my phone. No longer am I restricted to crippled versions of popular sites nor do I have to lose features when I visit the regular versions of the page. I can view the real version of my news feed on Facebook. Vote up links in reddit or Digg. And reading blogs is no longer an exercise in frustration due to CSS issues or problems rendering widgets. Unsurprisingly my usage of the Web on my phone has pretty much doubled.

This definitely brings to the forefront how ridiculous of an idea it was to think that we need a "mobile Web" complete with its own top level domain (.mobi). Which makes more sense, that every Web site in the world should create duplicate versions of their pages for mobile phones and regular browsers or that software + hardware would eventually evolve to the point where I can run a full fledged browser on the device in my pocket? Thanks to the iPhone, it is now clear to everyone that this idea of a second class Web for mobile phones was a stopgap solution at best whose time is now past. 

One other thing I find interesting is treating the iPhone as a separate category from "smartphones" in the highlighted quote. This is similar to a statement made by Walt Mossberg when he reviewed Google's Android. That article began as follows

In the exciting new category of modern hand-held computers — devices that fit in your pocket but are used more like a laptop than a traditional phone — there has so far been only one serious option. But that will all change on Oct. 22, when T-Mobile and Google bring out the G1, the first hand-held computer that’s in the same class as Apple’s iPhone.

The key feature that the iPhone and Android have in common that separates them from regular "smartphones" is that they both include a full featured browser based on WebKit. The other features like downloadable 3rd party applications, wi-fi support, rich video support, GPS, and so on have been available on phones running Windows Mobile for years. This shows how important having a full Web experience was for mobile phones and just how irrelevant the notion of a "mobile Web" has truly become.

Now Playing: Kilo Ali - Lost Y'all Mind


 

Categories: Technology

The second most interesting announcement out of PDC this morning is that Windows Live ID is becoming an OpenID Provider. The information below explains how to try it out and give feedback to the team responsible.

Try It Now. Tell Us What You Think

We want you to try the Windows Live ID OpenID Provider CTP release, let us know your feedback, and tell us about any problems you find.

To prepare:

  1. Go to https://login.live-int.com and use the sign-up button to set up a Windows Live ID test account in the INT environment.
  2. Go to https://login.live-int.com/beta/ManageOpenID.srf to set up your OpenID test alias.

Then:

  • Users - At any Web site that supports OpenID 2.0, type openid.live-INT.com in the OpenID login box to sign in to that site by means of your Windows Live ID OpenID alias.
  • Library developers - Test your libraries against the Windows Live ID OP endpoint and let us know of any problems you find.
  • Web site owners - Test signing in to your site by using a Windows Live ID OpenID alias and let us know of any problems you find.
  • You can send us feedback at:
  • E-mail - openidfb@microsoft.com

This is awesome news. I've been interested in Windows Live supporting OpenID for a while and I'm glad to see that we've taken the plunge. Please try it out and send the team your feedback.

I've tried it out already and sent some initial feedback. In general, my feedback was on applying the lessons from the Yahoo! OpenID Usability Study since it looks like our implementation has some of the same usability issues that inspired Jeff Atwood's rants about Yahoo's OpenID implementation. Since it is still a Community Technology Preview, I'm sure the user experience will improve as feedback trickles in.

Kudos to Jorgen Thelin and the rest of the folks on the Identity Services team for getting this out. Great work, guys.

UPDATE: Angus Logan posted a comment with a link to the following screencast of the current user experience when using Windows Live ID as an OpenID provider


Now Playing: Christina Aguilera - Keeps Gettin' Better


 

Categories: Windows Live

October 27, 2008
@ 05:39 PM

Just because you aren't attending Microsoft's Professional Developers Conference doesn't mean you can't follow the announcements. The most exciting announcement so far [from my perspective] has been Windows Azure, which is described as follows on the official site

The Azure™ Services Platform (Azure) is an internet-scale cloud services platform hosted in Microsoft data centers, which provides an operating system and a set of developer services that can be used individually or together. Azure’s flexible and interoperable platform can be used to build new applications to run from the cloud or enhance existing applications with cloud-based capabilities. Its open architecture gives developers the choice to build web applications, applications running on connected devices, PCs, servers, or hybrid solutions offering the best of online and on-premises.

Azure reduces the need for up-front technology purchases, and it enables developers to quickly and easily create applications running in the cloud by using their existing skills with the Microsoft Visual Studio development environment and the Microsoft .NET Framework. In addition to managed code languages supported by .NET, Azure will support more programming languages and development environments in the near future. Azure simplifies maintaining and operating applications by providing on-demand compute and storage to host, scale, and manage web and connected applications. Infrastructure management is automated with a platform that is designed for high availability and dynamic scaling to match usage needs with the option of a pay-as-you-go pricing model. Azure provides an open, standards-based and interoperable environment with support for multiple internet protocols, including HTTP, REST, SOAP, and XML.

It will be interesting to read what developers make of this announcement and what kind of apps start getting built on this platform. I'll also be on the lookout for any in-depth discussions on the platform; there is lots to chew on in this announcement.

For a quick overview of what Azure means to developers, take a look at Azure for Business and Azure for Web Developers.

Now Playing: Guns N' Roses - Welcome to the Jungle


 

Categories: Platforms | Windows Live

John McCrea of Plaxo has written a cleverly titled guest post on TechCrunchIT, Facebook Connect and OpenID Relationship Status: “It’s Complicated”, where he makes the argument that Facebook Connect is a competing technology to OpenID but the situation is complicated by Facebook developers engaging in discussions with the OpenID community. He writes

You see, it’s been about a month since the first implementation of Facebook Connect was spotted in the wild over at CBS’s celebrity gossip site, TheInsider.com. Want to sign up for the site? Click a single button. A little Facebook window pops up to confirm that you want to connect via your Facebook account. One more click – and you’re done. You’ve got a new account, a mini profile with your Facebook photo, and access to that subset of your Facebook friends who have also connected their accounts to TheInsider. Oh, and you can have your activities on TheInsider flow into your Facebook news feed automatically. All that, without having to create and remember a new username/password pair for the site. Why, it’s just like the vision for OpenID and the Open Stack – except without a single open building block under the hood!
...
After the intros, Allen Tom of Yahoo, who organized the event, turned the first session over Max Engel of MySpace, who in turn suggested an alternative – why not let Facebook’s Julie Zhuo kick it off instead? And for the next hour, Julie took us through the details of Facebook Connect and the decisions they had to make along the way to get the user interface and user experience just right. It was not just a presentation; it was a very active and engaged discussion, with questions popping up from all over the room. Julie and the rest of the Facebook team were engaged and eager to share what they had learned.

What the heck is going on here? Is Facebook preparing to go the next step of open, switching from the FB stack to the Open Stack? Only time will tell. But one thing is clear: Facebook Connect is the best thing ever for OpenID (and the rest of the Open Stack). Why? Because Facebook has set a high bar with Facebook Connect that is inspiring everyone in the open movement to work harder and faster to bring up the quality of the UI/UX for OpenID and the Open Stack.

There are a number of points worth discussing from the above excerpt. The first is the implication that OpenID is an equivalent technology to Facebook Connect. This is clearly not the case. OpenID just allows you to delegate the act of authenticating a user to another website so the user doesn't need to create credentials (i.e. username + password) on your site. OpenID alone doesn't get you the user's profile data, nor does it allow you to pull in the authenticated user's social graph from the other site or publish activities to their activity feed. For that, you would need other "Open brand" technologies like OpenID Attribute Exchange, Portable Contacts and OpenSocial. So it is fairer to describe the contest as Facebook Connect vs. OpenID + OpenID Attribute Exchange + Portable Contacts + OpenSocial.

The question then is who should we root for? At the end of the day, I don't think it makes a ton of sense for websites to have to target umpteen different APIs that do the same thing instead of targeting one standard implemented by multiple services. Specifically, it seems ridiculous to me that TheInsider.com will have to code against Facebook Connect to integrate Facebook accounts into their site but code against something else if they want to integrate MySpace accounts and yet another API if they want to integrate LinkedIn accounts and so on. This is an area that is crying out for standardization.

Unfortunately, the key company providing thought leadership in this area is Facebook and for now they are building their solution with proprietary technologies instead of de jure or de facto ("Open brand") standards. This is unsurprising given that it takes three or four different specs, in varying states of completeness and created by different audiences, to deliver the scenarios they are currently interested in. What is encouraging is that Facebook developers are working with OpenID implementers by sharing their knowledge. However, OpenID isn't the only technology needed to satisfy this scenario and I wonder if Facebook will be similarly engaged with the folks working on Portable Contacts and OpenSocial.

Facebook Connect is a step in the right direction when it comes to bringing the vision of social network interoperability to fruition. The key question is whether we will see effective open standards emerge that target the same scenarios [which eventually even Facebook could adopt] or whether competitors will offer their own proprietary alternatives. So far it sounds like the latter is happening, which means unnecessary reinvention of the wheel for sites that want to support "connecting" with multiple social networking sites.

PS: If OpenID phishing is a concern now when the user is redirected to the ID provider's site to log in, it seems Facebook Connect is even worse since all it does is provide a pop over. I wonder if this is because the Facebook folks think the phishing concerns are overblown.

Note Now Playing: 2Pac - Mind Made Up (feat. Daz, Method Man & Redman) Note


 

Earlier this week, Roy Fielding wrote a post entitled REST APIs must be hypertext-driven where he criticized the SocialSite REST API (a derivative of the OpenSocial REST API) for violating some constraints of the Representational State Transfer architectural style (aka REST). Roy's key criticisms were

API designers, please note the following rules before calling your creation a REST API:

  • A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
  • A REST API must not define fixed resource names or hierarchies (an obvious coupling of client and server). Servers must have the freedom to control their own namespace. Instead, allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations. [Failure here implies that clients are assuming a resource structure due to out-of band information, such as a domain-specific standard, which is the data-oriented equivalent to RPC's functional coupling].
  • ..
  • A REST API should be entered with no prior knowledge beyond the initial URI (bookmark) and set of standardized media types that are appropriate for the intended audience (i.e., expected to be understood by any client that might use the API). From that point on, all application state transitions must be driven by client selection of server-provided choices that are present in the received representations or implied by the user’s manipulation of those representations. The transitions may be determined (or limited by) the client’s knowledge of media types and resource communication mechanisms, both of which may be improved on-the-fly (e.g., code-on-demand). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]

In reading some of the responses to Roy's post on programming.reddit it seems there are a number of folks who found it hard to glean practical advice from Roy's post. I thought it would be useful to go over his post in more depth and with some examples.

The key thing to remember is that REST is about building software that scales to usage on the World Wide Web by being a good participant of the Web ecosystem. Ideally a RESTful API should be designed to be implementable by thousands of websites and consumed by hundreds of applications running on dozens of platforms with zero coupling between the client applications and the Web services. A great example of this is RSS/Atom feeds which happen to be one of the world's most successful RESTful API stories.

This notion of building software that scales to Web-wide usage is critical to understanding Roy's points above. The first point above is that a RESTful API should primarily be concerned with data payloads rather than with defining how URI end points should handle various HTTP methods. For one, sticking to defining data payloads, which can then be made standard MIME types, gives maximum reusability of the technology. The specifications for RSS 2.0 (application/rss+xml) and the Atom syndication format (application/atom+xml) primarily focus on defining the data format and how applications should process feeds independent of how they were retrieved. In addition, both formats are aimed at being standard formats that can be utilized by any Web site as opposed to being tied to a particular vendor or Web site, which has aided their adoption.

Unfortunately, few have learned from these lessons and we have people building RESTful APIs with proprietary data formats that aren't meant to be shared. My current favorite example of this is social graph/contacts APIs, which seem to be getting reinvented every six months. Google has the Contacts Data API, Yahoo! has their Address Book API, Microsoft has the Windows Live Contacts API, Facebook has their friends REST APIs and so on. Each of these APIs claims to be RESTful in its own way, yet they are helping to fragment the Web instead of helping to grow it. There have been some moves to address this with the OpenSocial influenced Portable Contacts API but it too shies away from standard MIME types and instead creates dependencies on URL structures to dictate how the data payloads should be retrieved/processed.
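To make the point about media types a bit more concrete, here is a minimal sketch in Python of a client that decides how to process a payload purely from its media type and not from the URL it came from. The function name and the two sample documents are my own inventions for illustration; only the element names and media types come from the RSS 2.0 and Atom formats.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def entry_titles(content_type, body):
    """Return the item titles of a feed, dispatching on media type only."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    root = ET.fromstring(body)
    if media_type == "application/atom+xml":
        return [e.findtext(ATOM_NS + "title") for e in root.iter(ATOM_NS + "entry")]
    if media_type == "application/rss+xml":
        return [i.findtext("title") for i in root.iter("item")]
    raise ValueError("unsupported media type: " + media_type)

# Made-up sample payloads; where they were fetched from is irrelevant.
ATOM_DOC = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <entry><title>First post</title></entry>
</feed>"""

RSS_DOC = """<rss version="2.0"><channel>
  <title>Example</title>
  <item><title>First post</title></item>
</channel></rss>"""

print(entry_titles("application/atom+xml; charset=utf-8", ATOM_DOC))
print(entry_titles("application/rss+xml", RSS_DOC))
```

Nothing in this client knows or cares what the feed URL looked like, which is why the same code can work against any site that serves these formats.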

One bad practice Roy calls out, which is embraced by the Portable Contacts and SocialSite APIs, is requiring a specific URL structure for services that implement the API. Section 6.2 of the current Portable Contacts API spec states the following

A request using the Base URL alone MUST yield a result, assuming that adequate authorization credentials are provided. In addition, Consumers MAY append additional path information to the Base URL to request more specific information. Service Providers MUST recognize the following additional path information when appended to the Base URL, and MUST return the corresponding data:

  • /@me/@all -- Return all contact info (equivalent to providing no additional path info)
  • /@me/@all/{id} -- Only return contact info for the contact whose id value is equal to the provided {id}, if such a contact exists. In this case, the response format is the same as when requesting all contacts, but any contacts not matching the requested ID MUST be filtered out of the result list by the Service Provider
  • /@me/@self -- Return contact info for the owner of this information, i.e. the user on whose behalf this request is being made. In this case, the response format is the same as when requesting all contacts, but any contacts not matching the requested ID MUST be filtered out of the result list by the Service Provider.

The problem with this approach is that it assumes that every implementer will have complete control of their URI space and that clients should have URL structures baked into them. The reason this practice is a bad idea is well documented in Joe Gregorio's post No Fishing - or - Why 'robots.txt' and 'favicon.ico' are bad ideas and shouldn't be emulated where he lists several reasons why hard-coded URLs are a bad idea. The reasons against include lack of extensibility and poor support for people in hosted environments who may not fully control their URI space. The interesting thing to note is that both the robots.txt and favicon.ico scenarios eventually developed mechanisms to support using markup on the source page (i.e. noindex and rel="shortcut icon") instead of hard-coded URIs since that practice doesn't scale to Web-wide usage.
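As a rough illustration of the link-based alternative, the Python sketch below finds a page's icon by reading the link element in the page itself instead of assuming /favicon.ico exists at the root of the site. The class name, function name and the sample URL are hypothetical; the only thing taken from the real world is the rel="shortcut icon" convention mentioned above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class IconLinkFinder(HTMLParser):
    """Collect href values of <link> elements whose rel includes 'icon'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "icon" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])

def find_icon(page_url, html):
    """Return the absolute icon URL the page advertises, or None."""
    finder = IconLinkFinder()
    finder.feed(html)
    if finder.hrefs:
        # Resolve relative references against the page that declared them.
        return urljoin(page_url, finder.hrefs[0])
    return None

# A made-up page on a shared host where the author cannot write to "/".
PAGE = '<html><head><link rel="shortcut icon" href="/~alice/img/logo.ico"></head></html>'
print(find_icon("http://hosted.example.com/~alice/index.html", PAGE))
```

The person on the shared host gets to put the icon wherever they are allowed to, and the client still finds it.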

The latest drafts of the OpenSocial specification have a great example of how a service can use existing technologies such as URI templates to make even complicated URL structures flexible and discoverable without forcing every client and service to hardcode a specific URL structure. Below is an excerpt from the discovery section of the current OpenSocial REST API spec

A container declares what collection and features it supports, and provides templates for discovering them, via a simple discovery document. A client starts the discovery process at the container's identifier URI (e.g., example.org). The full flow is available at http://xrds-simple.net/core/1.0/; in a nutshell:

  1. Client GETs {container-url} with Accept: application/xrds+xml
  2. Container responds with either an X-XRDS-Location: header pointing to the discovery document, or the document itself.
  3. If the client received an X-XRDS-Location: header, follow it to get the discovery document.

The discovery document is an XML file in the same format used for OpenID and OAuth discovery, defined at http://xrds-simple.net/core/1.0/:

<XRDS xmlns="xri://$xrds">
<XRD xmlns:simple="http://xrds-simple.net/core/1.0" xmlns="xri://$XRD*($v*2.0)" xmlns:os="http://ns.opensocial.org/2008/opensocial" version="2.0">
<Type>xri://$xrds*simple</Type>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/people</Type>
<os:URI-Template>http://api.example.org/people/{guid}/{selector}{-prefix|/|pid}</os:URI-Template>
</Service>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/groups</Type>
<os:URI-Template>http://api.example.org/groups/{guid}</os:URI-Template>
</Service>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/activities</Type>
<os:URI-Template>http://api.example.org/activities/{guid}/{appid}/{selector}</os:URI-Template>
</Service>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/appdata</Type>
<os:URI-Template>http://api.example.org/appdata/{guid}/{appid}/{selector}</os:URI-Template>
</Service>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/messages</Type>
<os:URI-Template>http://api.example.org/messages/{guid}/{selector}</os:URI-Template>
</Service>
</XRD>
</XRDS>

This approach makes it possible for a service to expose the OpenSocial end points however it sees fit without clients having to expect a specific URL structure.
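Here is a minimal Python sketch of what a client of such a discovery document might do: read the templates the server advertises, then fill them in. The helper names are mine, the groups template is deliberately changed to a made-up layout to show that the client doesn't care, and the expand function only understands plain {name} variables and the {-prefix|sep|name} form seen in the excerpt. It is not a full URI template implementation.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote

XRD_NS = "{xri://$XRD*($v*2.0)}"
OS_NS = "{http://ns.opensocial.org/2008/opensocial}"
PEOPLE = "http://ns.opensocial.org/2008/opensocial/people"
GROUPS = "http://ns.opensocial.org/2008/opensocial/groups"

# Trimmed-down discovery document; the groups URL layout is hypothetical.
DISCOVERY_DOC = """<XRDS xmlns="xri://$xrds">
<XRD xmlns="xri://$XRD*($v*2.0)" xmlns:os="http://ns.opensocial.org/2008/opensocial" version="2.0">
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/people</Type>
<os:URI-Template>http://api.example.org/people/{guid}/{selector}{-prefix|/|pid}</os:URI-Template>
</Service>
<Service>
<Type>http://ns.opensocial.org/2008/opensocial/groups</Type>
<os:URI-Template>http://social.example.org/v2/g?user={guid}</os:URI-Template>
</Service>
</XRD>
</XRDS>"""

def load_templates(xrds_text):
    """Map each service Type URI to the URI template the server advertises."""
    templates = {}
    for service in ET.fromstring(xrds_text).iter(XRD_NS + "Service"):
        type_uri = service.findtext(XRD_NS + "Type")
        template = service.findtext(OS_NS + "URI-Template")
        if type_uri and template:
            templates[type_uri] = template
    return templates

def expand(template, values):
    """Expand {name} and {-prefix|sep|name}; other operators are not handled."""
    def substitute(match):
        expr = match.group(1)
        if expr.startswith("-prefix|"):
            _, separator, name = expr.split("|")
            value = values.get(name)
            # The prefix is only emitted when the variable has a value.
            return separator + quote(str(value), safe="@") if value else ""
        return quote(str(values.get(expr, "")), safe="@")
    return re.sub(r"\{([^}]*)\}", substitute, template)

templates = load_templates(DISCOVERY_DOC)
print(expand(templates[PEOPLE], {"guid": "1234", "selector": "@friends"}))
print(expand(templates[GROUPS], {"guid": "1234"}))
```

The only things the client hardcodes are the Type URIs, which identify what a service is, not where it lives.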

Similarly, links should be used for describing relationships between resources in the various payloads instead of expecting hard-coded URL structures. Again, I'm drawn to the example of RSS & Atom feeds where link relations are used for defining the permalink to a feed item, the link to related media files (i.e. podcasts), links to comments, etc. Applications don't expect that every Web site that supports enclosures should have a /@rss/{id}/@podcasts URL; they just examine the <enclosure> element.
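A podcast client along these lines can be sketched in a few lines of Python. The sample feed and the function name are made up; the enclosure element and its url and type attributes are from RSS 2.0.

```python
import xml.etree.ElementTree as ET

# Made-up feed: the media file lives on a different host with its own layout.
RSS_DOC = """<rss version="2.0"><channel>
<title>Example show</title>
<item>
<title>Episode 1</title>
<link>http://example.org/2008/10/episode-1</link>
<enclosure url="http://media.example.net/a/b/ep1.mp3" length="123456" type="audio/mpeg"/>
</item>
<item>
<title>Text-only post</title>
<link>http://example.org/2008/10/text-only</link>
</item>
</channel></rss>"""

def enclosures(rss_text):
    """Yield (item title, media URL, media type) for every enclosure found."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        for enclosure in item.findall("enclosure"):
            yield item.findtext("title"), enclosure.get("url"), enclosure.get("type")

for title, url, media_type in enclosures(RSS_DOC):
    print(title, url, media_type)
```

The client follows whatever URL the feed hands it, so the publisher is free to host media anywhere.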

Thus it is plain to see that hyperlinks are important both for discovery of service end points and for describing relationships between resources in a loosely coupled way.

Note Now Playing: Prince - When Doves Cry Note


 

October 19, 2008
@ 08:47 AM

Tim Bray has a thought-provoking post on embracing cloud computing entitled Get In the Cloud where he brings up the problem of vendor lock-in. He writes

Tech Issue · But there are two problems. The small problem is that we haven’t quite figured out the architectural sweet spot for cloud platforms. Is it Amazon’s EC2/S3 “Naked virtual whitebox” model? Is it a Platform-as-a-service flavor like Google App Engine? We just don’t know yet; stay tuned.

Big Issue · I mean a really big issue: if cloud computing is going to take off, it absolutely, totally, must be lockin-free. What that means if that I’m deploying my app on Vendor X’s platform, there have to be other vendors Y and Z such that I can pull my app and its data off X and it’ll all run with minimal tweaks on either Y or Z.

At the moment, I don’t think either the Amazon or Google offerings qualify.

Are we so deranged here in the twenty-first century that we’re going to re-enact, wide-eyed, the twin tragedies of the great desktop-suite lock-in and the great proprietary-SQL lock-in? You know, the ones where you give a platform vendor control over your IT budget? Gimme a break.

I’m simply not interested in any cloud offering at any level unless it offers zero barrier-to-exit.

Tim's post is about cloud platforms but I think it is useful to talk about avoiding lock-in when taking a bet on cloud based applications as well as when embracing cloud based platforms. This is especially true when you consider that moving from one application to another is a similar yet smaller scoped problem compared to moving from one Web development platform to another.

So let's say your organization wants to move from a cloud based office suite like Google Apps for Business to Zoho. The first question you have to ask yourself is whether it is possible to extract all of your organization's data from one service and import it without data loss into another. For business documents this should be straightforward thanks to standards like ODF and OOXML. However there are points to consider such as whether there is an automated way to perform such bulk imports and exports or whether individuals have to manually export and/or import their online documents to these standard formats. Thus the second question is how expensive it is for your organization to move the data. The cost includes everything from the potential organizational downtime to account for switching services to the actual IT department cost of moving all the data. At this point, you then have to weigh the impact of all the links and references to your organization's data that will be broken by your switch. I don't just mean links to documents returning 404 because you have switched from being hosted at google.com to zoho.com but more insidious problems like the broken experience of anyone who is using the calendar or document sharing feature of the service to give specific people access to their data. Also you have to ensure that email that is sent to your organization after the switch goes to the right place. Making this aspect of the transition smooth will likely be the most difficult part of the migration since it requires more control over application resources than application service providers typically give their customers. Finally, you will have to evaluate which features you will lose by switching applications and ensure that none of them is mission critical to your business.

Despite all of these concerns, switching hosted application providers is mostly a tractable problem. Standard data formats make data migration feasible, although it might be unwieldy to extract the data from the service. In addition, Internet technologies like SMTP and HTTP all have built-in ways to handle forwarding/redirecting references so that they aren't broken. However, although the technology makes it possible, the majority of hosted application providers fall far short of making it easy to fully migrate to or away from their service without significant effort.
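On the HTTP side, the built-in mechanism is a permanent redirect. The Python sketch below shows roughly what the old host would have to serve for moved documents to keep working. The mapping, the host names and the handler are all hypothetical, and it assumes the old provider lets you control responses for your old URLs, which is exactly the kind of control most providers don't hand out.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from paths on the old service to URLs on the new one.
MOVED = {
    "/docs/q3-budget": "https://docs.new-host.example/acme/q3-budget",
    "/calendar/team": "https://calendar.new-host.example/acme/team",
}

def redirect_for(path):
    """Return (status, headers) for a request to an old location."""
    target = MOVED.get(path)
    if target is None:
        return 404, {}
    # 301 tells clients the move is permanent so they can update bookmarks.
    return 301, {"Location": target}

class MovedHandler(BaseHTTPRequestHandler):
    """Answers GET requests on the old host using the mapping above."""

    def do_GET(self):
        status, headers = redirect_for(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()

print(redirect_for("/docs/q3-budget"))
```

To try it locally you could pass MovedHandler to http.server.HTTPServer, but the interesting part is the mapping, which somebody has to build and the old provider has to be willing to serve.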

When it comes to cloud computing platforms, you have all of the same problems described above and a few extra ones. The key wrinkle with cloud computing platforms is that there is no standardization of the APIs and platform technologies that underlie these services. The APIs provided by Amazon's cloud computing platform (EC2/S3/EBS/etc) are radically different from those provided by Google App Engine (Datastore API/Python runtime/Images API/etc). For zero lock-in to occur in this space, there need to be multiple providers of the same underlying APIs. Otherwise, migrating between cloud computing platforms will be more like switching your application from Ruby on Rails and MySQL to Django and PostgreSQL (i.e. a complete rewrite).
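To picture what "multiple providers of the same underlying APIs" would buy you, consider this toy Python sketch. Everything in it is hypothetical: BlobStore stands in for a shared storage API, and the two vendor classes are in-memory stand-ins with different internals, not wrappers around any real service.

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """The shared API the application codes against (hypothetical)."""

    @abstractmethod
    def put(self, key, data):
        """Store bytes under key."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under key."""

class VendorXStore(BlobStore):
    """Stand-in for one vendor: keeps blobs in a flat dictionary."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        self._blobs[key] = bytes(data)

    def get(self, key):
        return self._blobs[key]

class VendorYStore(BlobStore):
    """Stand-in for another vendor: buckets blobs by first path segment."""

    def __init__(self):
        self._buckets = {}

    def put(self, key, data):
        bucket, _, name = key.partition("/")
        self._buckets.setdefault(bucket, {})[name] = bytes(data)

    def get(self, key):
        bucket, _, name = key.partition("/")
        return self._buckets[bucket][name]

def save_and_reload(store, name, text):
    """Application code: it only knows about the BlobStore API."""
    key = "reports/" + name
    store.put(key, text.encode("utf-8"))
    return store.get(key).decode("utf-8")

for store in (VendorXStore(), VendorYStore()):
    print(type(store).__name__, save_and_reload(store, "q3", "hello"))
```

The application function runs unchanged against either vendor. Today there is no such shared API across the real platforms, which is why moving between them looks like a rewrite.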

In response to Tim Bray's post, DeWitt Clinton of Google left a comment which is excerpted below

That's why I asked -- you can already do that in both the case of Amazon's services and App Engine. Sure, in the case of EC2 and S3 you'll need to find a new place to host the image and a new backend for the data, but Amazon isn't trying to stop you from doing that. (Actually not sure about the AMI format licensing, but I assumed it was supposed to be open.)

In App Engine's case people can run the open source userland stack (which exposes the API you code to) on other providers any time the want, and there are plenty of open source bigtable implementations to chose from. Granted, bulk export of data is still a bit of a manual process, but it is doable even today and we're working to make it even easier.

Ae you saying that lock-in is avoided only once the alternative hosts exist?

But how does Amazon or Google facilitate that, beyond getting licensing correct and open sourcing as much code as they can? Obviously we can't be the ones setting up the alternative instances. (Though we can cheer for them, like we did when we saw the App Engine API implemented on top of EC2 and S3.)

To Doug Cutting's very good point, the way Amazon and Google (and everyone else in this space) seem to be trying to compete is by offering the best value, in terms of reliability (uptime, replication) and performance (data locality, latency, etc) and monitoring and management tools. Which is as it should be.

Although DeWitt is correct that Google and Amazon are not explicitly trying to lock customers in to their platforms, the fact is that today a customer who has invested heavily in either platform has no straightforward way to extricate themselves and switch to another vendor. In addition, there is not a competitive marketplace of vendors providing standard/interoperable platforms as there is with email hosting or Web hosting providers.

As long as these conditions remain the same, it may be that lock-in is too strong a word to describe the situation but it is clear that the options facing adopters of cloud computing platforms aren't great when it comes to vendor choice.

Note Now Playing: Britney Spears - Womanizer Note


 

Categories: Platforms | Web Development