Sunday, 09 November 2008 - Dare Obasanjo's weblog

November 9, 2008

@ 12:46 PM

In-Memory Caching: Why We Can't Just Trust the Database to get it Right

I remember taking an operating systems class in college and marveling at the fact that operating system design seemed less about elegant engineering and more about [what I viewed at the time as] performance hacks. I saw a similar sentiment recently captured by Eric Florenzano in his post It's Caches All the Way Down where he starts describing how a computer works to a friend and ends up talking about the various layers of caching from CPU registers to L2 caches to RAM and so on.

At the end of his post Eric Florenzano asks the following question I've often heard at work and in developer forums like programming.reddit

That's what struck me. When you come down to it, computers are just a waterfall of different caches, with systems that determine what piece of data goes where and when. For the most part, in user space, we don't care about much of that either. When writing a web application or even a desktop application, we don't much care whether a bit is in the register, the L1 or L2 cache, RAM, or if it's being swapped to disk. We just care that whatever system is managing that data, it's doing the best it can to make our application fast.

But then another thing struck me. Most web developers DO have to worry about the cache. We do it every day. Whether you're using memcached or velocity or some other caching system, everyone's manually moving data from the database to a cache layer, and back again. Sometimes we do clever things to make sure this cached data is correct, but sometimes we do some braindead things. We certainly manage the cache keys ourselves and we come up with systems again and again to do the same thing.

Does this strike anyone else as inconsistent? For practically every cache layer down the chain, a system (whose algorithms are usually the result of years and years of work and testing) automatically determines how to best utilize that cache, yet we do not yet have a good system for doing that with databases. Why do you think that is? Is it truly one of the two hard problems of computer science? Is it impossible to do this automatically? I honestly don't have the answers, but I'm interested if you do.

Eric is simultaneously correct and incorrect in his statements around caching and database layers. Every modern database system has caching mechanisms that are transparent to the developer. For LAMP developers there is the MySQL Query Cache which transparently caches the text of SELECT query and the results so that the next time the query is performed it is fetched from memory. For WISC developers there are the SQL Server's Data and Procedure caches which store query plans and their results to prevent having to repeatedly perform expensive computations or go to disk to fetch recently retrieved data. As with everything in programming, developers can eke more value out of these caches by knowing how they work. For example, using parameterized queries or stored procedures significantly reduces the size of the procedure cache in SQL Server. Tony Rogerson wrote an excellent post where he showed how switching from SQL queries based on string concatenation to parameterized queries can reduce the size of a procedure cache from over 1 Gigabyte to less than 1 Megabyte. This is similar to understanding how garbage collection or memory allocation works teaches developers to favor recycling objects instead of creating new ones and favoring arrays over linked lists to reduce memory fragmentation.

Even though there are caching mechanisms built into relational databases that are transparent to the developer, they typically aren't sufficient for high performance applications with significant amounts of read load. The biggest problem is hardware limitations. A database server will typically have twenty to fifty times more hard drive storage capacity than it has memory. Optimistically this means a database server can cache about 5% to 10% of its entire data in memory before having to go to disk. In comparison, lets look at some statistics from a popular social networking site (i.e. Facebook)

48% of active users log-in daily
49.9 average pages a day viewed per unique user

How much churn do you think the database cache goes through if half of the entire user base is making data requests every day? This explains why Facebook has over 400 memcached hosts storing over 5 Terabytes of data in-memory instead of simply relying on the query caching features in MySQL. This same consideration applies for the majority of social media and social networking sites on the Web today.

So the problem isn't a lack of transparent caching functionality in relational databases today. The problem is the significant differences in the storage and I/O capacity of memory versus disk in situations where a large percentage of the data set needs to be retrieved regularly. In such cases, you want to serve as much data from memory as possible which means going beyond the physical memory capacity of your database server and investing in caching servers.

Note Now Playing: Kanye West - Two Words (feat. Mos Def, Freeway & The Harlem Boys Choir) Note

Categories: Web Development

November 8, 2008

@ 03:53 PM

Comments [6]

C# is the Next Python: Duck Typing and C# 4.0

At the recent Microsoft Professional Developer's Conference there was a presentation by Anders Hejlsberg about The Future of C# which unveiled some of the upcoming features of C# 4.0. Of all the features announced, the one I found most interesting was the introduction of duck typing via a new static type called dynamic.

For those who aren't familiar with Duck Typing, here is the definition from Wikipedia

In computer programming, duck typing is a style of dynamic typing in which an object's current set of methods and properties determines the valid semantics, rather than its inheritance from a particular class. The name of the concept refers to the duck test, attributed to James Whitcomb Riley (see History below), which may be phrased as follows:

If it walks like a duck and quacks like a duck, I would call it a duck.

In duck typing one is concerned with just those aspects of an object that are used, rather than with the type of the object itself. For example, in a non-duck-typed language, one can create a function that takes an object of type Duck and calls that object's walk and quack methods. In a duck-typed language, the equivalent function would take an object of any type and call that object's walk and quack methods. If the object does not have the methods that are called then the function signals a run-time error.

This is perfectly illustrated with the following IronPython program

def PrintAt(container, index):
    print "The value at [%s] is %s" % (index, container[index])

if __name__ == "__main__":
    #create dictionary

    table = {1 : "one", 2 : "two",3 : "three"}
    #create sequence

    list = ("apple", "banana", "cantalope")

    PrintAt(table, 1)
    PrintAt(list, 1)

In the above program the PrintAt() simply requires that the container object passed in supports the index operator '[]' and can accept whatever type is passed in as the index. This means I can pass both a sequence (i.e. a list) or a dictionary (i.e. a hash table) to the function and it returns results even though the semantics of using the index operator is very different for lists and dictionaries.

Proponents of static typing have long argued that features like duck typing lead to hard-to-find bugs which are only detected at runtime after the application has failed instead of during development via compiler errors. However there are many situations even in statically typed programming where the flexibility of duck typing would be beneficial. A common example is invoking JSON or SOAP web services and mapping these structures to objects.

Recently, I had to write some code at work which spoke to a JSON-based Web service and struggled with how to deal with the fact that C# requires me to define the class of an object up front before I can use it in my application. Given the flexible and schema-less nature of JSON, this was a problem. I ended up using the JsonReaderWriterFactory to create an XmlDictionaryReader which maps JSON into an XML document which can then be processed flexibly using XML technologies. Here's what the code looked like

using System;
using System.IO; 
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml; 

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            string json = @"{ ""firstName"": ""John"", ""lastName"": ""Smith"", ""age"": 21 }";
            var stream = new MemoryStream(ASCIIEncoding.Default.GetBytes(json));
            var reader = JsonReaderWriterFactory.CreateJsonReader(stream, XmlDictionaryReaderQuotas.Max); 

            var doc = new XmlDocument(); 
            doc.Load(reader);
            
            //error handling omitted for brevity
            string firstName = doc.SelectSingleNode("/root/firstName").InnerText; 
            int age          = Int32.Parse(doc.SelectSingleNode("/root/age").InnerText); 
            
            Console.WriteLine("{0} will be {1} next year", firstName, age + 1);
            Console.ReadLine();  
        }
    }
}

It works but the code is definitely not as straightforward as interacting with JSON from Javascript. This is where the dynamic type from C# 4.0 would be very useful. With this type, I could rewrite the above code as follows

using System;
using System.IO; 
using System.Runtime.Serialization.Json;
using System.Text;
using System.Xml; 

namespace Test
{
    class Program
    {
        static void Main(string[] args)
        {
            string json = @"{ ""firstName"": ""John"", ""lastName"": ""Smith"", ""age"": 21 }";
            var stream = new MemoryStream(ASCIIEncoding.Default.GetBytes(json));
            dynamic person = JsonObjectReaderFactory.CreateJsonObjectReader(stream, XmlDictionaryReaderQuotas.Max); 

            Console.WriteLine("{0} will be {1} next year", person.firstName, person.age + 1);
            Console.ReadLine();  
        }
    }
}

In the C# 4.0 version I can declare a person object whose class/type I don't have to define at compile time. Instead the property accesses are converted to dynamic calls to the named properties using reflection by the compiler. So at runtime when my [imaginary] JsonObjectReader dynamically creates the person objects from the input JSON, my application works as expected with a lot less lines of code.

It's amazing how Python-like C# gets each passing year.

What OpenID Gives Us

So what is the problem that OpenID solves? For this I'll go back to one of the original emails on OpenID from it's founder Brad Fitzpatrick way back in 2005. In that email Brad wrote

So LiveJournal for user "bob" would positively assert the follow root URLs as being owned by bob:

http://www.livejournal.com/users/bob/
http://www.livejournal.com/~bob/
http://bob.livejournal.com/

And for those geeks out there with their own domain names (yes, I'm one, and you're one, but we're not the common case), you either run your own identity server, or you use somebody else's that's paranoid. For example, LiveJournal's would only assert off-site URLs which come to us with a rel="me" type of deal (not using XFN) as we currently do,with ljuser_sha1=9233b6f5388d6867a2a7be14d8b4ba53c86cfde2

OpenID is all about confirming that a user "owns" a URI. The expectation is that users have a URI that they have memorized and enter it into sites that support OpenID (i.e. Relying Parties) so that they can be authenticated. This breaks down in a number of situations.

Users of email services don't have a URI that they own, they have an email address.
Some users of social networking sites may have a URI but they don't have it memorized. for example I don't know anyone who has memorized the URL to their Facebook profile. Mine is http://www.facebook.com/p/Dare_Obasanjo/500050028 for anyone interested.

As different Web vendors have begun implementing OpenID, they've began to run into situations where OpenID's design breaks down for their users. Unsurprisingly a number of OpenID proponents have described these issues as usability issues as opposed to design problems.

Google's OpenID Woes

The most notorious example of a mainstream vendor colliding with the design mismatch between OpenID and their target domain is Google's implementation of OpenID. In a post entitled Google Abandons Standards, Forks OpenID we have the following excerpt

whatever it is that Google has released support for, it sure as hell isn’t OpenID, as they even so kindly point out in their OpenID developer documentation (that media outlets certainly won’t be reading):

The web application asks the end user to log in by offering a set of log-in options, including Google.

The user selects the "Sign in with Google" option.

The web application sends a "discovery" request to Google to get information on the Google authentication endpoint. This is a departure from the process outlined in OpenID 1.0. [Emphasis added]

Google returns an XRDS document, which contains endpoint address.

The web application sends a login authentication request to the Google endpoint address.

This action redirects the user to a Google Federated Login page.

As Google points out, this isn’t OpenID. This is something that Google cooked up that resembles OpenID masquerading as OpenID since that’s what people want to see – and that’s what Microsoft announced just the day before.

It’s not just a “departure” from OpenID, it’s a whole new standard.

With OpenID, the user memorizes a web URI, and provides it to the sites he or she would like to sign in to. The site then POSTs an OpenID request to that URI where the OpenID backend server proceeds to perform the requested authentication.

In Google’s version of the OpenID “standard,” users would enter their @gmail.com email addresses in the OpenID login box on OpenID-enabled sites, who would then detect that a Google email was entered. The server then requests permission from Google to use the OpenID standard in the first place by POSTing an XML document to Google’s “OpenID” servers. If Google decides it’ll accept the request from the server, it’ll return an XML document back to the site in question that contains a link to the actual OpenID URI for the email account in question.

The key problem Google had with their OpenID implementation is that they want their users to log-in with information they know (i.e. their gmail address) instead of some made up URL that is their Google OpenID identifier. Since only the Über geeks in their user base would know their Google OpenID URL if they went with that approach. With Google's approach meant that only sites that had been hardcoded to support Google's flavor of OpenID would be supported.

Thankfully, the outcry from the OpenID community was heard by the folks at Google and they've reversed their decision. Instead they will support OpenID discovery on gmail.com and so users just have to know that domain name and then the OpenID authentication process kicks off from there. The intent is for Google's OpenID implementation to eventually mirror the functionality of Yahoo's OpenID implementation.

It Isn't Over Yet

It should be noted that as Google aspires to have an OpenID implementation that is as compliant as Yahoo's, OpenID experts have also criticized Yahoo's implementation. Specifically Sam Ruby in his post OpenId Minus Id Equals Wide Open writes

Martin Atkins: Yahoo!'s OP and now it seems Microsoft’s OP both ignore the value of openid.identity provided to them, and just return an assertion for whatever user’s logged in.

I may ultimately need to black-list such ids.

Looking at live.com instructions:

At any Web site that supports OpenID 2.0, type openid.live-INT.com in the OpenID login box to sign in to that site by means of your Windows Live ID OpenID alias.

If everybody uses the same URI, I can’t tell them apart. That doesn’t concern me much, but do find it a bit distressing that that’s the recommended usage.

The problem goes back to the original role of OpenID as being a way for a user to claim that they own a URI. As Sam points out, some of the techniques being used to "improve usability" now run counter to the spirit of the original specification.

Only time will tell if by the time all is said and done, it won't seem like we've been trying to shove a square peg in a round hole all these years.

Note Now Playing: Kanye West - Heartless Note

Categories: Web Development

November 5, 2008

@ 06:10 PM

Comments [2]

My Son on Halloween

It feels good to realize that I can tell him that he can grow up to be president with a straight face. Congratulations to Team Obama.

Note Now Playing: Young Jeezy - My President (Feat. Nas) Note

Categories: Personal

November 3, 2008

@ 10:04 AM

Comments [10]

Windows Azure from a Developer's Perspective

Disclaimer: What follows are my personal impressions from using the beta version of Windows Azure. It is not meant to be an official description of the project from Microsoft, you can find that here.

Earlier this week I scored an invite to try out the beta version of Windows Azure which is a new hosted services (aka cloud computing) platform from Microsoft. Since there's been a ridiculous amount of press about the project I was interested in actually trying it out by developing and deploying some code using this platform and sharing my experiences with others.

What is it?

Before talking about a cloud computing platform, it is useful to agree on definitions of the term cloud computing. Tim O'Reilly has an excellent post entitled Web 2.0 and Cloud Computing where he breaks the technologies typically described as cloud computing into three broad categories

Utility Computing: In this approach, a vendor provides access to virtual server instances where each instance runs a traditional server operating system such as Linux or Windows Server. Computation and storage resources are metered and the customer can "scale infinitely" by simply creating new server instances. The most popular example of this approach is Amazon EC2.
Platform as a Service: In this approach, a vendor abstracts away the notion of accessing traditional LAMP or WISC stacks from their customers and instead provides an environment for running programs written using a particular platform. In addition, data storage is provided via a custom storage layer and API instead of traditional relational database access. The most popular example of this approach is Google App Engine.
Cloud-based end user applications: This typically refers to Web-based applications that have previously been provided as desktop or server based applications. Examples include Google Docs, Salesforce and Hotmail. Technically every Web application falls under this category, however the term often isn't used that inclusively.

With these definitions clearly stated it is easier to talk about what Windows Azure is and is not. Windows Azure is currently #2; a Platform as a Service offering. Although there have been numerous references to Amazon's offerings both by Microsoft and bloggers covering the Azure announcements, Windows Azure is not a utility computing offering [as defined above].

There has definitely been some confusion about this as evidenced by Dave Winer's post Microsoft's cloud strategy? and commentary from other sources.

Getting Started

To try out Azure you need to be running Windows Server 2008 or Windows Vista with a bunch of prerequisites you can get from running the Microsoft Web Platform installer. Once you have the various prerequisites installed (SQL Server, IIS 7, .NET Framework 3.5, etc) you should then grab the Windows Azure SDK. Users of Visual Studio will also benefit from grabbing the Windows Azure Tools for Visual Studio.

After this process, you should be able to fire up Visual Studio and see the option to create a Cloud Service if you go to File->New->Project.

Building Cloud-based Applications with Azure

The diagram below taken from the Windows Azure SDK shows the key participants in a typical Windows Azure service

The work units that make up a Windows Azure hosted service can have one of two roles. A Web role is an application that listens for and responds to Web requests while a Worker role is a background processing task which acts autonomously but cannot be accessed over the Web. A Windows Azure application can have multiple instances of Web and Worker roles that make up the service. For example, if I was developing a Web-based RSS reader I would need a worker role for polling feeds and Web role for displaying the UI that the user interacts with. Both Web and Worker roles are .NET applications that can be developed locally and then deployed on Microsoft's servers when they are ready to go.

Azure applications have access to a storage layer that provides the following three storage services

Blob Storage: This is used for storing binary data. A user account can have one or more containers which in turn can contain one or more blobs of binary data. Containers cannot be nested so one cannot create hierarchical folder structures. However Azure allows applications to work around this by (i) allowing applications to query containers based on substring matching on prefixes and (ii) delimiters such as '\' and other path characters are valid blob names. So I can create blobs with names like 'mypics\wife.jpg' and 'mypics\son.jpg' in the media container and then query for blobs beginning with 'mypics\' thus simulating a folder hierarchy somewhat.
Queue Service: This is a straightforward message queuing service. A user account can have one or more queues from which they can add items to the end of each queue and remove items from the front. Items have a maximum time-to-live of 7 days within the queue. When an item is retrieved from the queue, an associated 'pop receipt' is provided. The item is then hidden from other client applications until some interval (by default 30 seconds) has passed after which the item becomes visible. The item can be deleted from the queue during that interval if the pop receipt from when it was retrieved is provided as part of the DELETE operation. The queue service is valuable as a way for Web roles to talk to Worker roles and vice versa.
Table Storage: This exposes a subset of the capabilities of the ADO.NET Data Services Framework (aka Astoria). In general, this is a schema-less table based model similar to Google's BigTable and Amazon's SimpleDB. The data model consists of tables and entities (aka rows). Each entity has a primary key made of two parts {PartitionKey, RowKey}, a last modified timestamp and an arbitrary number of user-defined properties. Properties can be one of several primitive types including integer, strings, doubles, long integers, GUIDs, booleans and binary. Like Astoria, the Table Storage service supports performing LINQ queries on rows but only supports the FROM, WHERE and TAKE operators. Other differences from Astoria are that it doesn't support batch operations nor is it possible to retrieve individual properties from an entity without retrieving the entire entity.

These storage services are accessible to any HTTP client and not just Azure applications.

Deploying Cloud-based Applications with Azure

The following diagram taken from the Windows Azure SDK shows the development lifecycle of an Windows Azure application

The SDK ships with a development fabric which enables you to deploy an Azure an application locally via IIS 7.0 and development storage which uses SQL Server Express as a storage layer which mimics the Windows Azure storage services.

As the diagram shows above, once the application is tested locally it can be deployed entirely or in part on Microsoft's storage and cloud computation services.

The Azure Services Platform: Windows Azure + Microsoft's Family of REST Web Services

In addition to Windows Azure, Microsoft also announced the Azure Services Platform which is a variety of Web APIs and Web Services that can be used in combination with Windows Azure (or by themselves) to build cloud-based applications. Each of these Web services is worthy of its own post (or whitepaper and O'Reilly animal book) but I'll limit myself to one sentence descriptions for now.

Live Services: A set of REST APIs for consumer-centric data types (e.g. calendar, profile, etc) and scenarios (communication, presence, sync, etc). You can see the set of APIs in the Live Framework poster and keep up with the goings on by following the Live Services blogs.
Microsoft SQL Services: Relational database in the cloud accessible via REST APIs. You can learn more from the SSDS developer center and keep up with the goings on by following the SQL Server Data Services team blog.
Microsoft .NET Services: Three fairly different services for now; hosted access control, hosted workflow engine and a service bus in the cloud. Boring enterprise stuff. :)
Microsoft Sharepoint Services: I couldn't figure out if anything concrete was announced here or whether stuff was pre-announced (i.e. actual announcement to come at a later date).
Microsoft Dynamics CRM Services: Ditto.

From the above list, I find the Live Services piece (access to user data in a uniform way) and the SQL Services (hosted storage) most interesting. I will likely revisit them in more depth at a later date.

The Bottom Line

From my perspective, Windows Azure is easiest viewed as a competitor to Google App Engine. As comparisons go, Azure already brings a number of features to the table that aren't even on the Google App Engine road map. The key important feature is the ability to run background tasks instead of just being limited to writing applications that respond to Web requests. This limitation of App Engine means you can't write any application that does any serious background computation like a search engine, email service, or RSS reader on Google App Engine. So Azure can run an entire class of applications that are simply not possible on Google App Engine.

The second key feature is that by supporting the .NET Framework, developers theoretically get a plethora of languages to choose from including Ruby (IronRuby), Python (IronPython), F#, VB.NET and C#. In practice, the Azure SDK only supports creating cloud applications using C# and VB.NET out of the box. However I can't think of any reason why it shouldn't be able to support development with other .NET enabled languages like IronPython. On the flipside, App Engine only supports Python and the timeline for it supporting other languages [and exactly which other languages] is still To Be Determined.

Finally, App Engine has a number of scalability limitations both from a data storage and a query performance perspective. Azure definitely does better than App Engine on a number of these axes. For example, App Engine has a 1MB limit per file while Azure has a 64MB limit on individual blobs and also allows you to split a blob into blocks of 4MB each. Similarly, I've been watching SQL Server Data Services (SSDS) for a while and I haven't seen or heard complaints about query performance.

Azure makes it possible for me to reuse my existing skills as a .NET developer who is savvy with using RESTful APIs to build cloud based applications without having to worry about scalability concerns (e.g. database sharding, replication strategies, server failover, etc). In addition, it puts pressure on competitors to step up to the plate and deliver. However you look at it, this is a massive WIN for Web developers.

The two small things I'd love to see addressed are first class support for IronPython and some clarity on the difference between SSDS and Windows Azure Storage services. Hopefully we can avoid a LINQ to Entities vs. LINQ to SQL-style situation in the future.

Postscript: Food for Thought

It would be interesting to read [or write] further thoughts on the pros and cons of Platform as a Service offerings when compared to Utility Computing offerings. In a previous discussion on my blog there was some consensus that utility computing approaches are more resistant to vendor lock-in than platform as a service approaches since it is easier to find multiple vendors who are providing virtual servers with LAMP/WISC hosting than it will be to find multiple vendors providing the exact same proprietary cloud APIs as Google, Amazon or Microsoft. However it would be informative to look at the topic from more angles, for instance what is the cost/benefit tradeoff of using SimpleDB/BigTable/SSDS for data access instead of MySQL running on multiple virtual hosts? With my paternity leave ending today, I doubt I'll have time to go over these topics in depth but I'd appreciate reading any such analysis.

Note Now Playing: The Game - Money Note

Categories: Life in the B0rg Cube | Windows Live

November 2, 2008

@ 02:59 PM

Comments [0]

RSS Bandit v1.8.0.855 Released

UPDATE: This release contained a bug which caused users to lose their feed lists on upgrading from previous versions. Please download v1.8.0.862 from here instead.

I've been talking about it for almost a year and now we've finally shipped it. To celebrate the new release, we've created a new website to capture the awesomeness of the latest version.

~~Download the installer from here~~

NEW FEATURES
- Download manager for viewing pending podcasts/enclosures
- Ability to synchronize feeds with Google Reader, NewsGator Online and the Windows Common Feed List.
- Option to increase or decrease font size in reading pane

BUG FIXES
- 2153584 FTP storage ignores path, always stores in root directory
- 1987359 System crashes while updating feeds
- 1974485 NewsItem copy content to clipboard only copy the link
- 1971987 Error on start: docs out of order (0 < 6 )
- 1967898 Application hangs on shut down due to search indexing
- 1965832 Exception when refreshing feeds
- 2014927 New feeds not downloaded
- 1933253 Continuously Downloading Enclosures
- 1883614 Javascript in feed content causes exceptions
- 1899299 Oldest unread message shown on alert window on new items
- 1915756 Program crash during close operation
- 1987359 System crashes while updating feeds
- 2096903 Toolbar crashes and replaced with Red X
- 1960767 Crashes on opening due to FIPS policy set on Windows Vista.

Note Now Playing: Young Jeezy - Vacation Note

Categories: RSS Bandit

October 31, 2008

@ 08:49 AM

Comments [2]

Platform Monetization is a Two Way Street: Lessons from Facebook and Sun Microsystems

Over the past few days I've been mulling over the recent news from Yahoo! that they are building a Facebook-like platform based on OpenSocial. I find this interesting given that a number of people have come to the conclusion that Facebook is slowly killing it's widget platform in order to replace it with Facebook Connect.

The key reason developers believe Facebook is killing of its platform is captured in Nick O'Neill's post entitled Scott Rafer: The Facebook Platform is Dead which states

When speaking at the Facebook developer conference today in Berlin, Scott Rafer declared that Facebook platform dead. He posted statistics including one that I posted that suggests Facebook widgets are dead. Lookery’s own statistics from Quantcast suggest that their publisher traffic has been almost halved since the new site design was released. Ultimately, I think we may see an increase in traffic as users become educated on the new design but there is no doubt that developers were impacted significantly.

So what is Scott’s solution for developers looking to thrive following the shift to the new design? Leave the platform and jump on the Facebook Connect opportunity.

The bottom line is that by moving applications off of the profile page in their recent redesign, Facebook has reduced the visibility of these application thus reducing their page views and their ability to spread virally. Some may think that the impact of these changes is unforeseen, however the Facebook team is obsessive about testing the impact of their UX changes so it is extremely unlikely that they aren't aware that the redesign would negatively impact Facebook applications.

The question to ask then is why Facebook would knowingly damaging a platform which has been uniformly praised across the industry and has had established Web players like Google and Yahoo! scrambling to deploy copycat efforts? Alex Iskold over at ReadWriteWeb believes he has the answer in his post Why Platforms Are Letting Us Down - And What They Should Do About It which contains the following excerpt

When the Facebook platform was unveiled in 2007, it was called genius. Never before had a company in a single stroke enabled others to tap into millions of its users completely free. The platform was hailed as a game changer under the famous mantra "we built it and they will come". And they did come, hundreds of companies rushing to write Facebook applications. Companies and VC funds focused specifically on Facebook apps.

It really did look like a revolution, but it didn't last. The first reason was that Facebook apps quickly arranged themselves on a power law curve. A handful of apps (think Vampires, Byte Me and Sell My Friends) landed millions of users, but those in the pack had hardly any. The second problem was, ironically, the bloat. Users polluted their profiles with Facebook apps and no one could find anything in their profiles. Facebook used to be simple - pictures, wall, friends. Now each profile features a zoo of heterogenous apps, each one trying to grab the user's attention to take advantage of the network effect. Users are confused.

Worst of all, the platform had no infrastructure to monetize the applications. When Sheryl Sandberg arrived on the scene and looked at the balance sheet, she spotted the hefty expense that was the Facebook platform. Trying to live up to a huge valuation isn't easy, and in the absense of big revenues people rush to cut costs. Since it was both an expense and users were confused less than a year after its glorious launch, Facebook decided to revamp its platform.

The latest release of Facebook, which was released in July, makes it nearly impossible for new applications to take advantage of the network effect. Now users must first install the application, then find it under the application menu or one of the tabs, then check a bunch of boxes to add it to their profile (old applications are grand-daddied in). Facebook has sent a clear message to developers - the platform is no longer a priority.

Alex's assertion is that after Facebook looked at the pros and cons of their widget platform, the company came to the conclusion that the platform was turning into a cost center instead of being away to improve the value of Facebook to its users. There is evidence that applications built on Facebook's platform did cause negative reactions from its users. For example, there was the creation of the "This has got to stop…pointless applications are ruining Facebook" group which at its height had half a million Facebook users protesting the behavior of Facebook apps. In addition, the creation of Facebook's Great Apps program along with the guiding principles for building Facebook applications implies that the Facebook team realized that applications being built on their platform typically don't have their users best interests at heart.

This brings up the interesting point that although there has been a lot of discussion on how Facebook apps make money there haven't been similar conversations on how the application platform improves Facebook's bottom line. There is definitely a win-win equation when so-called "great apps" like iLike and Causes, which positively increase user engagement, are built on Facebook's platform. However there is also a long tail of applications that try their best to spread virally at the cost of decreasing user satisfaction in the Facebook experience. These dissatisfied users likely end up reducing their usage of Facebook thus actually costing Facebook users and page views. It is quite possible that the few "great apps" built on the Facebook platform do not outweigh the amount of not-so-great apps built on the platform which have caused users to protest in the past. This would confirm Alex Iskold's suspicions about why Facebook has started sabotaging the popularity of applications built on its platform and has started emphasizing partnerships via Facebook Connect.

A similar situation has occurred with regards to the Java platform and Sun Microsystems. The sentiment is captured in a Javaworl article by Josh Fruhlinger entitled Sun melting down, and where's Java? which contains the following excerpt

one of the most interesting things about the coverage of the company's problems is how Java figures into the conversation, which is exactly not at all. In most of the articles, the word only appears as Sun's stock ticker; the closest I could find to a mention is in this AP story, which notes that "Sun's strategy of developing free, 'open-source' software and giving it away to spur sales of its high-end computer servers and support services hasn't paid off as investors would like." Even longtime tech journalist Ashlee Vance, when gamely badgering Jon Schwartz for the New York Time about whether Sun would sell its hardware division and focus on software, only mentions Solaris and MySQL in discussing the latter.

Those in the Java community no doubt believe that Java is too big to fail, that Sun can't abandon it because it's too important, even if it can't tangibly be tied to anything profitable. But if Sun's investors eventually dismember the company to try to extract what value is left in it, I'm not sure where Java will fit into that plan.

It is interesting to note that after a decade of investment in the Java platform, it is hard to point to what concrete benefits Sun has gotten from being the originator and steward of the Java platform and programming language. Definitely another example of a platform that may have benefited applications built on it yet which didn't really benefit the platform vendor as expected.

The lesson here is that building a platform isn't just about making the developers who use the platform successful but also making sure that the platform itself furthers the goals of its developers in the first place.

Note Now Playing: Kardinal Offishall - Dangerous (Feat. Akon) Note

Categories: Platforms

October 31, 2008

@ 08:47 AM

Comments [4]

Russell Beattie was Right, the Mobile Web is Dead

A couple of months ago, Russell Beattie wrote a post about the end of his startup entitled The end of Mowser which is excerpted below

The argument up to now has been simply that there are roughly 3 billion phones out there, and that when these phones get on the Internet, their vast numbers will outweigh PCs and tilt the market towards mobile as the primary web device. The problem is that these billions of users *haven't* gotten on the Internet, and they won't until the experience is better and access to the web is barrier-free - and that means better devices and "full browsers". Let's face it, you really aren't going to spend any real time or effort browsing the web on your mobile phone unless you're using Opera Mini, or have a smart phone with a decent browser - as any other option is a waste of time, effort and money. Users recognize this, and have made it very clear they won't be using the "Mobile Web" as a substitute for better browsers, rather they'll just stay away completely.
…
In fact, if you look at the number of page views of even the most popular mobile-only websites out there, they don't compare to the traffic of popular blogs, let alone major portals or social networks.

Let me say that again clearly, the mobile traffic just isn't there. It's not there now, and it won't be.

What's going to drive that traffic eventually? Better devices and full-browsers. M-Metrics recently spelled it out very clearly - in the US 85% of iPhone owners browsed the web vs. 58% of smartphone users, and only 13% of the overall mobile market. Those numbers *may* be higher in other parts of the world, but it's pretty clear where the trend line is now. (What a difference a year makes.) It would be easy to say that the iPhone "disrupted" the mobile web market, but in fact I think all it did is point out that there never was one to begin with.

I filed away Russell's post as interesting at the time but hadn't really experienced it first hand until recently. I recently switched to using Skyfire as my primary browser on my mobile phone and it has made a world of difference in how a use my phone. No longer am I restricted to crippled versions of popular sites nor do I have to lose features when I visit the regular versions of the page. I can view the real version of my news feed on Facebook. Vote up links in reddit or Digg. And reading blogs is no longer an exercise in frustration due to CSS issues or problems rendering widgets. Unsurprisingly my usage of the Web on my phone has pretty much doubled.

This definitely brings to the forefront how ridiculous of an idea it was to think that we need a "mobile Web" complete with its own top level domain (.mobi). Which makes more sense, that every Web site in the world should create duplicate versions of their pages for mobile phones and regular browsers or that software + hardware would eventually evolve to the point where I can run a full fledged browser on the device in my pocket? Thanks to the iPhone, it is now clear to everyone that this idea of a second class Web for mobile phones was a stopgap solution at best whose time is now past.

One other thing I find interesting is treating the iPhone as a separate category from "smartphones" in the highlighted quote. This is similar to a statement made by Walt Mossberg when he reviewed Google's Android. That article began as follows

In the exciting new category of modern hand-held computers — devices that fit in your pocket but are used more like a laptop than a traditional phone — there has so far been only one serious option. But that will all change on Oct. 22, when T-Mobile and Google bring out the G1, the first hand-held computer that’s in the same class as Apple’s iPhone.

The key feature that the iPhone and Android have in common that separates them from regular "smartphones" is that they both include a full featured browser based on Webkit. The other features like downloadable 3rd party applications, wi-fi support, rich video support, GPS, and so on have been available on phones running Windows Mobile for years. This shows how important having a full Web experience was for mobile phones and just how irrelevant the notion of a "mobile Web" has truly become.

Note Now Playing: Kilo Ali - Lost Y'all Mind Note

Categories: Technology

October 27, 2008

@ 07:39 PM

Comments [4]

Windows Live is now an OpenID Provider

The second most interesting announcement out of PDC this morning is that Windows Live ID is becoming an OpenID Provider. The information below explains how to try it out and give feedback to the team responsible.

Try It Now. Tell Us What You Think

We want you to try the Windows Live ID OpenID Provider CTP release, let us know your feedback, and tell us about any problems you find.

To prepare:

Go to https://login.live-int.com and use the sign-up button to set up a Windows Live ID test account in the INT environment.

Go to https://login.live-int.com/beta/ManageOpenID.srf to set up your OpenID test alias.

Then:

Users - At any Web site that supports OpenID 2.0, type openid.live-INT.com in the OpenID login box to sign in to that site by means of your Windows Live ID OpenID alias.

Library developers - Test your libraries against the Windows Live ID OP endpoint and let us know of any problems you find.

Web site owners - Test signing in to your site by using a Windows Live ID OpenID alias and let us know of any problems you find.

You can send us feedback at:

E-mail - openidfb@microsoft.com

This is awesome news. I've been interested in Windows Live supporting OpenID for a while and I'm glad to see that we've taken the plunge. Please try it out and send the team your feedback.

I've tried it out already and sent some initial feedback. In general, my feedback was on applying the lessons from the Yahoo! OpenID Usability Study since it looks like our implementation has some of the same usability issues that inspired Jeff Atwood's rants about Yahoo's OpenID implementation. Since it is still a Community Technology Preview, I'm sure the user experience will improve as feedback trickles in.

Kudos to Jorgen Thelin and the rest of the folks on the Identity Services team for getting this out. Great work, guys.

UPDATE: Angus Logan posted a comment with a link to the following screencast of the current user experience when using Windows Live ID as an OpenID provider experience

Note Now Playing: Christina Aguilera - Keeps Gettin' Better Note

Categories: Windows Live

October 27, 2008

@ 05:39 PM

Comments [3]

Microsoft Announces Windows Azure

Just because you aren't attending Microsoft's Professional Developer Conference doesn't mean you can't follow the announcements. The most exciting announcement so far [from my perspective] has been Windows Azure which is described as follows from the official site

The Azure™ Services Platform (Azure) is an internet-scale cloud services platform hosted in Microsoft data centers, which provides an operating system and a set of developer services that can be used individually or together. Azure’s flexible and interoperable platform can be used to build new applications to run from the cloud or enhance existing applications with cloud-based capabilities. Its open architecture gives developers the choice to build web applications, applications running on connected devices, PCs, servers, or hybrid solutions offering the best of online and on-premises.

Azure reduces the need for up-front technology purchases, and it enables developers to quickly and easily create applications running in the cloud by using their existing skills with the Microsoft Visual Studio development environment and the Microsoft .NET Framework. In addition to managed code languages supported by .NET, Azure will support more programming languages and development environments in the near future. Azure simplifies maintaining and operating applications by providing on-demand compute and storage to host, scale, and manage web and connected applications. Infrastructure management is automated with a platform that is designed for high availability and dynamic scaling to match usage needs with the option of a pay-as-you-go pricing model. Azure provides an open, standards-based and interoperable environment with support for multiple internet protocols, including HTTP, REST, SOAP, and XML.

It will be interesting to read what developers make of this announcement and what kind of apps start getting built on this platform. I'll also be on the look out for any in depth discussions on the platform, there is lots to chew on in this announcement.

For a quick overview of what Azure means to developers, take a look at Azure for Business and Azure for Web Developers.

Note Now Playing: Guns N' Roses - Welcome to the Jungle Note

Categories: Platforms | Windows Live

Dare Obasanjo's weblog

"You can buy cars but you can't buy respect in the hood" - Curtis Jackson

Navigation for Sunday, 09 November 2008 - Dare Obasanjo's weblog