Friday, 16 January 2009 - Dare Obasanjo's weblog

January 16, 2009

@ 02:23 PM

Building Scalable Databases: Pros and Cons of Various Database Sharding Schemes

Database sharding is the process of splitting up a database across multiple machines to improve the scalability of an application. The justification for database sharding is that after a certain scale point it is cheaper and more feasible to scale a site horizontally by adding more machines than to grow it vertically by adding beefier servers.

Why Shard or Partition your Database?

Let's take Facebook.com as an example. In early 2004, the site was mostly used by Harvard students as a glorified online yearbook. You can imagine that the entire storage requirements and query load on the database could be handled by a single beefy server. Fast forward to 2008 where just the Facebook application related page views are about 14 billion a month (which translates to over 5,000 page views per second, each of which will require multiple backend queries to satisfy). Besides query load with its attendant IOPs, CPU and memory cost there's also storage capacity to consider. Today Facebook stores 40 billion physical files to represent about 10 billion photos which is over a petabyte of storage. Even though the actual photo files are likely not in a relational database, their metadata such as identifiers and locations still would require a few terabytes of storage to represent these photos in the database. Do you think the original database used by Facebook had terabytes of storage available just to store photo metadata?

At some point during the development of Facebook, they reached the physical capacity of their database server. The question then was whether to scale vertically by buying a more expensive, beefier server with more RAM, CPU horsepower, disk I/O and storage capacity or to spread their data out across multiple relatively cheap database servers. In general if your service has lots of rapidly changing data (i.e. lots of writes) or is sporadically queried by lots of users in a way which causes your working set not to fit in memory (i.e. lots of reads leading to lots of page faults and disk seeks) then your primary bottleneck will likely be I/O. This is typically the case with social media sites like Facebook, LinkedIn, Blogger, MySpace and even Flickr. In such cases, it is either prohibitively expensive or physically impossible to purchase a single server to handle the load on the site. In such situations sharding the database provides excellent bang for the buck with regards to cost savings relative to the increased complexity of the system.

Now that we have an understanding of when and why one would shard a database, the next step is to consider how one would actually partition the data into individual shards. There are a number of options and their individual tradeoffs presented below – Pseudocode / Joins

How Sharding Changes your Application

In a well designed application, the primary change sharding adds to the core application code is that instead of code such as

//string connectionString = @"Driver={MySQL};SERVER=dbserver;DATABASE=CustomerDB;"; <-- should be in web.config
string connectionString = ConfigurationSettings.AppSettings["ConnectionInfo"];          
OdbcConnection conn = new OdbcConnection(connectionString);
conn.Open();
          
OdbcCommand cmd = new OdbcCommand("SELECT Name, Address FROM Customers WHERE CustomerID= ?", conn);
OdbcParameter param = cmd.Parameters.Add("@CustomerID", OdbcType.Int);
param.Value = customerId; 
OdbcDataReader reader = cmd.ExecuteReader();

the actual connection information about the database to connect to depends on the data we are trying to store or access. So you'd have the following instead

string connectionString = GetDatabaseFor(customerId);          
OdbcConnection conn = new OdbcConnection(connectionString);
conn.Open();
         
OdbcCommand cmd = new OdbcCommand("SELECT Name, Address FROM Customers WHERE CustomerID= ?", conn);
OdbcParameter param = cmd.Parameters.Add("@CustomerID", OdbcType.Int);
param.Value = customerId; 
OdbcDataReader reader = cmd.ExecuteReader();

the assumption here being that the GetDatabaseFor() method knows how to map a customer ID to a physical database location. For the most part everything else should remain the same unless the application uses sharding as a way to parallelize queries.

A Look at a Some Common Sharding Schemes

There are a number of different schemes one could use to decide how to break up an application database into multiple smaller DBs. Below are four of the most popular schemes used by various large scale Web applications today.

Vertical Partitioning: A simple way to segment your application database is to move tables related to specific features to their own server. For example, placing user profile information on one database server, friend lists on another and a third for user generated content like photos and blogs. The key benefit of this approach is that is straightforward to implement and has low impact to the application as a whole. The main problem with this approach is that if the site experiences additional growth then it may be necessary to further shard a feature specific database across multiple servers (e.g. handling metadata queries for 10 billion photos by 140 million users may be more than a single server can handle).
Range Based Partitioning: In situations where the entire data set for a single feature or table still needs to be further subdivided across multiple servers, it is important to ensure that the data is split up in a predictable manner. One approach to ensuring this predictability is to split the data based on values ranges that occur within each entity. For example, splitting up sales transactions by what year they were created or assigning users to servers based on the first digit of their zip code. The main problem with this approach is that if the value whose range is used for partitioning isn't chosen carefully then the sharding scheme leads to unbalanced servers. In the previous example, splitting up transactions by date means that the server with the current year gets a disproportionate amount of read and write traffic. Similarly partitioning users based on their zip code assumes that your user base will be evenly distributed across the different zip codes which fails to account for situations where your application is popular in a particular region and the fact that human populations vary across different zip codes.
Key or Hash Based Partitioning: This is often a synonym for user based partitioning for Web 2.0 sites. With this approach, each entity has a value that can be used as input into a hash function whose output is used to determine which database server to use. A simplistic example is to consider if you have ten database servers and your user IDs were a numeric value that was incremented by 1 each time a new user is added. In this example, the hash function could be perform a modulo operation on the user ID with the number ten and then pick a database server based on the remainder value. This approach should ensure a uniform allocation of data to each server. The key problem with this approach is that it effectively fixes your number of database servers since adding new servers means changing the hash function which without downtime is like being asked to change the tires on a moving car.
Directory Based Partitioning: A loosely couples approach to this problem is to create a lookup service which knows your current partitioning scheme and abstracts it away from the database access code. This means the GetDatabaseFor() method actually hits a web service or a database which actually stores/returns the mapping between each entity key and the database server it resides on. This loosely coupled approach means you can perform tasks like adding servers to the database pool or change your partitioning scheme without having to impact your application. Consider the previous example where there are ten servers and the hash function is a modulo operation. Let's say we want to add five database servers to the pool without incurring downtime. We can keep the existing hash function, add these servers to the pool and then run a script that copies data from the ten existing servers to the five new servers based on a new hash function implemented by performing the modulo operation on user IDs using the new server count of fifteen. Once the data is copied over (although this is tricky since users are always updating their data) the lookup service can change to using the new hash function without any of the calling applications being any wiser that their database pool just grew 50% and the database they went to for accessing John Doe's pictures five minutes ago is different from the one they are accessing now.

Problems Common to all Sharding Schemes

Once a database has been sharded, new constraints are placed on the operations that can be performed on the database. These constraints primarily center around the fact that operations across multiple tables or multiple rows in the same table no longer will run on the same server. Below are some of the constraints and additional complexities introduced by sharding

Joins and Denormalization – Prior to sharding a database, any queries that require joins on multiple tables execute on a single server. Once a database has been sharded across multiple servers, it is often not feasible to perform joins that span database shards due to performance constraints since data has to be compiled from multiple servers and the additional complexity of performing such cross-server.

A common workaround is to denormalize the database so that queries that previously required joins can be performed from a single table. For example, consider a photo site which has a database which contains a user_info table and a photos table. Comments a user has left on photos are stored in the photos table and reference the user's ID as a foreign key. So when you go to the user's profile it takes a join of the user_info and photos tables to show the user's recent comments. After sharding the database, it now takes querying two database servers to perform an operation that used to require hitting only one server. This performance hit can be avoided by denormalizing the database. In this case, a user's comments on photos could be stored in the same table or server as their user_info AND the photos table also has a copy of the comment. That way rendering a photo page and showing its comments only has to hit the server with the photos table while rendering a user profile page with their recent comments only has to hit the server with the user_info table.

Of course, the service now has to deal with all the perils of denormalization such as data inconsistency (e.g. user deletes a comment and the operation is successful against the user_info DB server but fails against the photos DB server because it was just rebooted after a critical security patch).
Referential integrity – As you can imagine if there's a bad story around performing cross-shard queries it is even worse trying to enforce data integrity constraints such as foreign keys in a sharded database. Most relational database management systems do not support foreign keys across databases on different database servers. This means that applications that require referential integrity often have to enforce it in application code and run regular SQL jobs to clean up dangling references once they move to using database shards.

Dealing with data inconsistency issues due to denormalization and lack of referential integrity can become a significant development cost to the service.
Rebalancing (Updated 1/21/2009) – In some cases, the sharding scheme chosen for a database has to be changed. This could happen because the sharding scheme was improperly chosen (e.g. partitioning users by zip code) or the application outgrows the database even after being sharded (e.g. too many requests being handled by the DB shard dedicated to photos so more database servers are needed for handling photos). In such cases, the database shards will have to be rebalanced which means the partitioning scheme changed AND all existing data moved to new locations. Doing this without incurring down time is extremely difficult and not supported by any off-the-shelf today. Using a scheme like directory based partitioning does make rebalancing a more palatable experience at the cost of increasing the complexity of the system and creating a new single point of failure (i.e. the lookup service/database).

Further Reading

Sharding for startups by Eric Ries
Federation at Flickr: Doing Billions of Queries Per Day by Dathan Vance Pattishall

Note Now Playing: The Kinks - You Really Got Me Note

Categories: Web Development

January 12, 2009

@ 02:10 PM

Comments [9]

Can RDF really save us from data format proliferation?

Bill de hÓra has a blog post entitled Format Debt: what you can't say where he writes

The closest thing to a deployable web technology that might improve describing these kind of data mashups without parsing at any cost or patching is RDF. Once RDF is parsed it becomes a well defined graph structure - albeit not a structure most web programmers will be used to, it is however the same structure regardless of the source syntax or the code and the graph structure is closed under all allowed operations.

If we take the example of MediaRSS, which is not consistenly used or placed in syndication and API formats, that class of problem more or less evaporates via RDF. Likewise if we take the current Zoo of contact formats and our seeming inability to commit to one, RDF/OWL can enable a declarative mapping between them. Mapping can reduce the number of man years it takes to define a "standard" format by not having to bother unifying "standards" or getting away with a few thousand less test cases.

I've always found this particular argument by RDF proponents to be suspect. When I complained about the the lack of standards for representing rich media in Atom feeds, the thrust of the complaint is that you can't just plugin a feed from Picassa into a service that understands how to process feeds from Zooomr without making changes to the service or the input feed.

RDF proponents often to argue that if we all used RDF based formats then instead of having to change your code to support every new photo site's Atom feed with custom extensions, you could instead create a mapping from the format you don't understand to the one you do using something like the OWL Web Ontology Language. The problem with this argument is that there is a declarative approach to mapping between XML data formats without having to boil the ocean by convincing everyone to switch to RD; XSL Transformations (XSLT).

The key problem is that in both cases (i.e. mapping with OWL vs. mapping with XSLT) there is still the problem that Picassa feeds won't work with an app that understand's Zoomr's feeds until some developer writes code. Thus we're really debating on whether it is ~~better~~ cheaper to have the developer write declarative mappings like OWL or XSLT instead of writing new parsing code in their language of choice.

In my experience I've seen that creating a software system where you can drop in an XSLT, OWL or other declarative mapping document to deal with new data formats is cheaper and likely to be less error prone than having to alter parsing code written in C#, Python, Ruby or whatever. However we don't need RDF or other Semantic Web technologies to build such solution today. XSLT works just fine as a tool for solving exactly that problem.

Note Now Playing: Lady GaGa & Colby O'Donis - Just Dance Note

Categories: Syndication Technology | XML

January 12, 2009

@ 02:03 PM

Comments [0]

Upcoming Conference Appearance: SXSW '09

It looks like I'll be participating in two panels at the upcoming SXSW Interactive Festival. The descriptions of the panels are below

Feed Me: Bite Size Info for a Hungry Internet

In our fast-paced, information overload society, users are consuming shorter and more frequent content in the form of blogs, feeds and status messages. This panel will look at the social trends, as well as the technologies that makes feed-based communication possible. Led by Ari Steinberg, an engineering manager at Facebook who focuses on the development of News Feed.

Post Standards: Creating Open Source Specs

Many of the most interesting new formats on the web are being developed outside the traditional standards process; Microformats, OpenID, OAuth, OpenSocial, and originally Jabber — four out of five of these popular new specs have been standardized by the IETF, OASIS, or W3C. But real hackers are bringing their implementations to projects ranging from open source apps all the way up to the largest companies in the technology industry. While formal standards bodies still exist, their role is changing as open source communities are able to develop specifications, build working code, and promote it to the world. It isn't that these communities don't see the value in formal standardization, but rather that their needs are different than what formal standards bodies have traditionally offered. They care about ensuring that their technologies are freely implementable and are built and used by a diverse community where anyone can participate based on merit and not dollars. At OSCON last year, the Open Web Foundation was announced to create a new style of organization that helps these communities develop open specifications for the web. This panel brings together community leaders from these technologies to discuss the "why" behind the Open Web Foundation and how they see standards bodies needing to evolve to match lightweight community driven open specifications for the web.

If you'll be at SxSw and are a regular reader of my blog who would like to chat in person, feel free to swing by during one or both panels. I'd also be interested in what people who plan to attend either panel would like to get out of the experience. Let me know in the comments.

Note Now Playing: Estelle - American Boy (feat. Kanye West) Note

Categories: Trip Report

January 10, 2009

@ 01:43 PM

Comments [0]

Live Mesh Wins Best Technology Innovation/Achievement at Crunchies 2008

Angus Logan has the scoop

I’m in San Francisco at the 2008 Crunchie Awards and after ~ 350k votes were cast Ray Ozzie and David Treadwell accepted the award for Best Technology Innovation/Achievement on behalf of the Live Mesh team.

The Crunchies are an annual competition co-hosted by GigaOm, VentureBeat, Silicon Alley Insider, and TechCrunch which culminates in an award the most compelling startup, internet and technology innovations.

Kudos to the Live Mesh folks on getting this award. I can't wait to see what 2009 brings for this product.

PS: I noticed from the TechCrunch post that Facebook Connect was the runner up. I have to give an extra special shout out to my friend Mike for being a key figure behind two of the most innovative technology products of 2008. Nice work man.

Categories: Windows Live

January 9, 2009

@ 05:46 PM

Comments [2]

Windows Live What's New Sidebar Gadget (beta) now available

Since we released the latest version of the Windows Live what's new feed which shows what's been going on with your social network at http://home.live.com, we've gotten repeated asks to provide a Windows Vista gadget so people can keep up with their social circle directly from their desktop.

You asked, and now we've delivered. Get it from here.

What I love most about this gadget is that a huge chunk of the work to get this out the door was done by our summer interns from 2008. I love how interns can be around for a short time but provide a ton of bang for the buck while they are here. Hope you enjoy the gadget as much as I have.

Note Now Playing: Akon, Lil Wayne & Young Jeezy - I'm So Paid Note

Categories: Windows Live

January 9, 2009

@ 08:01 AM

Comments [2]

A People-Centric Contact Management Experience in a Smart Phone

From Palm Pre and Palm WebOS in-depth look we learn

The star of the show was the new Palm WebOS. It's not just a snazzy new touch interface. It's a useful system with some thoughtful ideas that we've been looking for. First of all, the Palm WebOS takes live, while-you-type searching to a new level. On a Windows Mobile phone, typing from the home screen initiates a search of the address book. On the Palm WebOS, typing starts a search of the entire phone, from contacts through applications and more. If the phone can't find what you need, it offers to search Google, Maps and Wikipedia. It's an example of Palm's goal to create a unified, seamless interface.

Other examples of this unified philosophy can be found in the calendar, contacts and e-mail features. The Palm Pre will gather all of your information from your Exchange account, your Gmail account and your Facebook account and display them in a single, unduplicated format. The contact listing for our friend Dave might draw his phone number from our Exchange account, his e-mail address from Gmail and Facebook, and instant messenger from Gtalk. All of these are combined in a single entry, with a status indicator to show if Dave is available for IM chats.

This is the holy grail of contact management experiences on a mobile phone. Today I use Exchange as the master for my contact records and then use tools like OutSync to merge in contact data for my Outlook contacts who are also on Facebook before pushing it all down to my Windows Mobile phone (the AT&T Tilt). Unfortunately this is a manual process and I have to be careful of creating duplicates when importing contacts from different places.

If the Palm Pre can do this automatically in a "live" anmd always connected way without creating duplicate or useless contacts (e.g. Facebook friends with no phone or IM info shouldn't take up space in my phone contact list) then I might have to take this phone for a test drive.

Anyone at CES get a chance to play with the device up close?

Note Now Playing: Hootie & The Blowfish - Only Wanna Be With You Note

Categories: Social Software | Technology

January 8, 2009

@ 01:02 PM

Comments [8]

Representing Rich Media and Social Network Activities in RSS/Atom Feeds

As I've mentioned previously, one of the features we shipped in the most recent release of Windows Live is the ability to import your activities from photo sharing sites like Flickr and PhotoBucket or even blog posts from a regular RSS/Atom feed onto your Windows Live profile. You can see this in action on my Windows Live profile.

One question that has repeatedly come up for our service and others like it, is how users can get a great experience from just importing RSS/Atom feeds of sites that we don't support in a first class way. A couple of weeks ago Dave Winer asked this of FriendFeed in his post FriendFeed and level playing fields where he writes

Consider this screen (click on it to see the detail):

Suppose you used a photo site that wasn't one of the ones listed, but you had an RSS feed for your photos and favorites on that site. What are you supposed to do? I always assumed you should just add the feed under "Blog" but then your readers will start asking why your pictures don't do all the neat things that happen automatically with Flickr, Picasa, SmugMug or Zooomr sites. I have such a site, and I don't want them to do anything special for it, I just want to tell FF that it's a photo site and have all the cool special goodies they have for Flickr kick in automatically.

If you pop up a higher level, you'll see that this is actually contrary to the whole idea of feeds, which were supposed to create a level playing field for the big guys and ordinary people.

We have a similar problem when importing arbitrary RSS/Atom feeds onto a user's profile in Windows Live. For now, we treat each imported RSS feed as a blog entry and assume it has a title and a body that can be used as a summary. This breaks down if you are someone like Kevin Radcliffe who would like to import his Picasa Web albums. At this point we run smack-dab into the fact that there aren't actually consistent standards around how to represent photo albums from photo sharing sites in Atom/RSS feeds.

Let's look at the RSS/Atom feeds from three of the sites that Dave names that aren't natively supported by Windows Live's Web Activities feature.

Picassa

<item>
  <guid isPermaLink='false'>http://picasaweb.google.com/data/entry/base/user/bo.so.po.ro.sie/albumid/5280893965532109969/photoid/5280894045331336242?alt=rss&amp;hl=en_US</guid>
  <pubDate>Wed, 17 Dec 2008 22:45:59 +0000</pubDate>
  <atom:updated>2008-12-17T22:45:59.000Z</atom:updated>
  <category domain='http://schemas.google.com/g/2005#kind'>http://schemas.google.com/photos/2007#photo</category>
  <title>DSC_0479.JPG</title>
  <description>&lt;table&gt;&lt;tr&gt;&lt;td style="padding: 0 5px"&gt;&lt;a href="http://picasaweb.google.com/bo.so.po.ro.sie/DosiaIPomaraCze#5280894045331336242"&gt;&lt;img style="border:1px solid #5C7FB9" src="http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/s288/DSC_0479.JPG" alt="DSC_0479.JPG"/&gt;&lt;/a&gt;&lt;/td&gt;&lt;td valign="top"&gt;&lt;font color="#6B6B6B"&gt;Date: &lt;/font&gt;&lt;font color="#333333"&gt;Dec 17, 2008 10:56 AM&lt;/font&gt;&lt;br/&gt;&lt;font color=\"#6B6B6B\"&gt;Number of Comments on Photo:&lt;/font&gt;&lt;font color=\"#333333\"&gt;0&lt;/font&gt;&lt;br/&gt;&lt;p&gt;&lt;a href="http://picasaweb.google.com/bo.so.po.ro.sie/DosiaIPomaraCze#5280894045331336242"&gt;&lt;font color="#3964C2"&gt;View Photo&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;</description>
  <enclosure type='image/jpeg' url='http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/DSC_0479.JPG' length='0'/>
  <link>http://picasaweb.google.com/lh/photo/PORshBK0wdBV0WPl27g_wQ</link>
  <media:group>
    <media:title type='plain'>DSC_0479.JPG</media:title>
    <media:description type='plain'></media:description>
    <media:keywords></media:keywords>
    <media:content url='http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/DSC_0479.JPG' height='1600' width='1074' type='image/jpeg' medium='image'/>
    <media:thumbnail url='http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/s72/DSC_0479.JPG' height='72' width='49'/>
    <media:thumbnail url='http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/s144/DSC_0479.JPG' height='144' width='97'/>
    <media:thumbnail url='http://lh4.ggpht.com/_xRL2P3zJJOw/SUmBJ6RzLDI/AAAAAAAABX8/MkPUBcKqpRY/s288/DSC_0479.JPG' height='288' width='194'/>
    <media:credit>Joanna</media:credit>
  </media:group>
</item>

Smugmug

<entry>
   <title>Verbeast's photo</title>
   <link rel="alternate" type="text/html" href="http://verbeast.smugmug.com/gallery/5811621_NELr7#439421133_qFtZ5"/>
   <content type="html">&lt;p&gt;&lt;a href="http://verbeast.smugmug.com"&gt;Verbeast&lt;/a&gt; &lt;/p&gt;&lt;a href="http://verbeast.smugmug.com/gallery/5811621_NELr7#439421133_qFtZ5" title="Verbeast's photo"&gt;&lt;img src="http://verbeast.smugmug.com/photos/439421133_qFtZ5-Th.jpg" width="150" height="150" alt="Verbeast's photo" title="Verbeast's photo" style="border: 1px solid #000000;" /&gt;&lt;/a&gt;</content>
   <updated>2008-12-18T22:51:58Z</updated>
   <author>
     <name>Verbeast</name>
     <uri>http://verbeast.smugmug.com</uri>
   </author>
   <id>http://verbeast.smugmug.com/photos/439421133_qFtZ5-Th.jpg</id>
   <exif:DateTimeOriginal>2008-12-12 18:37:17</exif:DateTimeOriginal>
 </entry>

Zooomr

 <item>
      <title>ギンガメアジとジンベイ</title>
      <link>http://www.zooomr.com/photos/chuchu/6556014/</link>
      <description>
        &lt;a href=&quot;http://www.zooomr.com/photos/chuchu/&quot;&gt;chuchu&lt;/a&gt; posted a photograph:&lt;br /&gt;

        &lt;a href=&quot;http://www.zooomr.com/photos/chuchu/6556014/&quot; class=&quot;image_link&quot; &gt;&lt;img src=&quot;http://static.zooomr.com/images/6556014_00421b6456_m.jpg&quot; alt=&quot;ギンガメアジとジンベイ&quot; title=&quot;ギンガメアジとジンベイ&quot;  /&gt;&lt;/a&gt;&lt;br /&gt;

      </description>
      <pubDate>Mon, 22 Dec 2008 04:14:52 +0000</pubDate>
      <author zooomr:profile="http://www.zooomr.com/people/chuchu/">nobody@zooomr.com (chuchu)</author>
      <guid isPermaLink="false">tag:zooomr.com,2004:/photo/6556014</guid>
      <media:content url="http://static.zooomr.com/images/6556014_00421b6456_m.jpg" type="image/jpeg" />
      <media:title>ギンガメアジとジンベイ</media:title>
      <media:text type="html">
        &lt;a href=&quot;http://www.zooomr.com/photos/chuchu/&quot;&gt;chuchu&lt;/a&gt; posted a photograph:&lt;br /&gt;

        &lt;a href=&quot;http://www.zooomr.com/photos/chuchu/6556014/&quot; class=&quot;image_link&quot; &gt;&lt;img src=&quot;http://static.zooomr.com/images/6556014_00421b6456_m.jpg&quot; alt=&quot;ギンガメアジとジンベイ&quot; title=&quot;ギンガメアジとジンベイ&quot;  /&gt;&lt;/a&gt;&lt;br /&gt;

      </media:text>
      <media:thumbnail url="http://static.zooomr.com/images/6556014_00421b6456_s.jpg" height="75" width="75" />
      <media:credit role="photographer">chuchu</media:credit>
      <media:category scheme="urn:zooomr:tags">海遊館 aquarium kaiyukan osaka japan</media:category>
    </item>

As you can see from the above XML snippets there is no consistency in how they represent photo streams. Even though both Picasa and Zoomr use Yahoo's Media RSS extensions, they generate different markup. Picasa has the media extensions as a set of elements within a media:group element that is a child of the item element while Zooomr simply places a grab bag of Media RSS elements such including media:thumbnail and media:content as children of the item element. Smugmug takes the cake by simply tunneling some escaped HTML in the atom:content element instead of using explicit metadata to describe the photos.

The bottom line is that it isn't possible to satisfy Dave Winer's request and create a level playing field today because there are no consistently applied standards for representing photo streams in RSS/Atom. This is unfortunate because it means that services have to write one off code (aka the beautiful fucking snowflake problem) for each photo sharing site they want to integrate with. Not only is this a lot of unnecessary code, it also prevents such integration from being a simple plug and play experience for users of social aggregation services.

So far, the closest thing to a standard in this space is Media RSS but as the name states it is an RSS based format and really doesn't fit with the Atom syndication format's data model. This is why Martin Atkins has started working on Atom Media Extensions which is an effort to create a similar set of media extensions for the Atom syndication format.

What I like about the first draft of Atom media extensions is that it is focused on the basic case of syndicating audio, video and image for use in activity streams and doesn't have some of the search related and feed republishing baggage you see in related formats like Media RSS or iTunes RSS extensions.

The interesting question is how to get the photo sites out there to adopt consistent standards in this space? Maybe we can get Google to add it to their Open Stack™ since they've been pretty good at getting social sites to adopt their standards and have been generally good at evangelization.

Note Now Playing: DMX - Ruff Ryders' Anthem Note

Categories: Syndication Technology

January 8, 2009

@ 12:59 PM

Comments [4]

Windows Live Tip: How to Add your Flickr Photos to your Windows Live Profile

One of the features we recently shipped in Windows Live is the ability to link your Windows Live profile to your Flickr account so that whenever you add photos to Flickr they show up on your profile and in the What's New list of members of your network in Windows Live. Below are the steps to adding your Flickr photos to your Windows Live profile.

1. Go to your Windows Live profile at http://profile.live.com and locate the link to your Web Activities on the bottom left

2. Click the link to add Web Activities which should take you to http://profile.live.com/WebActivities/ shown below. Locate Flickr on that page.

3.) Click on the "Add" link for Flickr which should take you to http://profile.live.com/WebActivities/Add.aspx?appid=1073750531 shown below

4.) Click on the link to sign-in to Flickr. This should take you to the Flickr sign-in page shown below (if you aren't signed in)

5.) After signing in, you will need to grant Windows Live access to your Flickr photo stream. Click the "OK I'll Allow It" button shown below

6.) You should then be redirected to Windows Live where you can complete the final step and link both accounts. In addition, you can decide who should able to view your Flickr photos on your Windows Live profile as shown below

7.) After pushing the "Add" button you should end up back on your profile with your Flickr information now visible on it.

A.) People in your network can now see your Flickr updates in various Windows Live applications including Windows Live Messenger as shown below

PS: The same basic set of steps work for adding activities from Twitter, Pandora, StumbleUpon, Flixster, PhotoBucket, Yelp, iLike, blogs hosted on Wordpress.com, or from any RSS/Atom feed to your Windows Live profile. Based on announcements at CES yesterday, you'll soon be able to add your activities from Facebook to Windows Live as well.

Note Now Playing: DMX - Party Up (Up in Here) Note

Categories: Windows Live

January 7, 2009

@ 03:22 PM

Comments [3]

Some Thoughts on Choosing Partition Keys in Windows Azure's Table Storage

In my recent post on building a Twitter search engine on Windows Azure I questioned the need the expose the notion of both partition and row keys to developers on the platforms. Since then I've had conversations with a couple of folks at work that indicate that I should have stated my concerns more explicitly. So here goes.

The documentation on Understanding the Windows Azure Table Storage Data Model states the following

PartitionKey Property

Tables are partitioned to support load balancing across storage nodes. A table's entities are organized by partition. A partition is a consecutive range of entities possessing the same partition key value. The partition key is a unique identifier for the partition within a given table, specified by the PartitionKey property. The partition key forms the first part of an entity's primary key. The partition key may be a string value up to 32 KB in size.

You must include the PartitionKey property in every insert, update, and delete operation.

RowKey Property

The second part of the primary key is the row key, specified by the RowKey property. The row key is a unique identifier for an entity within a given partition. Together the PartitionKey and RowKey uniquely identify every entity within a table.

The row key is a string value that may be up to 32 KB in size.

You must include the RowKey property in every insert, update, and delete operation.

In my case I'm building an application to represent users in a social network and each user is keyed by user ID (e.g. their Twitter user name). In my application I only have one unique key and it identifies each row of user data (e.g. profile pic, location, latest tweet, follower count, etc). My original intuition was to use the unique ID as the row key while letting the partition key be a single value. The purpose of the partition key is that it is a hint to say this data belongs on the same machine which in my case seems like overkill.

Where this design breaks down is when I actually end up storing more data than the Windows Azure system can or wants to fit on a single storage node. For example, what if I've actually built a Facebook crawler (140 million users) and I cache people's profile pics locally (10kilobytes). This ends up being 1.3 terabytes of data. I highly doubt that the Azure system will be allocating 1.3 terabytes of storage on a single server for a single developer and even if it did the transaction performance would suffer. So the only reasonable assumption is that the data will either be split across various nodes at some threshold [which the developer doesn't know] or at some point the developer gets a "disk full error" (i.e. a bad choice which no platform would make).

On the other hand, if I decide to use the user ID as the partition key then I am in essence allowing the system to theoretically store each user on a different machine or at least split up my data across the entire cloud. That sucks for me if all I have is three million users for which I'm only storing 1K of data so it could fit on a single storage node. Of course, the Windows Azure system could be smart enough to not split up my data since it fits underneath some threshold [which the developer doesn't know]. And this approach also allows the system to take advantage of parallelism across multiple machines if it does split my data.

Thus I'm now leaning towards the user ID being the partition key instead of the row key. So what advice do the system's creators actually have for developers?

Well from the discussion thread POST to Azure tables w/o PartitionKey/RowKey: that's a bug, right? on the MSDN forums there is the following advice from Niranjan Nilakantan of Microsoft

If the key for your logical data model has more than 1 property, you should default to (multiple partitions, multiple rows for each partition).

If the key for your logical data model has only one property, you would default to (multiple partitions, one row per partition).

We have two columns in the key to separate what defines uniqueness(PartitionKey and RowKey) from what defines scalability(just PartitionKey).
In general, write and query times are less affected by how the table is partitioned. It is affected more by whether you specify the PartitionKey and/or RowKey in the query.

So that answers the question and validates the conclusions we eventually arrived at. It seems we should always use the partition key as the primary key and may optionally want to use a row key as a secondary key, if needed.

In that case, the fact that items with different partition keys may or may not be stored on the same machine seems to be an implementation detail that shouldn't matter to developers since there is nothing they can do about it anyway. Right?

Note Now Playing: Scarface - Hand of the Dead Body Note

Categories: Web Development

January 6, 2009

@ 02:10 PM

Comments [2]

Will the Online Identity War turn out like the XML Syndication War?

I've been spending some time thinking about the ramifications of centralized identity plays coming back into vogue with the release of Facebook Connect, MySpaceID and Google's weird amalgam Google Friend Connect. Slowly I began to draw parallels between the current situation and a different online technology battle from half a decade ago.

About five years ago, one of the most contentious issues among Web geeks was the RSS versus Atom debate. On the one hand there was RSS 2.0, a widely deployed and fairly straightforward XML syndication format which had some ambiguity around the spec but whose benevolent dictator had declared the spec frozen to stabilize the ecosystem around the technology. On the other hand you had the Atom syndication format, an up and coming XML syndication format backed by a big company (Google) and a number of big names in the Web standards world (Tim Bray, Sam Ruby, Mark Pilgrim, etc) which intended to do XML syndication the right way and address some of the flaws of RSS 2.0.

During that time I was an RSS 2.0 backer even though I spent enough time on the atom-syntax mailing list to be named as a contributor on the final RFC. My reasons for supporting RSS 2.0 are captured in my five year old blog post The ATOM API vs. the ATOM Syndication Format which contained the following excerpt

Based on my experiences working with syndication software as a hobbyist developer for the past year is that the ATOM syndication format does not offer much (if anything) over RSS 2.0 but that the ATOM API looks to be a significant step forward compared to previous attempts at weblog editing/management APIs especially with regard to extensibility, support for modern practices around service oriented architecture, and security.
...
Regardless of what ends up happening, the ATOM API is best poised to be the future of weblog editting APIs. The ATOM syndication format on the other hand...

My perspective was that the Atom syndication format was a burden on consumers of feeds since it meant they had to add yet another XML syndication format to the list of formats they supported; RSS 0.91, RSS 1.0, RSS 2.0 and now Atom. However the Atom Publishing Protocol (AtomPub) was clearly an improvement to the state of the art at the time and was a welcome addition to the blog software ecosystem. It would have been the best of both worlds if AtomPub simply used RSS 2.0 so we got the benefits with none of the pain of duplicate syndication formats.

As time has passed, it looks like I was both right and wrong about how things would turn out. The Atom Publishing Protocol has been more successful than I could have ever imagined. It not only became a key blog editing API but evolved to become a key technology for accessing data from cloud based sources that has been embraced by big software companies like Google (GData) and Microsoft (ADO.NET Data Services, Live Framework, etc). This is where I was right.

I was wrong about how much of a burden having XML syndication formats would be on developers and end users. Although it is unfortunate that every consumer of XML feed formats has to write code to process both RSS and Atom feeds, this has not been a big deal. For one, this code has quickly been abstracted out into libraries on the majority of popular platforms so only a few developers have had to deal with it. Similarly end users also haven't had to deal with this fragmentation that much. At first some sites did put out feeds in multiple formats which just ended up confusing users but that is mostly a thing of the past. Today most end users interacting with feeds have no reason to know about the distinction between Atom or RSS since for the most part there is none when you are consuming the feed from Google Reader, RSS Bandit or your favorite RSS reader.

I was reminded by this turn of events when reading John McCrea's post As Online Identity War Breaks Out, JanRain Becomes “Switzerland” where he wrote

Until now, JanRain has been a pureplay OpenID solution provider, hoping to build a business just on OpenID, the promising open standard for single sign-on. But the company has now added Facebook as a proprietary login choice amidst the various OpenID options on RPX, a move that shifts them into a more neutral stance, straddling the Facebook and “Open Stack” camps. In my view, that puts JanRain in the interesting and enviable position of being the “Switzerland” of the emerging online identity wars.

For site operators, RPX offers an economical way to integrate the non-core function of “login via third-party identity providers” at a time when the choices in that space are growing and evolving rapidly. So, rather than direct its own technical resources to integrating Facebook Connect and the various OpenID implementations from MySpace, Google, Yahoo, AOL, Microsoft, along with plain vanilla OpenID, a site operator can simply outsource all of those headaches to JanRain.

Just as standard libraries like the Universal Feed Parser and the Windows RSS platform insulated developers from the RSS vs. Atom formatwar, JanRain's RPX makes it so that individual developers don't have to worry about the differences between supporting proprietary technologies like Facebook Connect or Open Stack™ technologies like OpenID.

At the end of the day, it is quite likely that the underlying technologies will not matter to matter to anyone but a handful of Web standards geeks and library developers. Instead what is important is for sites to participate in the growing identity provider ecosystem not what technology they use to do so.

Note Now Playing: Yung Wun featuring DMX, Lil' Flip & David Banner - Tear It Up Note

Categories: Web Development

Dare Obasanjo's weblog

"You can buy cars but you can't buy respect in the hood" - Curtis Jackson

Navigation for Friday, 16 January 2009 - Dare Obasanjo's weblog

Why Shard or Partition your Database?

How Sharding Changes your Application

A Look at a Some Common Sharding Schemes

Problems Common to all Sharding Schemes

PartitionKey Property

RowKey Property