Over the weekend, Torsten and I shipped a new release of RSS Bandit. Besides bug fixes, there is one key new feature in the release: the ability to view and comment on your Facebook news feed. The flow for adding Facebook to the application is as follows. Go to the File->Synchronize Feeds menu option, then select "Facebook"

[Screenshot: the Synchronize Feeds dialog with Facebook selected]

then go through the Facebook Connect authorization flow including optionally signing in and granting the application permission to view your news feed

[Screenshot: the Facebook Connect sign-in and permission dialog]

This creates a new feed source containing your Facebook news feed complete with inline comments as shown below

[Screenshot: the Facebook news feed in RSS Bandit with inline comments]

You can download the new release from here. More details about the bug fixes in the release are in the official RSS Bandit blog post on the release.


The primary purpose of this release (codenamed Colossus) was to bring stability to the code base before we made radical changes. With this release out of the way, we will start working on the Gambit release right away. The purpose of the next release is primarily to make RSS Bandit a more modern application that looks like it belongs on Windows 7 and Windows Vista instead of harkening back to the Office 2003 look of yesteryear.

You can see our plans for updating the RSS Bandit user interface in my blog post and prototype screenshots of the RSS Bandit ribbon. We will also support new features of Windows 7 such as jump lists. As usual, comments and feedback are welcome.

Now Playing: Dashboard Confessional - Stolen


 

Categories: RSS Bandit

Last week Joel Spolsky wrote a blog post entitled The Duct Tape Programmer where he praises developers who favor simple programming practices over complex ones. This blog post strongly resonated with me and made me recall some related thoughts on complexity and solving problems in software projects. Some key excerpts from his post, which I'll use as a jumping off point, are below

Jamie Zawinski is what I would call a duct-tape programmer. And I say that with a great deal of respect. He is the kind of programmer who is hard at work building the future, and making useful things so that people can do stuff.
...
Duct tape programmers are pragmatic. Zawinski popularized Richard Gabriel’s precept of Worse is Better. A 50%-good solution that people actually have solves more problems and survives longer than a 99% solution that nobody has because it’s in your lab where you’re endlessly polishing the damn thing. Shipping is a feature. A really important feature. Your product must have it.

One principle duct tape programmers understand well is that any kind of coding technique that’s even slightly complicated is going to doom your project. Duct tape programmers tend to avoid C++, templates, multiple inheritance, multithreading, COM, CORBA, and a host of other technologies that are all totally reasonable, when you think long and hard about them, but are, honestly, just a little bit too hard for the human brain.

The urge to reduce the complexity of the tools used to solve software problems is one that every software developer should share. However, even more important is reducing the complexity of the actual solutions that are delivered to your customers at the end of the day. End users can't tell if you used complicated C++ techniques like template metaprogramming and mixins to build the application. They can tell when your application fails to solve their actual problems in a straightforward way or is so late to ship due to project delays that they lose interest in waiting for you to solve their problems.

There are many famous and everyday examples of this culture of complexity in software projects which are eventually trumped by solutions that solve 80% of the problem in a simple way. My favorite example is contrasting the World Wide Web invented by Tim Berners-Lee with Project Xanadu as envisioned by Ted Nelson. Today the WWW is used by over a billion people to enrich their lives in myriad ways on a daily basis and has created hundreds of billions of dollars in value by minting an entirely new industry. Project Xanadu is a sad footnote spoken about in hushed tones by fans of hypertext who bewail the success of the Web and how it has forced us to settle for less (i.e. Worse Is Better).

If you aren't familiar with Project Xanadu, you can think of it as a networked system of hyperlinked documents and media, just like the WWW, except that it had to satisfy the following seventeen rules:

    1. Every Xanadu server is uniquely and securely identified.
    2. Every Xanadu server can be operated independently or in a network.
    3. Every user is uniquely and securely identified.
    4. Every user can search, retrieve, create and store documents.
    5. Every document can consist of any number of parts each of which may be of any data type.
    6. Every document can contain links of any type including virtual copies ("transclusions") to any other document in the system accessible to its owner.
    7. Links are visible and can be followed from all endpoints.
    8. Permission to link to a document is explicitly granted by the act of publication.
    9. Every document can contain a royalty mechanism at any desired degree of granularity to ensure payment on any portion accessed, including virtual copies ("transclusions") of all or part of the document.
    10. Every document is uniquely and securely identified.
    11. Every document can have secure access controls.
    12. Every document can be rapidly searched, stored and retrieved without user knowledge of where it is physically stored.
    13. Every document is automatically moved to physical storage appropriate to its frequency of access from any given location.
    14. Every document is automatically stored redundantly to maintain availability even in case of a disaster.
    15. Every Xanadu service provider can charge their users at any rate they choose for the storage, retrieval and publishing of documents.
    16. Every transaction is secure and auditable only by the parties to that transaction.
    17. The Xanadu client-server communication protocol is an openly published standard. Third-party software development and integration is encouraged.

Reading this list is like going through a list of places where the World Wide Web fails. Rule #14, which implies that every document on the network is redundantly backed up in disparate locations so it can always be retrieved, is something the WWW doesn't do today, which is why we have broken links and 404s all the time. Rule #9 implies that not only is copyright respected and tracked throughout the system but that there is even a micropayment platform built in. All the discussions on micropayments saving newspapers would be moot if Project Xanadu ruled the world since such a platform would have existed from day one. Rule #16 on transactions being secure and auditable sounds like Nirvana in today's world of botnets, malware and phishing scams which plague the Web.

Yet despite the fact that the forty year old Project Xanadu is a more compelling vision than where we are today, it failed and Tim Berners-Lee's World Wide Web succeeded. In practical terms, Project Xanadu was trying to solve too many complex problems in a v1 product. In contrast, Tim Berners-Lee focused on the most valuable problem to solve for end users, which was sharing documents and media with anyone on the Internet, and punted on a bunch of the hard problems that would have required a more controlled and tightly coupled network as well as a ton more code. Tim Berners-Lee solved less than half the problems Project Xanadu set out to solve but has changed the world immeasurably for billions of people by providing simple solutions to complex problems and running away from trying to create complex solutions to complex problems.

The bottom line is that a lot of the time it's OK to create a solution that solves 80% of the problem. Always remember that shipping is a feature.

Now Playing: Drake, Kanye West, Lil Wayne & Eminem - Forever


 

Categories: Programming | Ramblings

Every week there seems to be some new A-list blogger criticizing Twitter's Suggested Users List, which is a selection of celebrities and brands that are suggested to new Twitter users as people they might like to follow. This week it's Robert Scoble with You’re not on Twitter’s suggested user list but you are in good company, which points out a number of interesting celebrities and brands that aren't on the list. Last week Dave Winer asked The SUL as a tool to control news?

I've had my issues with the SUL mainly from the perspective of how it ends up presenting Twitter to new users. When my wife joined Twitter I'd have loved it if the service had used integration with Facebook, Windows Live, MySpace, etc. to suggest people she already knew who were on Twitter. Instead the service prioritized pitching that she follow Shaquille O'Neal, Dell Outlet stores, NBC's Today Show and Jessica Simpson's kid sister. To find me on Twitter, my wife had to ask me for my Twitter handle in person. I felt like we were back in the dark ages of social networking.

In retrospect, not doing what I preferred shows a lot of insight. It prevents the site from being viewed as yet another service with a duplicated social graph, one that has to compete head to head with the Facebooks and MySpaces of the world. Instead it pitches Twitter as a sort of user friendly RSS reader where you connect with your favorite celebrities and brands instead of another place where you get status updates from people you're already getting status updates from on Facebook.

Brilliant.

Now Playing: Jay Sean - Down (feat. Lil Wayne)


 

Categories: Social Software

Database normalization is a technique for designing relational database schemas that ensures that the data is optimal for ad-hoc querying and that modifications such as deletion or insertion of data do not lead to data inconsistency. Database denormalization is the process of optimizing your database for reads by creating redundant data. A consequence of denormalization is that insertions or deletions could cause data inconsistency if not uniformly applied to all redundant copies of the data within the database.

Why Denormalize Your Database?

Today, lots of Web applications have "social" features. A consequence of this is that whenever I look at content or a user on such a service, there is always additional content from other users that also needs to be pulled in to the page. When you visit a typical profile on a social network like Facebook or MySpace, data for all the people that are friends with that user needs to be pulled in. Or when you visit a shared bookmark on del.icio.us you need data for all the users who have tagged and bookmarked that URL as well. Performing a query across the entire user base for "all the users who are friends with Robert Scoble" or "all the users who have bookmarked this blog link" is expensive even with caching. It is orders of magnitude faster to return the data if it is precalculated and all written to the same place.

This optimizes your reads at the cost of incurring more writes to the system. It also means that you'll end up with redundant data, because there will be multiple copies of some amount of user data as you try to ensure the locality of data.
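To make the trade off concrete, below is a minimal sketch in Python (in-memory dicts standing in for real tables, all names hypothetical) contrasting a normalized read that must join across the friendships relation with a denormalized read that is a single key lookup:

# Minimal sketch of normalized vs. denormalized reads. In-memory
# dicts stand in for database tables; all names are hypothetical.

# Normalized: each friendship is stored exactly once, as a pair.
friendships = [("scoble", "dare"), ("scoble", "joel"), ("dare", "joel")]
users = {"dare": {"name": "Dare"},
         "joel": {"name": "Joel"},
         "scoble": {"name": "Robert"}}

def friends_normalized(user_id):
    # The equivalent of a JOIN: touches the entire friendships relation.
    return [users[b] for (a, b) in friendships if a == user_id]

# Denormalized: each user record carries redundant copies of friend
# data, so rendering a profile page is a single key lookup.
users_denormalized = {
    "scoble": {"name": "Robert",
               "friends": [{"name": "Dare"}, {"name": "Joel"}]},
}

def friends_denormalized(user_id):
    return users_denormalized[user_id]["friends"]

print(friends_normalized("scoble"))    # join at read time
print(friends_denormalized("scoble"))  # one read, redundant storage

The denormalized version answers the profile-page question in one read, but every change to a friendship or a user name now has to update redundant copies, which is exactly the cost discussed below.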

A good example of a Web application deciding to make this trade off is the recent post on the Digg Blog entitled Looking to the Future with Cassandra which contains the following excerpt

The Problem

In both models, we’re computing the intersection of two sets:

  1. Users who dugg an item.
  2. Users that have befriended the digger.

The Relational Model

The schema for this information in MySQL is:

CREATE TABLE `Diggs` (
  `id`      INT(11),
  `itemid`  INT(11),
  `userid`  INT(11),
  `digdate` DATETIME,
  PRIMARY KEY (`id`),
  KEY `user`  (`userid`),
  KEY `item`  (`itemid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `Friends` (
  `id`           INT(10) AUTO_INCREMENT,
  `userid`       INT(10),
  `username`     VARCHAR(15),
  `friendid`     INT(10),
  `friendname`   VARCHAR(15),
  `mutual`       TINYINT(1),
  `date_created` DATETIME,
  PRIMARY KEY                (`id`),
  UNIQUE KEY `Friend_unique` (`userid`,`friendid`),
  KEY        `Friend_friend` (`friendid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

The Friends table contains many million rows, while Diggs holds hundreds of millions. Computing the intersection with a JOIN is much too slow in MySQL, so we have to do it in PHP. The steps are:

  1. Query Friends for all my friends. With a cold cache, this takes around 1.5 seconds to complete.
  2. Query Diggs for any diggs of a specific item by a user in the set of friend user IDs. This query is enormous, and looks something like:
    SELECT `digdate`, `id` FROM `Diggs`
     WHERE `userid` IN (59, 9006, 15989, 16045, 29183,
                        30220, 62511, 75212, 79006)
       AND itemid = 13084479 ORDER BY `digdate` DESC, `id` DESC LIMIT 4;

    The real query is actually much worse than this, since the IN clause contains every friend of the user, and this can balloon to hundreds of user IDs. A full query can actually clock in at 1.5kb, which is many times larger than the actual data we want. With a cold cache, this query can take 14 seconds to execute.

Of course, both queries are cached, but due to the user-specific nature of this data, it doesn’t help much.

The solution the Digg development team went with was to denormalize the data. They then went a step further and decided that since the data was no longer being kept in a relational manner there was no point in using a traditional relational database (i.e. MySQL), so they migrated to a non-RDBMS technology to solve this problem.

 

How Denormalization Changes Your Application

There are a number of things to keep in mind once you choose to denormalize your data, including:

  1. Denormalization means data redundancy which translates to significantly increased storage costs. The fully denormalized data set from the Digg example ended up being 3 terabytes of information. It is typical for developers to underestimate the data bloat that occurs once data is denormalized.

  2. Fixing data inconsistency is now the job of the application. Let's say each user has a list of the user names of all of their friends. What happens when one of these users changes their user name? In a normalized database that is a simple UPDATE query to change a single piece of data and then it will be current everywhere it is shown on the site. In a denormalized database, there now has to be a mechanism for fixing up this name in all of the dozens, hundreds or thousands of places it appears. Most services that create denormalized databases have "fixup" jobs that are constantly running on the database to fix such inconsistencies.
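For illustration, here is a minimal Python sketch of such a fixup (hypothetical names; in-memory data stands in for tables like Digg's Friends table, which redundantly stores friendname alongside friendid):

users = {42: {"username": "dare_old"}}

# Denormalized friend lists: each entry redundantly stores a copy of
# the friend's username next to their id.
friend_lists = {
    7:  [{"friendid": 42, "friendname": "dare_old"}],
    13: [{"friendid": 42, "friendname": "dare_old"}],
}

def rename_user(userid, new_name):
    # The authoritative copy is a single write...
    users[userid]["username"] = new_name
    # ...but a fixup pass must repair every redundant copy.
    for entries in friend_lists.values():
        for entry in entries:
            if entry["friendid"] == userid:
                entry["friendname"] = new_name

rename_user(42, "dare")
assert all(entry["friendname"] == "dare"
           for entries in friend_lists.values() for entry in entries)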

The No-SQL Movement vs. Abusing Relational Databases for Fun & Profit

If you’re a web developer interested in building large scale applications, it doesn’t take long reading the various best practices on getting Web applications to scale, such as practicing database sharding or eschewing transactions, before it begins to sound like all the advice you are getting is about ignoring or abusing the key features that define a modern relational database system. Taken to its logical extreme, all you really need is a key<->value or tuple store that supports some level of query functionality and has decent persistence semantics. Thus the NoSQL movement was born.

The NoSQL movement is a term used to describe the increasing usage of non-relational databases among Web developers. This approach was initially pioneered by large scale Web companies like Facebook (Cassandra), Amazon (Dynamo) & Google (BigTable) but is now finding its way down to smaller sites like Digg. Unlike relational databases, there is yet to be a solid technical definition of what it means for a product to be a "NoSQL" database aside from the fact that it isn't a relational database. Commonalities include lack of fixed schemas and limited support for rich querying. Below is a list of some of the more popular NoSQL databases that you can try today along with a brief description of their key qualities:

  1. CouchDB: A document-oriented database where documents can be thought of as JSON/JavaScript objects. Creation, retrieval, update and deletion (CRUD) operations are performed via a RESTful API and support ACID properties. Rich querying is handled by creating Javascript functions called "Views" which operate on the documents in the database via Map/Reduce style queries. Usage: Although popular among the geek set, most users seem to be dabblers as opposed to large scale web companies.

  2. Cassandra: A key-value store where each key-value pair comes with a timestamp and can be grouped together into a column family (i.e. a table). There is also a notion of super columns, which are columns whose values are a list of other key-value pairs. Cassandra is optimized to be always writable and uses eventual consistency to deal with the conflicts that inevitably occur when a distributed system aims to be always writable yet node failure is a fact of life (a minimal sketch of this reconciliation model follows this list). Querying is available via the Cassandra Thrift API and supports fairly basic data retrieval operations based on key values and column names. Usage: Originally developed and still used at Facebook today. Digg and Rackspace are the most recent big name adopters.

  3. Voldemort: Very similar to Cassandra, which is unsurprising since they are both inspired by Amazon's Dynamo. Voldemort is a key-value store where each key-value pair comes with a timestamp and eventual consistency is used to address write anomalies. Values can contain a list of further key-value pairs. Data access involves creation, retrieval and deletion of serialized objects whose format can be one of JSON, strings, binary BLOBs, serialized Java objects and Google Protocol Buffers. Rich querying is non-existent; simple get and put operations are all that exist. Usage: Originally developed and still used at LinkedIn.
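To make the reconciliation model concrete, here is a minimal, hypothetical Python sketch of the timestamped key-value model both databases share, using last-write-wins merging (real systems use vector clocks and richer conflict resolution; this only illustrates the shape of the idea):

import itertools

_clock = itertools.count()  # stand-in for real timestamps

class Replica:
    """Each replica accepts writes independently ("always writable")."""

    def __init__(self):
        self.store = {}  # key -> (timestamp, value)

    def put(self, key, value):
        self.store[key] = (next(_clock), value)

    def get(self, key):
        return self.store.get(key)

def reconcile(target, source):
    # Last-write-wins merge: for each key, keep the newest timestamp.
    for key, (ts, value) in source.store.items():
        current = target.store.get(key)
        if current is None or ts > current[0]:
            target.store[key] = (ts, value)

r1, r2 = Replica(), Replica()
r1.put("user:42:status", "at work")
r2.put("user:42:status", "at lunch")  # conflicting write on another node
reconcile(r1, r2)                     # r1 converges to the newer value
print(r1.get("user:42:status"))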

There are a number of other interesting NoSQL databases such as HBase, MongoDB and Dynomite but the three above seem to be the most mature from my initial analysis. In general, most of them seem to be a clone of BigTable, Dynamo or some amalgam of ideas from both papers. The most original so far has been CouchDB.

An alternative to betting on speculative database technologies at varying levels of maturity is to misuse an existing mature relational database product. As mentioned earlier, many large scale sites use relational databases but eschew relational features such as transactions and joins to achieve scalability. Some developers have even taken that practice to an extreme and built schema-less data models on top of a traditional relational database. A great example of this is How FriendFeed uses MySQL to store schema-less data, a blog post excerpted below

Lots of projects exist designed to tackle the problem storing data with flexible schemas and building new indexes on the fly (e.g., CouchDB). However, none of them seemed widely-used enough by large sites to inspire confidence. In the tests we read about and ran ourselves, none of the projects were stable or battle-tested enough for our needs (see this somewhat outdated article on CouchDB, for example). MySQL works. It doesn't corrupt data. Replication works. We understand its limitations already. We like MySQL for storage, just not RDBMS usage patterns.

After some deliberation, we decided to implement a "schema-less" storage system on top of MySQL rather than use a completely new storage system.

Our datastore stores schema-less bags of properties (e.g., JSON objects or Python dictionaries). The only required property of stored entities is id, a 16-byte UUID. The rest of the entity is opaque as far as the datastore is concerned. We can change the "schema" simply by storing new properties.

In MySQL, our entities are stored in a table that looks like this:

CREATE TABLE entities (
    added_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    id BINARY(16) NOT NULL,
    updated TIMESTAMP NOT NULL,
    body MEDIUMBLOB,
    UNIQUE KEY (id),
    KEY (updated)
) ENGINE=InnoDB;

The added_id column is present because InnoDB stores data rows physically in primary key order. The AUTO_INCREMENT primary key ensures new entities are written sequentially on disk after old entities, which helps for both read and write locality (new entities tend to be read more frequently than old entities since FriendFeed pages are ordered reverse-chronologically). Entity bodies are stored as zlib-compressed, pickled Python dictionaries.
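To see how little code this pattern needs, here is a minimal sketch of that write/read path in Python (hypothetical code; SQLite stands in for MySQL/InnoDB so the example is self-contained):

import pickle, sqlite3, uuid, zlib

# Schema-less storage in a relational table: the entity is an opaque,
# zlib-compressed, pickled dictionary keyed by a 16-byte UUID.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE entities (
    added_id INTEGER PRIMARY KEY AUTOINCREMENT,
    id BLOB NOT NULL UNIQUE,
    body BLOB)""")

def put_entity(entity):
    entity_id = uuid.uuid4().bytes  # 16-byte UUID
    body = zlib.compress(pickle.dumps(entity))
    db.execute("INSERT INTO entities (id, body) VALUES (?, ?)",
               (entity_id, body))
    return entity_id

def get_entity(entity_id):
    row = db.execute("SELECT body FROM entities WHERE id = ?",
                     (entity_id,)).fetchone()
    return pickle.loads(zlib.decompress(row[0]))

# Changing the "schema" is just storing new properties.
eid = put_entity({"user": "bret", "link": "http://friendfeed.com"})
print(get_entity(eid))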

Now that the FriendFeed team works at Facebook, I suspect they'll end up deciding that a NoSQL database with a good story around replication and fault tolerance is more amenable to solving the problem of building a schema-less database than storing key<->value pairs in a SQL database where the value is a serialized Python object.

As a Web developer it's always a good idea to know what the current practices are in the industry even if they seem a bit too crazy to adopt…yet.


Now Playing: Jay-Z - Run This Town (feat. Rihanna & Kanye West)


     

    Categories: Web Development

    August 26, 2009
    @ 05:44 PM

    [Chart: Facebook unique users, 2007 - 2009]

    [Chart: Twitter unique users, 2007 - 2009]

    [Chart: FriendFeed unique users, 2007 - 2009]

    With the sale of FriendFeed to Facebook for $50 million, there doesn’t seem to be much harm in talking about why FriendFeed failed to take off with mainstream audiences despite lots of hype from all of the usual corners. A good starting place is the recent blog post by Robert Scoble entitled Where’s the gang of 2,000 who controls tech hype hanging out today? where he wrote

    You see, there’s a gang of about 2,000 people who really control tech industry hype and play a major role in deciding which services get mainstream hype (this gang was all on Twitter by early 2007 — long before Oprah and Ashton and all the other mainstream celebrities, brands, and journalists showed up). I have not seen any startup succeed without getting most of these folks involved. Yes, Mike Arrington of TechCrunch is the parade leader, but he hardly controls this list. Dave Winer proved that by launching Bit.ly by showing it first to Marshall Kirkpatrick and Bit.ly raced through this list.

    By the way, having this list use your service does NOT guarantee market success. This list has all added me on Dopplr, for instance, but Dopplr has NOT broken out of this small, geeky crowd. Studying why not is something we should do.

    For the past few years, I’ve been watching services that were once the domain of geeks like Robert Scoble’s inner circle eventually get adopted by mainstream users like my wife. In general, the pattern has always seemed to boil down to some combination of network effects (i.e. who do I know that is using this service?) and value proposition to the typical end user. Where a lot of services fall down is that although their value is obvious and instantly apparent to the typical Web geek, that same value is hidden or even non-existent to non-geeks. I tried the exercise of listing some of the services I’ve used that eventually got used by my wife and writing down the one or two sentence description of how I’d have explained the value proposition to her

    • Facebook – an online rolodex of your friends, family & coworkers that lets you stay connected to what they’re up to. Also has some cool time wasting games and quizzes if your friends are boring that day.
    • Twitter – stay connected to the people you find interesting but wouldn’t or couldn’t “friend” on Facebook (e.g. celebrities like Oprah & Ashton Kutcher or amusing sources like Sh*t My Dad Says). Also has a cool trending topics feature so you can see what people are talking about if your friends are boring that day.
    • Blogger – an online diary where you can share stories and pictures from your life with friends and family. Also a place where you can find stories and opinions from people like you when you’re boring and have nothing to write that day (Note: Blogger doesn’t actually make it easy to find blogs you might find interesting).
    • Google Reader – a way to track the blogs you read regularly once your list of blog bookmarks gets unwieldy. Also solves the problem of finding blogs you might like based on your current reading list. 

    These are four sites or technologies that I’ve used that my wife now uses ordered by how much she still uses them today. All four sites are somewhat mainstream although they may differ in popularity by an order of magnitude in some cases. Let’s compare these descriptions to those of two sites that haven’t yet broken into the mainstream but my geek friends love

    • FriendFeed – republish all of the content from the different social networking media websites you use onto this site. Also one place to stay connected to what people are saying on multiple social media sites instead of friending them on multiple sites.
    • Dopplr – social network for people who travel a lot and preferably have friends who either travel a lot or are spread out across multiple cities/countries.

    Why Dopplr isn’t mainstream should be self evident. If you’re a conference hopping geek who bounds from SXSW to MIX in the spring or the Web 2.0 summit to Le Web in the fall like Robert Scoble then a site like Dopplr makes sense especially since you likely have a bunch of friends from the conference circuit. On the other hand, if you’re the typical person who either only travels on vacation or occasionally for business then the appeal of Dopplr is lost on you.

    Similarly, FriendFeed’s value proposition is that it is a social network for people who are on too many social networks. But even that really didn’t turn out to be how it went, since Twitter ended up being the dominant social network on the site and so FriendFeed was primarily a place to have conversations about what people were saying on Twitter. Thus there were really two problems with FriendFeed at the end of the day. First, the appeal of the service isn’t really broad (e.g. joining a 3rd social network because she has overlapping friends on Twitter & Facebook would be exacerbating the problem for my wife, not solving it). Secondly, although the site ended up being primarily used as a Twitter app/conversation hub, its owners didn’t really focus on this aspect of the service, which would likely have been an avenue for significant growth. To see what I mean, look at the graphs of unique users for sites that acted as adjuncts to Twitter versus FriendFeed, which chose not to

    There are definitely lessons to learn here for developers who are trying to figure out how to cross the chasm from enthusiastic praise from the Robert Scobles of the world to being used by regular non-geeks in their daily lives.


     

    Categories: Social Software

    Sam Diaz over at ZDNet wrote the following in a blog entry titled RSS: A good idea at the time but there are better ways now in response to an announcement of a new feature in Google Reader

    Once a big advocate for Google Reader, I have to admit that I haven’t logged in in weeks, maybe months. That’s not to say I’m not reading. Sometimes I feel like reading - and writing this blog - are the only things I do. But my sources for reading material are scattered across the Web, not in one aggregated spot.

    I catch headlines on Yahoo News and Google News. I have a pretty extensive lineup of browser bookmarks to take me to sites that I scan throughout the day. Techmeme is always in one of my browser tabs so I can keep a pulse on what others in my industry are talking about. And then there are Twitter and Facebook. I actually pick up a lot of interesting reading material from people I’m following on Twitter and some friends on Facebook, with some of it becoming fodder for blog posts here.

     

    The truth of the matter is that RSS readers are a Web 1.0 tool, an aggregator of news headlines that never really caught on with the mainstream the way Twitter and Facebook have.

    I take issue with the title of Sam’s post since his complaint is really about the current generation of consumer tools for reading RSS feeds, not the underlying technology itself. In general, I agree with Sam that the current generation of RSS readers has failed users and I now use pretty much the same tools that he does to catch up on blogs (i.e. Twitter & Techmeme). I’ve listed some of my gripes with RSS readers, including the one I wrote (RSS Bandit), in the past and will reiterate some of these points below

    1. Dave Winer was right about River of News style aggregators. A user interface where I see a stream of news and can click on the bits that interest me without doing a lot of management is superior to the current dominant RSS reader paradigm where I need to click on multiple folders, manage read/unread state and wade through massive walls of text I don’t want to read to get to the gems.

    2. Today’s RSS readers are a one-way tool instead of a two-way tool. One of the things I like about shared links in Twitter & Facebook is that I can start or read a conversation about the story and otherwise give feedback (i.e. “like” or retweet) to the publisher of the news as part of the experience. This is where I think Sam’s comment that these are “Web 1.0” tools rings the truest. Google Reader recently added a “like” feature but it is broken in that the information about who liked one of my posts never gets back to me whereas it does when I share this post on Twitter or Facebook.

    3. As Dave McClure once ranted, it's all about the faces. The user interface of RSS readers is sterile and impersonal compared to social sites like Twitter and Facebook because of the lack of pictures/faces of the people whose words you are reading. It always makes a difference to me when I read a blog and there is a picture of the author and the same goes for just browsing a Twitter account.

    4. No good ways to separate the wheat from the chaff. As if it isn’t bad enough that you are nagged about having thousands of unread blog posts when you don’t visit your RSS reader for a few days, there isn’t a good way to get an overview of what is most interesting/pressing and then move on by marking everything as read. On the other hand, when I go to Techmeme I can always see what the current top stories are and can even go back to see what was popular on the days I didn’t visit the site. 

    5. The process of adding feeds still takes too many steps. If I see your Twitter profile and think you’re worth following, I click the “follow” button and I’m done. On the other hand, if I visit your blog there’s a multi-step process involved to adding you to my subscriptions even if I use a web-based RSS aggregator like Google Reader.

    These are the five biggest bugs in the traditional RSS reading experience today that I hope eventually get fixed, since they are holding back the benefits people can get from reading blogs and/or other activity streams using the open & standard infrastructure of the Web.


     

    Voting starts today for the various panel proposals for the 2010 SXSW Interactive conference. After learning a lot from participating in panels at this year’s conference, I’ve submitted two proposals for panel discussions for next year’s conference. Below are their descriptions and links to each panel presentation for voting

    Social Network Interop
    Portable contacts, life streaming and various ‘Connect’ offerings have begun to break down the silos and walled gardens that are social networks. Come hear a panel of experts discuss some of the technologies, design issues and future direction of this trend.

    Drinking from the activity stream when it becomes a tidal wave
    The stream is overflowing. How do you make sure the stream is still useful when there is SO MUCH getting pushed into it?

    If you click through the links you’ll find a list of the seven to nine questions that will be asked and answered by the panelists. The trickiest part of this process was trying to come up with proposals six months ahead of the conference. A lot changes in six months and it was a little difficult trying to come up with panel topics that wouldn’t seem like rehashing old news by the time 2010 rolls around. At least the panel ideas aren’t as topical as discussing Facebook’s purchase of Friendfeed. :) 

    Let me know what you think of the panel ideas and who you think should be on the panels if they get accepted.


     

    Categories: Social Software

    Brad Fitzpatrick has been dropping some interesting mind bombs since starting at Google. First it was the Social Graph API, recently followed by PubSubHubbub (which I need to write about one of these days) and most recently the WebFinger protocol. The underlying theme in all of these ideas is creating an open infrastructure for simplifying the tasks that are common to social networking media sites and thus improving the user experience.

    The core idea behind WebFinger is excerpted below from the project site

    If I give you my email address today, you can't do anything with it except email me. I can't attach public metadata to my email address to give you more information. WebFinger is about making email addresses more valuable, by letting people attach public metadata to them. That metadata might include:

    • public profile data
    • pointer to identity provider (e.g. OpenID server)
    • a public key
    • other services used by that email address (e.g. Flickr, Picasa, Smugmug, Twitter, Facebook, and usernames for each)
    • a URL to an avatar
    • profile data (nickname, full name, etc)
    • whether the email address is also a JID, or explicitly declare that it's NOT an email, and ONLY a JID, or any combination to disambiguate all the addresses that look like something@somewhere.com
    • or even a public declaration that the email address doesn't have public metadata, but has a pointer to an endpoint that, provided authentication, will tell you some protected metadata, depending on who you authenticate as.

    ... but rather than fight about the exact contents

    The way this is written makes it sound like this would be a useful service for end users but I think that is misleading. If you want to find out about someone, you’re best off plugging their name into a search decision engine like Bing or the people search of a site like Facebook, which should give you a similar or better experience today without deploying any new infrastructure on the Web.

    Where I find WebFinger to be interesting is in simplifying a lot of the common workflows that exist on the Social Web today. For example, I’ve often criticized Twitter for using the hand picked Suggested Users List as the primary way of suggesting who you should follow instead of your social graph from a social networking site like Facebook or MySpace. However when you look at their Find People on Other Networks page it is clear that this would end up being an intimidating user experience if they listed all of the potential sources of social graphs on that page (i.e. IM services, email address books, social networking sites, etc) and then asked the user to pick which ones they use.

    On the other hand, if there were a way for Twitter to know which sites I belong to just from the email address I used to sign up, then a much smoother user experience is possible.
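    For example, a signup flow could do a lookup along the lines of the following Python sketch (hypothetical code; it assumes a provider exposing a /.well-known/webfinger endpoint that serves JSON resource descriptors, which is the shape the protocol eventually standardized on):

    import json, urllib.parse, urllib.request

    def webfinger_lookup(email):
        """Fetch the public metadata attached to an email-style identifier.

        Minimal sketch: query the domain's well-known WebFinger endpoint
        and return the parsed resource descriptor, a JSON document that
        lists profile URLs, avatars, linked services and so on.
        """
        domain = email.split("@", 1)[1]
        query = urllib.parse.urlencode({"resource": "acct:" + email})
        url = "https://%s/.well-known/webfinger?%s" % (domain, query)
        with urllib.request.urlopen(url) as response:
            return json.load(response)

    # A signup flow could then suggest friends based on the services
    # linked to the address, e.g.:
    #   descriptor = webfinger_lookup("alice@example.com")
    #   for link in descriptor.get("links", []):
    #       print(link.get("rel"), link.get("href"))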

    This is a fairly boring and mundane piece of Social Web plumbing when you think about it but the ramifications if it takes off could be very powerful. Imagine what direction Twitter would have taken if it used your real social graph to suggest friends to you instead of the S.U.L. as one example. 


     

    Categories: Social Software

    June 4, 2009
    @ 04:11 PM

    I initially planned to write up some detailed thoughts on the Google Wave video and the Google Wave Federation protocol. However, between the fact that literally millions of people have watched the video [according to YouTube] and the number of private conversations I’ve had with others that have influenced my thinking, I’d rather not post something that makes it seem like I’m taking credit for the ideas of others. That said, I thought it would still be useful to share some of the most insightful commentary I’ve seen on Google Wave from various developer blogs.

    Sam Ruby writes in his post Google Wave 

    At one level, Google Wave is clearly a bold statement that “this is the type of application that every browser should be able to run natively without needing to resort to a plugin”.  And to give Google credit, they have been working relentlessly towards that vision, addressing everything from garbage collection issues, to enabling drag and drop of photos, to providing compelling content (e.g., Google Maps, GMail, and now Google Wave).

    But stepping back a bit, the entire and much hyped HTML5 interface is just a facade.  That’s not a criticism, in fact that’s generally the way the web works.  What makes Google Wave particularly interesting is that there is an API which operates directly on the repository.  Furthermore, you can host your own server, and such servers federate using XMPP.

    These servers are not merely passive, they can actively interact with processes called “robots” using HTTP (More specifically, JSON-RPC over POST).  Once invoked, these robots have access to a full range of operations (Java, Python).  The Python library implementation looks relatively straightforward, and would be relatively easy to port to, say Ruby.

    This dichotomy pointed out by Sam is very interesting. On the one hand, there is the Google Wave web application, which pushes the boundaries of what it means to be a rich web application that simply uses Javascript and the HTML DOM. This is a companion step in Google’s transition to taking an active role in the future of building Web applications, where previous steps have included Google representatives drafting the HTML 5 specification, Google Gears and Google Chrome. However where things get interesting is that the API makes it possible to build alternate client applications (e.g. a .NET Wave client written in C#) and even build services that interact with users regardless of which Wave client they are using.

    Joe Gregorio has more on these APIs in his blog post Wave Protocol Thoughts where he writes

    There are actually 3 protocols and 2 APIs that are used in Wave:

    • Federation (XMPP)
    • The robot protocol (JSONRPC)
    • The gadget API (OpenSocial)
    • The wave embed API (Javascript)
    • The client-server protocol (As defined by GWT)

    The last one in that list is really nothing that needs to be, or will probably ever be documented, it is generated by GWT and when you build your own Wave client you will need to define how it talks to your Wave server. The rest of the protocols and APIs are based on existing technologies.

    The robot protocol looks very easy to use, here is the code for an admittedly simple robot. Now some people have commented that Wave reminds them of Lotus Notes, and I'm sure with a little thought you could extend that to Exchange and Groove. The difference is that the extension model with Wave is events over HTTP, which makes it language agnostic, a feature you get when you define things in terms of protocols. That is, as long as you can stand up an HTTP server and parse JSON, you can create robots for Wave, which is a huge leap forward compared to the extension models for Notes, Exchange and Groove, which are all "object" based extension models. In the "object" based extension model the application exposes "objects" that are bound to locally that you manipulate to control the application, which means that your language choices are limited to those that have bindings into that object model.

    As someone whose first paying job in the software industry was an internship where I had to write Outlook automation scripts to trigger special behaviors when people sent or modified Outlook task requests, I can appreciate the novelty of moving away from a programming model based on building a plugin in an application’s object model and instead building a Web service that the web application notifies when it is time to act, which is the way the Wave robot protocol works. Now that I’ve been exposed to this idea, it seems doubly weird that Google also shipped Google Apps Script within weeks of this announcement.
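    As a concrete illustration of that model, here is a minimal Python sketch of a robot as a plain HTTP endpoint (the event and operation payloads are hypothetical stand-ins, not the actual Wave JSON-RPC wire format):

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class EchoRobot(BaseHTTPRequestHandler):
        """Events-over-HTTP extension model: the server POSTs a JSON
        event to the robot, and the robot replies with operations."""

        def do_POST(self):
            length = int(self.headers["Content-Length"])
            event = json.loads(self.rfile.read(length))
            operations = []
            # React to the event: append an echo reply to any new blip.
            if event.get("type") == "BLIP_SUBMITTED":
                operations.append({
                    "op": "append_text",
                    "blip_id": event["blip_id"],
                    "text": "Echo: " + event.get("text", ""),
                })
            body = json.dumps({"operations": operations}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8080), EchoRobot).serve_forever()

    The appeal of the pattern is that nothing here binds to an application object model: any stack that can accept a POST and parse JSON can participate, which is exactly the language agnosticism Joe describes.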

    Nick Gall writes in his post My 2¢ on Google Wave: WWW is a Unidirectional Web of Published Documents -- Wave is a bidirectional Web of Instant Messages that

    Whether or not the Wave client succeeds, Wave is undoubtedly going to have a major impact on how application designers approach web applications. The analogy would be that even if Google Maps had "failed" to become the dominant map site/service, it still had major impact on web app design.

    I suspect this as well. Specifically, I have doubts that the communications paradigm shift Google Wave is trying to force will take hold. On the other hand, I’m sure there are thousands of Web developers out there right now asking themselves "would my app be better if users could see each other’s edits in real time?", "should we add a playback feature to our service as well?" [ed note - wikipedia could really use this] and "why don’t we support seamless drag and drop in our application?". All inspired by their exposure to Google Wave.

    Finally, I've ruminated publicly that I see a number of parallels between Google Wave and the announcement of Live Mesh. The one interesting parallel worth calling out is that both products/visions/platforms are most powerful when there is a world of different providers each exposing their data types to one or more of these rich user applications (i.e. a Mesh client or Wave client). Thus far I think Google has done a better job than we did with Live Mesh in being very upfront about this realization and evangelizing to developers that they participate as providers. Of course, the proof will be in the pudding in a year or so when we see just how many services have what it takes to implement a truly interoperable federated provider model for Google Wave.

    Now Playing: Eminem - Underground/Ken Kaniff
     

    Categories: Platforms | Web Development

    In between watching the Google Wave video and Slumdog Millionaire, I got around to completing the first set of tabs for the ribbon in RSS Bandit. Screenshots are below, as usual let me know what you think.

    Fig 1: The home tab. This is the default tab on launching the application. I like that formerly hidden features of the application like subscribing to newsgroups and managing podcasts are now front and center without having to compromise on the common tasks that people want to perform.

    Fig 2: The ability to synchronize RSS Bandit with your Google Reader or NewsGator Online feeds is also now a lot more discoverable instead of being hidden in some obscure menu with an obscure name ("Synchronize Feeds"). 

    Fig 3: The folder tab. This tab is contextual and becomes selected when you click on a folder in the tree view. There are two features I’d like to call out in this view; Rules and Filters.

    Fig 4: The rules tool is where we’ll end up placing existing and new options for behaviors the user would like executed on receipt or viewing of new content.

    Fig 5: The filter tool is used for filtering the items that show up in the list view. We've had several requests for this feature over the past few years but couldn’t figure out an elegant way to incorporate it into the user interface.

    Fig 6: The feed tab. This is a contextual tab that is selected when you click on a feed in the tree view. One feature that I love which is now properly highlighted is that we support creating new posts in feeds that support this such as newsgroups (existing feature) or posting a new status update on Facebook if you have hooked it up as a feed source (new feature).

    Fig 7: The item tab. This is the contextual tab that is highlighted when you select an item in the list view. There are no new features highlighted here. What we do think will be interesting is if we make it straightforward for existing and new IBlogExtension plugins to end up showing up in the item tab. So you should think of this tab as being extensible and should expect that some of our existing plugins (e.g. "Email This", "Post to Twitter", etc) will also end up in this tab.


     

    Categories: RSS Bandit