Paul Buchheit of FriendFeed has written up a proposal for a new protocol that Web sites can implement to reduce the load on their services from social network aggregators like FriendFeed and SocialThing. He unveils his proposal in his post Simple Update Protocol: Fetch updates from feeds faster which is excerpted below
When you add a web site like Flickr or Google Reader to FriendFeed, FriendFeed's servers constantly download your feed from the service to get your updates as quickly as possible. FriendFeed's user base has grown quite a bit since launch, and our servers now download millions of feeds from over 43 services every hour.
One of the limitations of this approach is that it is difficult to get updates from services quickly without FriendFeed's crawler overloading other sites' servers with update checks. Gary Burd and I have thought quite a bit about ways we could augment existing feed formats like Atom and RSS to make fetching updates faster and more efficient. Our proposal, which we have named Simple Update Protocol, or SUP, is below.
...
Sites wishing to produce a SUP feed must do two things:
- Add a special
<link>
tag to their SUP enabled Atom or RSS feeds. This <link>
tag includes the feed's SUP-ID and the URL of the appropriate SUP feed. - Generate a SUP feed which lists the SUP-IDs of all recently updated feeds.
Feed consumers can add SUP support by:
- Storing the SUP-IDs of the Atom/RSS feeds they consume.
- Watching for those SUP-IDs in their associated SUP feeds.
By using SUP-IDs instead of feed urls, we avoid having to expose the feed url, avoid URL canonicalization issues, and produce a more compact update feed (because SUP-IDs can be a database id or some other short token assigned by the service). Because it is still possible to miss updates due to server errors or other malfunctions, SUP does not completely eliminate the need for polling.
Although there's a healthy conversation about SUP going on in FriendFeed in response to one of my tweets, I thought it would be worth sharing some thoughts on this with a broader audience.
The problem statement that FriendFeed's SUP addresses is the following issue raised in my previous post When REST Doesn't Scale, XMPP to the Rescue?
On July 21st, FriendFeed had 45,000 users who had associated their Flickr profiles with their FriendFeed account. FriendFeed polls Flickr about once every 20 – 30 minutes to see if the user has uploaded new pictures. However only about 6,000 of those users logged into Flickr that day, let alone uploaded pictures. Thus there were literally millions of HTTP requests made by FriendFeed that were totally unnecessary.
FriendFeed's proposal is similar to the Six Apart Update Stream and the Twitter XMPP Firehose in that it is a data stream containing information about all of the updates users are making on a particular service. It differs in a key way in that it doesn't actually contain the data from the user updates but instead identifiers which can be used to determine the users that changed so their feeds can be polled.
This approach aims at protecting feeds that use security through obscurity such as the Google Reader's Shared Items feed and Netflix's Personalized Feeds. The user shares their "secret" feed URL with FriendFeed who then obtains the SUP ID of the user's feed when the feed is first polled. Then whenever that SUP ID is seen in the SUP feed by FriendFeed, they know to go re-poll the user's "secret" feed URL.
For services that are getting a ton of traffic from social network aggregators or Web-based feed readers it does make sense to provide some sort of update stream or fire hose to reduce the amount of polling that gets done. In addition, it also makes sense that if more and more services are going to provide such update streams then it should be standardized so that social network aggregators and similar services do not end up having to target multiple update protocols.
I believe that at the end we will see a continuum of options in this space. The vast majority of services will be OK with the load generated by social networking aggregators and Web-based feed readers when polling their feeds. These services won't see the point of building additional features to handle this load. Some services will run the numbers like Twitter & Six Apart have done and will provide update streams in an attempt to reduce the impact of polling. For these services, SUP seems like a somewhat elegant solution and it would be good to standardize on something, anything at all is better than each site having its own custom solution. For a smaller set of services, this still won't be enough since they don't provide feeds (e.g. Blockbuster's use of Facebook Beacon) and you'll need an explicit server to server publish-subscribe mechanism. XMPP or perhaps something an HTTP based publish-subscribe mechanism like what Joshua Schachter proposed a few weeks ago will be the best fit for those scenarios.
Now Playing: Jodeci - I'm Still Waiting