I recently found a complaint about how NetFlix's RSS feeds appear in RSS Bandit from Danny Glasser, a dev manager on my team, in his post Netflix sucks less?. He wrote

Netflix has recently created RSS feeds for subscribers' current queues and recent rental activity, so in theory I can exchange the URLs with friends and view their queues in an RSS aggregator.  I've been playing with this a bit and unfortunately it doesn't render particularly well in RSS Bandit.  It doesn't sort nicely and old entries aren't expired properly.  I'm not sure if this is true with other aggregators but I suppose I could ask Dare

I decided to take a look at the various Netflix RSS feeds and the problem became instantly obvious. Below is an excerpted version of the Netflix Top 100 RSS feed which I'll use to discuss the various problems with syndicating lists in RSS.

<rss version="2.0">
  <channel>
    <title>Netflix Top 100</title>
    <ttl>20160</ttl>
    <link>http://www.netflix.com/Top100</link>
    <description>Top 100 Netflix movies, published every 2 weeks.</description>
    <language>en-us</language>
    <item>
      <title>1- Mystic River</title>
      <link>http://www.netflix.com/MovieDisplay?movieid=60031232&amp;trkid=134852</link>
      <description><![CDATA[Three childhood friends, Sean (Kevin Bacon), Dave (Tim Robbins) and Jimmy (Sean Penn) are reunited in Boston 25 years later when they are linked together in the murder investigation of Jimmy's daughter. ]]></description>
    </item>
    <item>
      <title>2- The Last Samurai</title>
      <link>http://www.netflix.com/MovieDisplay?movieid=60031274&amp;trkid=134852</link>
      <description><![CDATA[Tom Cruise stars as Captain Nathan Algren in this epic movie set in 1870s Japan. ]]></description>
    </item>
    <item>
      <title>3- Something's Gotta Give</title>
      <link>http://www.netflix.com/MovieDisplay?movieid=60031278&amp;trkid=134852</link>
      <description><![CDATA[Sixty and still sexy, Harry (Jack Nicholson) is having the time of his life, wining, dining and bedding women half his age.]]></description>
    </item>
  </channel>
</rss>

There are several problems with the above feed. The first is the combination of two omissions: the feed provides no mechanism, such as GUIDs, for uniquely identifying items, and it carries no dates. The problem manifests itself when the top 100 list is refreshed two weeks from now. Using the above feed as an example, imagine that a new entry becomes number 1, moving Mystic River and The Last Samurai one notch down. Several things break at once.

The first problem is that the user has no way of grouping together the top 100 lists for each period, so I can't have last month's top 100 list and this week's top 100 list in my aggregator in any meaningful way. Even if there were dates, the lack of GUIDs means that the aggregator will likely use the <link> element to uniquely identify each item when determining whether the user has seen it. This means that only the new entrant to the list will be marked as unread, while movies that were already in the list and have been seen remain unhighlighted.

I can see arguments for both viewpoints. On the one hand, Netflix may expect that the aggregator should always have 100 items in it, with only the new entrants being marked as unread and the positions of movies changing from week to week. On the other hand, a user may want to keep the top 100 list for each time period so they can see a timeline of the movie rankings in their aggregator. In that case, every two weeks there should be 100 new items waiting for the user. Unfortunately neither of these happens in RSS Bandit or a number of other aggregators with Netflix's current implementation. Instead, old and new entries show up munged together, with nothing to separate them by date, so users can't group by date.

Another problem is that the link to the movie's page is the only thing used to uniquely identify the item. So when the feed is fetched and the position of a movie changes (i.e. the title changes), instead of creating a new item in the aggregator, RSS Bandit assumes it is a post whose title has been changed and simply updates the item in place. This makes sense in 99% of aggregator scenarios, where a changed title usually means a typo was fixed in a blog post. However, in the Netflix case this means a movie will always show up with its most recent position in the top 100 list. But once the movie leaves the list (i.e. is dropped from the feed), it remains in the aggregator at the last position it held in the feed.
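The in-place update behavior described above can be sketched in a few lines of Python. This is only an illustration, not RSS Bandit's actual code: the merge_by_link function and the dictionary store are assumptions, the links are shortened to their movieid parameter, and the new number 1 entry is made up.

```python
# Hypothetical sketch of an aggregator that identifies items by <link> only.
# Not RSS Bandit's real implementation; names and data layout are assumed.

def merge_by_link(store, fetched_items):
    """Merge fetched (link, title) pairs into store, keyed on link.

    Returns the links that were not seen before (these would be unread).
    """
    new_links = []
    for link, title in fetched_items:
        if link not in store:
            new_links.append(link)
        # For a known link the title is overwritten in place, so the item
        # keeps its read state and only ever shows its latest position.
        store[link] = title
    return new_links


week1 = [
    ("movieid=60031232", "1- Mystic River"),
    ("movieid=60031274", "2- The Last Samurai"),
    ("movieid=60031278", "3- Something's Gotta Give"),
]
# Two weeks later: a made-up new entrant takes the top spot and
# "Something's Gotta Give" drops off this abbreviated list.
week2 = [
    ("movieid=EXAMPLE", "1- Hypothetical New Entrant"),
    ("movieid=60031232", "2- Mystic River"),
    ("movieid=60031274", "3- The Last Samurai"),
]

store = {}
merge_by_link(store, week1)
unread = merge_by_link(store, week2)

print("unread:", unread)
for title in store.values():
    print(title)
```

After the second fetch only the new entrant is unread, and the store holds two items that both claim position 3, because the movie that left the list is never updated or removed.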

The second problem is that there is no way to tell the aggregator how to sort the list of movies. Sorting by title won't work because it will be an alphabetical sort; ditto for the description. Even if there were dates, using those for sorting wouldn't make much sense either. Ideally there would be some way for an item to specify its position relative to the other items in the same list at a given point in time. Again, this would require that dates be attached to the items in the feed.

There are a number of issues raised by the Netflix problem. One could look at the problem as an indication that there should be an item expiry mechanism in RSS, so the aggregator knows to dump the list every 2 weeks and refresh it with the new list. Others could argue that this could be solved by giving each item a unique ID independent of the movie and specifying its date as well as a sort position. This would allow the user to track changing lists over time even if the same item appears in the list multiple times.
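To make the second option concrete, here is a minimal sketch of what an aggregator could do if each item carried its own ID, a date and a sort position. None of this exists in RSS or in the Netflix feed today: the field names, the make_guid scheme and the dates are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def make_guid(list_date, movie_id):
    """Build an ID that is unique per (list edition, movie), not per movie."""
    return f"top100/{list_date.isoformat()}/{movie_id}"


def entry(list_date, position, movie_id, title):
    """Create one hypothetical feed item with an explicit date and position."""
    return {
        "guid": make_guid(list_date, movie_id),
        "list_date": list_date,
        "position": position,
        "title": title,
    }


def group_by_edition(items):
    """Return {list_date: [items sorted by position]} in date order."""
    editions = defaultdict(list)
    for item in items:
        editions[item["list_date"]].append(item)
    return {
        d: sorted(editions[d], key=lambda item: item["position"])
        for d in sorted(editions)
    }


first = date(2005, 1, 1)    # illustrative dates only
second = date(2005, 1, 15)
items = [
    entry(second, 2, "60031232", "Mystic River"),
    entry(first, 2, "60031274", "The Last Samurai"),
    entry(first, 1, "60031232", "Mystic River"),
    entry(second, 3, "60031274", "The Last Samurai"),
]

editions = group_by_edition(items)
for list_date, ranked in editions.items():
    print(list_date, [(i["position"], i["title"]) for i in ranked])
```

Because the ID includes the edition date, the same movie appearing in two editions yields two distinct items, and the numeric position gives the aggregator something better than the title to sort on.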

I don't think I've seen anyone raise any of the various problems with the Netflix feeds online. This is surprising since I'd be hard pressed to imagine how any aggregator does the 'right' thing with these feeds. More importantly the Netflix feeds show a significant hole in RSS as well as syndication formats like Atom whose primary goal seems to be RSS feature parity.

I'm going to bring this up on the RSS-AggDev mailing list and see what the other aggregator developers think about this problem.


 

January 12, 2005
@ 02:28 PM

It's begun to spread around the blogosphere that MSN has added support for RSS to a couple more of its web offerings. Yesterday on the MSN Search weblog, Brady announced that there are now RSS Feeds for Search Results on the MSN Search beta site. The URL below returns an RSS feed containing the first 20 items for a search for 'rss bandit'.

http://beta.search.msn.com/results.aspx?q=rss+bandit&format=rss&count=20

Looking at the results returned using Rex Swain's HTTP Viewer it seems the results don't return the Last-Modified or ETag HTTP headers. This means every time the aggregator queries the feed it'll get an XML document downloaded even if nothing has changed in the search results since the last time the query was sent. So as not to waste bandwidth on the client side I'll probably specify that the MSN Search feeds should only be fetched once a day. One surprising thing is that sponsored links don't show up in the search results. I'd have expected that they would given that they are often relevant to the search as well.
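For context, this is roughly how an aggregator uses those headers when a server does provide them, and what it falls back to when it doesn't. It is a sketch under assumptions rather than RSS Bandit's code: conditional_headers and due_for_refresh are hypothetical helper names, though If-None-Match and If-Modified-Since are the standard HTTP request headers that pair with ETag and Last-Modified.

```python
from datetime import datetime, timedelta


def conditional_headers(etag=None, last_modified=None):
    """Build request headers for a conditional GET from saved validators.

    If the server sent neither ETag nor Last-Modified on the previous
    fetch, the result is empty and every request downloads the full feed.
    """
    headers = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def due_for_refresh(last_fetched, now, min_interval=timedelta(days=1)):
    """Client-side throttle: refetch only after min_interval has passed."""
    return last_fetched is None or now - last_fetched >= min_interval


# A feed without validators gives the client nothing to send back...
print(conditional_headers())
# ...so the only way to save bandwidth is to poll less often.
print(due_for_refresh(datetime(2005, 1, 12, 8, 0), datetime(2005, 1, 12, 20, 0)))
```

With validators, an unchanged feed costs a small 304 Not Modified response; without them, the once-a-day interval is the only saving available to the client.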

This is a totally cool feature. The MSN Search folks are doing good things.


 

Categories: MSN

I've been playing around with the photo album in my MSN Space and have begun to get interested in online photo sharing. I've never been big on taking pictures. The last time I took pictures was on my vacation in Hawaii with the ex last year, but I didn't even get them after the breakup. Before that it was Freaknik in 1998. However, after playing around with the MSN Spaces photo album I feel like sharing some pics other than RSS Bandit screenshots as part of my space. I'd definitely appreciate any tips from folks out there on purchasing a digital camera.

Once I was done geeking out about the MSN Spaces photo album I decided to check out what other hosted blogging services provided with regards to photo sharing. This is where I found out about Hello and BloggerBot. For those who aren't aware of it, Hello is an application for sharing images with people in real-time. A sort of instant messaging client with a photo slideshow feature. The BloggerBot feature of Hello allows you to post images to your blog hosted on Blogger.com from the Hello application. This integration makes sense since the company that created Hello was recently purchased by Google.

During my next daily rap session with Mike about Spaces, I brought up the photo sharing features of Hello and its integration with Blogger. Mike pointed out that a similar user experience was already possible using MSN. This is where I first learned about MSN Premium. The MSN Premium service is an MSN offering that provides a bunch of value adds to browsing the Web for under $10 a month. It includes a firewall, anti-virus software, Encarta, Microsoft Money, Outlook plugins and a number of photo management features. I tried the service yesterday and so far I like it. The MSN Outlook Connector which allows you to access Hotmail from Outlook is quite nice.

The photo sharing features of MSN Premium come in a couple of flavors. The first part is MSN Messenger Photo Swap, which enables you to initiate a photo sharing session with any MSN Messenger user. This seems to provide an experience equivalent to the real-time photo sharing features in Hello. Here is a screenshot of Mike Torres using Messenger Photo Swap to show me his vacation pics. The second major photo sharing feature of MSN Premium is called Photo Email. With Photo Email you can send photo slideshows to people as regular HTML email. The email slide shows are a compressed version of a slide show of the full resolution images hosted on an automatically generated Web site which is linked to from the email. People can view the full slide show and then either download the images for printing or order prints online. Here is a screenshot of a Photo Email I sent to myself of a modified version of RSS Bandit.

The ActiveX slideshow control used to host the images on the automatically generated website is extremely similar to that used by MSN Spaces. It shouldn't be too hard to send some sort of MSN Spaces photo email to invite people to view the photo album on your Space. I should remember to add this as a feature request on the MSN Spaces Wiki.

Then there is still the question of how one sends a picture to their MSN Spaces blog as a blog posting the same way Hello allows one to do using the BloggerBot. The answer is the email posting feature of MSN Spaces. Simply enable Mobile Publishing on the Mobile Settings tab of the Settings page of the MSN Space. Enter an email address (e.g. your mobile phone email if you are a moblogger) and turn on “publish immediately.” Enter a secret word. You can now blog direct to that email address (e.g. carnage4life.blogthis@spaces.msn.com) with a photo attachment and/or text. The subject of the e-mail becomes the subject of the post.


 

Categories: MSN

The folks behind FeedBurner have a blog post about RSS Market Share which discusses the distribution of aggregators they see polling their most popular feeds. They write

...RSS Client market is not yet consolidating, it's expanding. There were 409 different clients polling the top 800 FeedBurner feeds in September and now there are 719 different clients. FeedBurner actively catalogs the behavior and specifications for hundreds of these user-agents...

...This list is heavily skewed toward aggregators used on blog feeds, since most of our feeds are from blogs. This list might read quite differently for more traditional media feeds such as Reuters, NYT, CNET, etc. On a similar theme, individual publishers will notice that the overall market share may be wildly different from their own feed's market share. Simply removing our top 10 feeds from this data results in a wildly different market share list, possibly because of clients that ship with one or more of our top 10 feeds as a default. All of this pointing to the caution not to read too much into this single data point. We could make qualifications about everything on the list. Your mileage may vary, caveat emptor, mea culpa, c'est la vie..

Top 20 RSS clients across FeedBurner most highly subscribed 800 feeds as of January 6, 2005

Aggregator Name (Market Share Percentage)
1. Bloglines (32.86%)
2. NetNewsWire (16.95%)
3. Firefox Live Bookmarks (7.78%)
4. Pluck (7.20%)
5. NewsGator Online (4.45%)
6. (not identified) (4.07%)*
7. FeedDemon (3.83%)
8. SharpReader (3.27%)
9. My Yahoo (2.58%)
10. iPodder (2.42%)
11. NewsGator (2.23%)
12. Thunderbird (2.13%)
13. RSS Bandit (1.12%)
14. NewsFire (1.05%)
15. iPodderX (1.02%)
16. Sage (0.71%)
17. FeedReader (0.67%)
18. RssReader (0.54%)
19. LiveJournal (0.46%)
20. Opera RSS Reader (0.45%)

Although interesting, their numbers probably aren't reflective of the reality of the RSS aggregator market share. LiveJournal has over 5 million accounts with at least half of them being active users. I suspect there are far more people using their LiveJournal friends page as an RSS aggregator than the entire top 10 list combined.

However this does bring up a question I've been considering for a while. What should be the default feeds in an RSS Bandit installation? Besides the various RSS Bandit feeds we also subscribe the user to the RSS feeds for Microsoft Watch, Yahoo! News, BBC, Rolling Stone, Slashdot, Boing Boing and InstaPundit. I've been considering removing a few of these feeds such as InstaPundit, since I don't read it regularly and, the one or two times I have read it, I didn't think much of it. I've also considered adding more blogs I read such as Robert Scoble or Dave Winer.

Given that RSS Bandit is moderately popular with about 50,000 downloads of the most recent version and about 130,000 total downloads over the past year I'm sure we'd be contributing a decent amount of readership to whatever feeds we install as default. Therefore I'd like some ideas from our users on what you think the best mix of feeds should be for folks installing RSS Bandit for the first time which in certain cases may be their first RSS aggregator.


 

Categories: RSS Bandit

Doc Searls has a post entitled Resistance isn't futile where he writes

Russell Beattie says "it's game over for a lot of Microsoft competitors." I don't buy it, and explained why in a comment that's still pending moderation. (When the link's up, I'll put it here.)

Meanwhile, I agree with what Phillip Swann (who is to TVs what Russell is to mobile devices) says about efforts by Microsoft and others to turn the TV into a breed of PC:

...it's not going to happen, no matter how much money is spent in the effort. Americans believe the TV is for entertainment and the PC is for work. New TV features that enhance the viewing experience, such as Digital Video Recorders, High-Definition TV, Video on Demand, Internet TV (the kind that streams Net-based video to the television, expanding programming choices) and some Interactive TV features (and, yes, just some), will succeed. Companies that focus on those features will also succeed.

But the effort to force viewers to perform PC tasks on the TV will crash faster than a new edition of a buggy PC software. I realize that doesn't speak to all of Russell's points, or to more than a fraction of Microsoft's agenda in the consumer electronics world; but it makes a critical distinction (which I boldfaced, above) that's extremely important, and hard to see when you're coming from the PC world.

It seems Doc Searls is ignoring the truth around him. Millions of people [including myself] watch TV by interacting with a PC via TiVo and other PVRs. I haven't met anyone who, after using a PVR, wants to go back to regular TV. As is common with most Microsoft detractors, Doc Searls is confusing the problems with v1/v2 of a product with the long term vision for the product. People used to say the same things about Windows CE & PalmOS but now Microsoft has taken the lead in the handheld market.

The current crop of Windows Media Centers have their issues, many of which have even been pointed out by Microsoft employees. However it is a big leap from that to the claim that people don't want more sophistication out of their television watching experience. TiVo has already taught us that people do. The question is who will be providing the best experience possible when the market matures?


 

Categories: Technology

Recently Ted Leung posted a blog entry entitled Linguistic futures where he summarized a number of recent discussions in the blogosphere about potential new features for the current crop of popular programming languages. He wrote

1. Metaprogramming facilities

Ian Bicking and Bill Clementson were the primary sources on this particular discussion. Ian takes up the simplicity argument, which is that metaprogramming is hard and should be limited -- of course, this gets you things like Python 2.4 decorators, which some people love, and some people hate. Bill Mill hates decorators so much that he wrote the redecorator, a tool for replacing decorators with their "bodies". 

2. Concurrency

Tim Bray and Herb Sutter provided the initial spark here. The basic theme is that the processor vendors are finding it really hard to keep the clock speed increases going (that's actually been a trend for all of 2004), so they're going to start putting more cores on a chip... But the big take away for software is that uniprocessors are going to get better a lot more slowly than we are used to. So that means that uniprocessor efficiency matters again, and the finding concurrency in your program is also going to be important. This impacts the design of programming languages as well as the degree of skill required to really get performance out of the machine...

Once that basic theme went out, then people started digging up relevant information. Patrick Logan produced information on Erlang, Mozart, ACE, Doug Lea, and more. Brian McCallister wrote about futures and then discovered that they are already in Java 5.

It seems to me that Java has the best support for threaded programming. The dynamic languages seem to be behind on this, which is must change if these predictions hold up. 

3. Optional type checking in Python

Guido van Rossum did a pair of posts on this topic. The second post is the scariest because he starts talking about generic types in Python, and after seeing the horror that is Java and C# generics, it doesn't leave me with warm fuzzies.

Patrick Logan, PJE, and Oliver Steele had worthwhile commentary on the whole mess. Oliver did a good job of breaking out all the issues, and he worked for quite a while on Dylan which had optional type declarations. PJE seems to want types in order to do interfaces and interface adaptation, and Patrick's position seems to be that optional type declarations were an artifact of the technology, but now we have type inference so we should use that instead. 

Coincidentally I recently finished writing an article about Cω which has integrated both optional typing via type inference and concurrency into C#. My article indirectly discusses the existence of type inference in Cω but doesn't go into much detail. I don't mention the concurrency extensions in Cω in the article primarily due to space constraints. I'll give a couple of examples of both features in this blog post.

Type inference in Cω allows one to write code such as

public static void Main(){
  x = 5; 
  Console.WriteLine(x.GetType()); //prints "System.Int32"
}

This feature is extremely beneficial when writing queries using the SQL-based operators in Cω. Type inference allows one to turn the following Cω code

public static void Main(){

  struct{SqlString ContactName; SqlString Phone;} row;
  
  struct{SqlString ContactName; SqlString Phone;}* rows = select
            ContactName, Phone from DB.Customers;
 
  foreach( row in rows) {
      Console.WriteLine("{0}'s phone number is {1}", row.ContactName, row.Phone);
   }
}

to

public static void Main(){

  foreach( row in select ContactName, Phone from DB.Customers ) {
      Console.WriteLine("{0}'s phone number is {1}", row.ContactName, row.Phone);
   }
}

In the latter code fragment the type of the row variable is inferred so it doesn't have to be declared. The variable is now seemingly dynamically typed but really isn't, since the type checking is done at compile time. This seems to offer the best of both worlds because the programmer can write code as if it were dynamically typed but is still warned of type mismatches at compile time.

As for concurrent programming, many C# developers have embraced the power of using delegates for asynchronous operations. This is one place where I think C# and the .NET framework did a much better job than the Java language and the JVM. If Ted likes what exists in the Java world I bet he'll be blown away by using concurrent programming techniques in C# and .NET. Cω takes the support for asynchronous programming further by adding mechanisms for tying methods together in the same way a delegate and its callbacks are tied together. Take the following class definition as an example

public class Buffer {
   public async Put(string s);
   public string Get() & Put(string s) { return s; }
}

In the Buffer class a call to the Get() method blocks until a corresponding call to the Put() method is made. Once this happens, the parameters of the Put() method are treated as local variable declarations in the Get() method and the code block runs. A call to the Put() method, on the other hand, returns immediately, and its arguments are queued as inputs to a matching call to the Get() method. This assumes that each Put() call has a corresponding Get() call and vice versa.
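For readers without a Cω compiler handy, the blocking behavior described above can be approximated in Python with the standard library's queue.Queue. This is only an analogy and not how Cω implements it: the lowercase put/get names are my own, and the first-in, first-out ordering is a property of this sketch rather than something the Cω example promises.

```python
import queue
import threading


class Buffer:
    """Rough Python analogue of the Cω Buffer class."""

    def __init__(self):
        # Unbounded queue: put() never waits for room.
        self._pending = queue.Queue()

    def put(self, s):
        """Like the async Put(): queue the argument and return immediately."""
        self._pending.put(s)

    def get(self):
        """Like Get(): block until a matching put() has supplied a value."""
        return self._pending.get()


buf = Buffer()
results = []

# The consumer calls get() first, so it blocks until a value is put.
consumer = threading.Thread(target=lambda: results.append(buf.get()))
consumer.start()
buf.put("hello")          # returns immediately and releases the waiting get()
consumer.join(timeout=5)

print(results)
```

The difference worth noting is that in Cω the pairing of Get() and Put() is declared in the method signature itself, while here it is hidden inside a library class.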

More complicated examples are available in the documentation on the Cω website.


 

Categories: Technology

An RSS Bandit user has created the RSS Bandit Flagged Item Merge Utility. The description of its usage states

If you use RSSBandit as your feedreader, you've probably used the Flag Item feature to store interesting articles for future access. However, if you use RSSBandit on multiple computers, your flagged item lists aren't synchronized. While you could upload your data to a central location, I find it more convenient to use a USB Memory Stick to keep these flagged item lists. I wrote this utility to synchronize from various lists to your main list. Get it here.

Usage: Select your main flagitems.xml file as the RSS Bandit Flag File and the file you want to merge into it as the Import File and hit Start. The application will display the items that it imported successfully or failed.

Although it is quite cool to see people writing tools to work with RSS Bandit, the fact is that you can already get this functionality out of RSS Bandit without resorting to tools. One of the options in the Remote Storage tab of the Tools->Options menu is 'File Share'. Although this option states that it only works on network shares, it works on local drives as well. This means you can select an external drive such as "H:\" or whatever drive letter the USB keychain maps to and then synchronize with that.

That way all you need to synchronize RSS Bandit between two machines is your USB keychain. I guess this means we should probably update the text of that dialog to explain that it works with any drive, not just network shares.


 

Categories: RSS Bandit

C|Net News has an interview with Bill Gates entitled Gates taking a seat in your den. One of his most interesting answers from my perspective was his take on Microsoft and blogging. The question and his answer are excerpted below

One of the big phenomena of the year has been Web logging. Has the growth surprised you?

Well, actually I think the biggest blogging statistic I know, which really blew me away, is that we've got close to a million people setting up blogs with the Spaces capability that's connected up to Messenger.

Now, with blogs, you always have to be careful. The decay rate of "I started and I stopped" or "I started and nobody visited" is fairly high, but as RSS (Really Simple Syndication) has gotten more sophisticated and value-added search capabilities have come along, this thing is really maturing.

And we've done some things in Japan and Korea that are unique blog experiments. The Spaces thing is a worldwide effort. It's a great phenomena, and it's sort of built on e-mail, and so we need to integrate more blogging capability into the e-mail world--and as we do the next generation of Outlook, you'll see that. We need to integrate it more into our SharePoint, which is our collaboration Office platform, and then, as I discussed, MSN is embracing it so that instead of thinking about, "OK, I go to one community to do photos, one community to do social networking, one community to do this," we say, "Hey," off of Messenger, which has got your buddy list already, then, "Let's let you do the photos and the social networking and everything--but starting in an integrated way off of Messenger."

I have also been quite impressed by our signup rate; it has totally exceeded expectations. As BillG says above, we at MSN have been thinking a lot about the problems facing the existing social software landscape and how we can create the best place on the Web for people to communicate, share their experiences and interact with friends, family and strangers who may one day become friends or family. You guys haven't seen anything yet.

It's going to be a fun ride.
 

Categories: Mindless Link Propagation | MSN

I was quite surprised to find out that my blog was mentioned in the Wall Street Journal. For those that don't have a WSJ subscription below is an excerpt of the story (***fair use***)

The rivalry between Google Inc. and Microsoft Corp. has been heating up since the Redmond, Wash., software behemoth last year unveiled its own search-engine technology. But tension between the two recently flared amid an online scrap about Google's use of open-source software.

The scuffle started with a Dec. 29 Web log post by Krzysztof Kowalczyk entitled "Google -- we take it all, give nothing back," in which the former Microsoft employee accused Google of freeloading. Mr. Kowalczyk, who now works at PalmOne Inc., cited a blog post by Google executive -- and former Microsoft staffer himself -- Adam Bosworth in which Mr. Bosworth called for open-source programmers to build better database software that Google and other big companies could use.

Mr. Kowalczyk wrote in his blog that Google gets an estimated tens of millions of dollars worth of software for free thanks to open-source developers, who release their programs without charge. And he alleged that Google gives little back to open source in return: "Microsoft creates more open-source code than Google." Microsoft staffer Dare Obasanjo excerpted portions of Mr. Kowalczyk's post on his personal blog and also took issue with at least one element of Mr. Bosworth's blogged response.

Mr. Bosworth fired back, posting in the comments section of Mr. Obasanjo's blog. "For Microsoft to condemn those of us who benefit from Open Source is rich," he wrote.

Spokesmen for Google and Microsoft declined to comment on the exchange. The Microsoft spokesman said the company "treats blogs as individuals expressing their independent opinion."

For those who missed the discussion and the original posts, you can find them in my work blog in the posts entitled Google and Open Source and More on Google and Open Source.


 

January 5, 2005
@ 03:54 PM

This weekend I finished my first article since switching jobs. It's tentatively titled Integrating XML into Popular Programming Languages: An Overview of Cω and should show up on both XML.com and my Extreme XML column on MSDN at the end of the month. I had initially planned to do the overview of Cω (C-Omega) for MSDN and do a combined article about ECMAScript for XML (E4X) & Cω for XML.com but it turned out that just an article on Cω was already fairly long. My plan is to follow up with an E4X piece in a couple of months. For the geeks in the audience who are a little curious as to exactly what the heck Cω is, here's an introduction to one of the sections of the article to whet your appetite.

The Cω Type System

The goal of the Cω type system is to bridge the gap between Relational, Object and XML data access by creating a type system that is a combination of all three data models. Instead of adding built-in XML or relation types to the C# language, the approach favored by the Cω type system has been to add certain general changes to the C# type system that make it more conducive for programming against both structured relational data and semi structured XML data.

A number of the changes to C# made in Cω make it more conducive for programming against strongly typed XML, specifically XML constrained using W3C XML Schema. Several concepts from XML and XML Schema have analogous features in Cω. Concepts such as document order, the distinction between elements and attributes, having multiple fields with the same name but different values, and content models that specify a choice of types for a given field exist in Cω. A number of these concepts are handled in traditional Object<->XML mapping technologies, but often awkwardly. Cω aims to make programming against strongly typed XML as natural as programming against arrays or strings in traditional programming languages.

I got a lot of good feedback on the article from a couple of excellent reviewers including the father of X#/Xen himself, Erik Meijer. For those not in the know, X#/Xen was merged with Polyphonic C# to create Cω. Almost all of my article focuses on the aspects of Cω inherited from X#/Xen.


 

Categories: XML