July 25, 2004
@ 12:30 AM

Recently there have been some complaints about duplicate entries showing up in RSS Bandit. This is due to a change I made in the most recent version. In RSS 2.0 there is an optional guid element that can be used to uniquely identify an item in an RSS feed. Unfortunately, because the element is optional, most aggregators fall back to the link element for feeds that don't provide guids.
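
To make the fallback concrete, here is a minimal sketch in Python. It is not RSS Bandit's actual code; the function name item_id and the sample feed with its example.com URLs are made up for illustration. It only shows the general idea of preferring the guid and using the link when no guid is present.

```python
import xml.etree.ElementTree as ET


def item_id(item):
    """Return the item's guid text if present and non-empty, else its link."""
    guid = item.findtext("guid")
    if guid and guid.strip():
        return guid.strip()
    link = item.findtext("link")
    return link.strip() if link else None


# Hypothetical two-item feed: one item has a guid, the other does not.
SAMPLE = """<rss version="2.0"><channel>
<item><title>With guid</title><link>http://example.com/a</link><guid>tag:example.com,2004:a</guid></item>
<item><title>No guid</title><link>http://example.com/b</link></item>
</channel></rss>"""

ids = [item_id(i) for i in ET.fromstring(SAMPLE).iter("item")]
print(ids)
```

The first item is identified by its guid and the second by its link, which is exactly where the trouble starts when a feed reuses links.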

For the most part this worked fine. However, I stumbled across a feed that used the same link for every item from a given day: the Cafe con Leche RSS feed. This meant that RSS Bandit couldn't differentiate between items posted on the same day, which matters when tracking which items a user has read or whether an item has already been downloaded. I should have pinged the owner of the feed to point the problem out, but instead I decided to code around it by using the combination of the link and title elements to uniquely identify items. This actually turned out to be worse.

Although this fixed the problem with the Cafe con Leche RSS feed, it caused other issues. Any time an item in a feed changes its title but keeps the same permalink (for example, if a typo is fixed in the title), RSS Bandit thinks it's a different post and a duplicate entry shows up in the list view. Since popular sites like Boing Boing and Slashdot tend to do this almost every other day, I turned a problem with a niche site that affected a few users into one that affects a number of popular websites, and thus lots of users.
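
The trade-off between the two identity schemes can be sketched in a few lines of Python. Again, this is only an illustration and not the code in RSS Bandit; the helper names and the example.org items are invented, and items are modeled as plain dictionaries.

```python
def link_key(item):
    # Identity based on the link alone.
    return item["link"]


def link_title_key(item):
    # Identity based on the link and title together.
    return (item["link"], item["title"])


def count_distinct(items, key):
    """Number of items the aggregator would treat as separate posts."""
    return len({key(i) for i in items})


# Two different stories that share one per-day link (hypothetical data).
same_day = [
    {"link": "http://example.org/news#2004-07-24", "title": "First story"},
    {"link": "http://example.org/news#2004-07-24", "title": "Second story"},
]

# One post fetched twice, before and after a typo fix in its title.
retitled = [
    {"link": "http://example.org/post/1", "title": "Teh typo"},
    {"link": "http://example.org/post/1", "title": "The typo"},
]

for name, items in (("same_day", same_day), ("retitled", retitled)):
    print(name,
          "link only:", count_distinct(items, link_key),
          "link+title:", count_distinct(items, link_title_key))
```

With the link alone, the two same-day stories collapse into one item. With link plus title they are told apart, but the retitled post now counts as two items, which is the duplicate that shows up in the list view.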

This problem will be fixed in the next version of RSS Bandit.