Nick Bradbury, the author of the excellent FeedDemon RSS reader, has a blog post entitled Simplicity Ain't So Simple, Part II: Stop Showing Off, where he writes:

One mistake I see developers make over and over again is that we make a feature look complicated just because it was hard to create.
For example, the prefetching feature I blogged about last week hasn't been easy to create.  This feature prefetches (downloads) links and images in your feeds so that they're browse-able inside FeedDemon when you're working offline.  It works in the background so you can keep using FeedDemon while it does its business, and it's smart enough to skip web bugs, links to large downloads, and other items that shouldn't be cached (including items that have already been cached in a previous session).

It didn't seem like a complex feature when I started on it, but it ended up being a lot more work than I anticipated.  It could easily be an application all by itself, complete with all sorts of configurable options.

But instead of turning this feature into a mini-application, I demoted it to a lowly menu item

I've had that feeling recently when thinking about a feature I'm currently working on as part of podcasting support in RSS Bandit. The feature is quite straightforward. It is the ability for users to specify a maximum amount of space dedicated to podcasts on their computer, to prevent their hard drive from filling up with dozens of gigabytes of ScobleShow and Channel 9 videos. Below is a screenshot of what the option looks like.

As I started to implement this feature every question I asked myself led to two or three more questions and the complexity just spiralled. I started with the assumption that we'd enforce the download limit before files were downloaded. So if you have allocated 500MB as the maximum amount of space dedicated to podcasts and you attempt to download (200MB), funny_song.mp3 (5MB) and scary_short_movie.mpg (300MB) in order then we will issue a warning or an error indicating that there won't be enough room to download the last file before attempting to download it. Here's where I got my first rude awakening; there's no guaranteed way to determine the size of the file before downloading. There is a length attribute of the <enclosure> element but it sometimes doesn't have a valid value in certain podcast feeds. Being a Web geek, I thought to myself "Ha, I can always fall back on making an HTTP HEAD request and then reading the Content-Length header". It turns out this isn't always guaranteed to be set either.
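
To make the fallback chain concrete, here is a minimal sketch in Python (not RSS Bandit's actual code). The function names are hypothetical, and treating a length of 0 as "unknown" is an assumption on my part. It tries the enclosure's length attribute first, then the Content-Length header from a HEAD request, and returns None when neither gives a usable number.

```python
from typing import Callable, Optional
from urllib.request import Request, urlopen


def parse_size(value: Optional[str]) -> Optional[int]:
    """Return a positive byte count, or None if the value is missing or invalid."""
    if value is None:
        return None
    try:
        size = int(value.strip())
    except ValueError:
        return None
    # Assumption: zero or negative lengths are treated as "unknown".
    return size if size > 0 else None


def head_content_length(url: str, timeout: float = 10.0) -> Optional[str]:
    """Issue an HTTP HEAD request and return the raw Content-Length header, if any."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.headers.get("Content-Length")
    except (OSError, ValueError):
        # Network failures, HTTP errors, and malformed URLs all mean "unknown".
        return None


def estimate_size(
    enclosure_length: Optional[str],
    url: str,
    fetch_header: Callable[[str], Optional[str]] = head_content_length,
) -> Optional[int]:
    """Best-effort size in bytes: enclosure length first, then Content-Length."""
    size = parse_size(enclosure_length)
    if size is not None:
        return size
    return parse_size(fetch_header(url))
```

A None result is exactly the case that causes the trouble: the download has to start without knowing how big the file will be.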

So now we have the possibility that the user could initiate three downloads which would exceed the 500MB she has allocated to enclosures. The next question was when to enforce the limit on the files being downloaded. Should we wait until the files have finished downloading and then fail when we attempt to move the downloaded file from the temporary folder to the user specified podcast folder? Or should we stop downloads as soon as we hit 500MB regardless of the state of the downloaded files which means we'll have to regularly collate the size of all pending downloads and add that to the size of all downloads in the podcast folder to ensure that we aren't over the limit? I was leaning towards the former but when I talked to Torsten he pointed out that it seems like cheating if I limit the amount of space allocated to podcasts to 500MB but they could actually be taking over 1GB on disk because I have four 300MB files being downloaded simultaneously. Unfortunately for me, I agreed. :)
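
The second option boils down to a periodic check along these lines. This is only a sketch in Python under my own assumptions: the function names and folder layout are hypothetical, and a real client would also have to decide how often to run the check and which download to stop.

```python
from pathlib import Path


def folder_size(folder: Path) -> int:
    """Total size in bytes of all regular files under folder (0 if it is missing)."""
    if not folder.is_dir():
        return 0
    return sum(p.stat().st_size for p in folder.rglob("*") if p.is_file())


def space_used(podcast_folder: Path, temp_folder: Path) -> int:
    """Bytes taken by finished podcasts plus partially downloaded files."""
    return folder_size(podcast_folder) + folder_size(temp_folder)


def over_limit(podcast_folder: Path, temp_folder: Path, limit_bytes: int) -> bool:
    """True once finished and pending downloads together exceed the limit."""
    return space_used(podcast_folder, temp_folder) > limit_bytes
```

Counting the temp folder is what closes the loophole Torsten pointed out: partially downloaded files count against the limit as soon as they hit the disk.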

Then there's the question of what to actually do when the limit is hit. Do we prompt the user to delete old files? If so, what interface do we provide to make the user flow sensible and not irritating? That's especially tricky since some of the files will be podcasts in the podcast folder and others will be incomplete files that are pending downloads in a temp folder. Yeah, and it goes on and on.

However, all our users will see is that one checkbox and a field to enter the numeric value.


Wednesday, December 13, 2006 4:58:10 PM (GMT Standard Time, UTC+00:00)
Since I've been downloading podcasts manually for a while, I've noticed that once the download starts, the file size is almost always (95-99% of the time) in the download box. I would say if the number is available, use it. Start with, for example, the most recently posted file and check its size; if it fits in the cache, start the download (or if it doesn't fit, abort the download). Then process each successive file in the download queue in the same way. In your example above, the 300MB file would get skipped, at least for now, because we've already got 205MB of a 500MB cache being used by files already being downloaded. But if the next file in the download queue is less than 295MB, there's enough room in the cache, so get it. If I remember right from when I tried Doppler, that's how they do it, except that their cache is per feed. (One nice thing about that was that for the AP radio news feed I was able to set it up to delete the old download when a newer one succeeded, but for 1 feed that's not bad to manage manually.)

That reminds me... feature request: the ability to keep old items in feeds for some amount of time less than 24 hours. (The 8am entry for today is identical to the 8am entry from yesterday, except for the file name of the attachment, so what often happens is that RssBandit doesn't bold the feed to indicate a new entry, although if I manually go to the entry with the most recent hour in the headline and download the attachment, I will hear the current news even though RssBandit indicates it's a day old.)