Torsten and I have started working on RSS Bandit regularly again. Last weekend I fixed a bunch of bugs, including the problem that prevented IE 7 from importing OPML files from RSS Bandit. I've gotten a few emails from folks at work about that particular issue, so I thought it would be good to knock it out early. This morning, I checked in support for the Atom Thread Extensions, which means I can now see comment counts and view comments inline on Sam Ruby's blog.

One change we're planning to make is to switch to using a full-fledged text search engine to power the search feature of RSS Bandit. Currently, we load all the text in memory and use the .NET Framework's string comparison operators to find the target text. We want to move to a model where files on disk are indexed in the background and we don't have to have stuff in memory to search it. This should significantly reduce the memory consumed by RSS Bandit.
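
To make the difference concrete, here is a rough sketch of the idea behind a text index, written in Python rather than the C# that RSS Bandit uses. It is not our implementation and not how any particular search engine does it: the `InvertedIndex` class and `tokenize` helper are made up for illustration, and the index lives in memory here, whereas a real engine keeps it on disk. The point is only that a search looks up terms in a prebuilt map instead of scanning the text of every item.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    """Hypothetical toy index: maps each term to the ids of items containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, item_id, text):
        # Record that every term in the text occurs in this item.
        for term in tokenize(text):
            self.postings[term].add(item_id)

    def search(self, query):
        # Return the ids of items that contain all of the query terms.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InvertedIndex()
index.add("item-1", "Atom Thread Extensions support checked in")
index.add("item-2", "Lucene.NET search for RSS Bandit")
matches = index.search("rss bandit")
```

Once an item has been added, its full text no longer needs to stay loaded for a search to find it; only the term map is consulted.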

We've investigated a couple of options for our search solution. My first thought was integrating with MSN Windows Desktop Search. After exchanging some mail with various members of the team, I decided that this wouldn't meet our needs for a number of reasons:

  • Users will need to have Windows Desktop Search installed so we either need to figure out how to bundle it with RSS Bandit or disable the feature if it is not installed.
  • The indexing service is file-centric. However, we need to index individual RSS/Atom items within the cached RSS/Atom feeds on disk. This means we'd have to change our model to storing one file per RSS/Atom item, which could lead to hundreds or thousands of files per feed.
  • The biggest gotcha was that making the indexer understand the structure of RSS/Atom feeds requires writing a custom IFilter which involves gnarly C++ coding then dealing with hairy COM<->.NET interop issues. Not exactly the kind of work one wants to do in their free time.
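
To illustrate what indexing at the item level means, here is a small Python sketch that treats each item inside a single cached feed file as its own searchable document. Everything in it is an assumption made for illustration: the sample feed, the `iter_items` name, and the choice of `guid` as the item id. It only handles plain RSS 2.0; Atom would need namespace-qualified element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical cached feed; a real cache file would be read from disk.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <guid>urn:example:1</guid>
      <title>First post</title>
      <description>Hello world</description>
    </item>
    <item>
      <guid>urn:example:2</guid>
      <title>Second post</title>
      <description>Search indexes</description>
    </item>
  </channel>
</rss>"""


def iter_items(feed_xml):
    """Yield (item_id, text) for each RSS 2.0 <item> in one feed document."""
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        item_id = item.findtext("guid", default="")
        title = item.findtext("title", default="")
        body = item.findtext("description", default="")
        yield item_id, f"{title} {body}".strip()


# One feed file on disk becomes several independently searchable documents.
documents = list(iter_items(SAMPLE_FEED))
```

A file-centric indexer sees the whole feed as one unit, while this approach hands each item to the index separately without splitting the cache into one file per item.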

After further investigation we've settled on Lucene.NET which doesn't have any of the aforementioned problems. However we have been dealing with some issues that could either be bugs or just a misunderstanding of how the APIs should be used. We'll keep you posted. 


Categories: RSS Bandit
Tracked by:
"Search Options" (Square Pegs, Round Holes) [Trackback]
"Jubilee: we make progress again" (torsten's .NET blog) [Trackback] [Pingback]
"Our Multi-Lingual World and Search Indexes" (Dare Obasanjo aka Carnage4Life) [Trackback] [Pingback]

Sunday, May 28, 2006 11:36:09 PM (GMT Daylight Time, UTC+01:00)
Ok. Here is another possible feature... For some blogs and newsgroups it is well understood how to add comments. At the end of every newsgroup posting (when in digest view and in single item view) put one of those small twist triangles with the word "comment" next to it, or "follow up". When the user clicks on it, DHTML makes a small frame visible. Mostly it has a large text area where the user can enter a response, and a "post" button as well. The post button would be similar to the special URLs you use for the flags and read/unread buttons you already have... That way you could comment inline. I think this UI would be a lot better than the current newsgroup version of popping up a new window (which, if you stick with that, I'd suggest making some minor improvements to).

I started to try to implement this, but gave up because the DHTML was way way beyond the amount of free time I had. ;-)
Gordon Watts
Monday, May 29, 2006 10:01:09 AM (GMT Daylight Time, UTC+01:00)

Are you also looking into the Newsgator<->RSS Bandit synch issues?
Monday, May 29, 2006 4:16:59 PM (GMT Daylight Time, UTC+01:00)
I'll be looking at the Newsgator sync issues as well as making the synchronization process more automatic in the next release.
Tuesday, May 30, 2006 8:04:42 AM (GMT Daylight Time, UTC+01:00)
I wish the MetaWebLog API (or any other method) had a standard way of accessing comments for read/write.
Thursday, June 1, 2006 10:57:40 AM (GMT Daylight Time, UTC+01:00)

Are you going to be including podcasting support into this release of RSS Bandit?

Thursday, June 1, 2006 10:04:35 PM (GMT Daylight Time, UTC+01:00)
Podcasting support will be included in the Jubilee release of RSS Bandit.
Friday, June 2, 2006 5:12:11 PM (GMT Daylight Time, UTC+01:00)
One feature that would greatly help is New Tabs opening in background instead of immediately stealing focus. This will let users view the newspaper once and then pay attention to any interesting links they find later.
Friday, June 2, 2006 5:22:49 PM (GMT Daylight Time, UTC+01:00)
You can do that today by holding down the [Ctrl] key when you click on a link. This is consistent with how Firefox and IE 7 work today.
Sunday, June 4, 2006 5:40:48 AM (GMT Daylight Time, UTC+01:00)
Thanks Dare, the tabs opening is not consistent at all. Sometimes Ctrl-Tab works, sometimes it doesn't. Same for not using existing tabs. Sometimes it opens a new tab, sometimes it reuses an existing, even unread tab even though my config says not to reuse the tabs. This is (CVS) with IE7 beta but I've observed the same behaviour with RSS Bandit stable and IE stable too.
Monday, June 5, 2006 9:13:56 AM (GMT Daylight Time, UTC+01:00)

In terms of using WDS you aren't forced to go with a single file per post. There is another COM interface for you to implement, i.e. more gnarly C++ code, to allow WDS to know about your internal structure and therefore index individual items inside your single file.

For example that's how Outlook has individual emails, contacts etc. indexed all inside a single pst file.

Also there doesn't necessarily need to be any .Net - COM interop. I'm not sure of your exact file structure currently, but if it's one big xml feed with multiple entries then you just need to write some independent gnarly C++ code to implement the IFilter interfaces and return the relevant text using MSXML using the C++ interfaces.

Alternatively, I've seen some references to a C# wrapper class that handles most of the details of the IFilter interface and allows you to write your indexing filter all in C#. Not sure if it happens to have any nasty .Net - COM interop surprises.

The code you write in RSS Bandit to then query the search index is fairly trivial and doesn't have any real nasty .Net - COM interop issues. See my example code at:

Wednesday, June 7, 2006 4:19:26 PM (GMT Daylight Time, UTC+01:00)
Tab Navigation issue: Closing a Tab takes the reader back to the main pane. When multiple tabs are open, readers want to read through the open tabs first and then move back to the main tab and not in last focus order.
______ ______ ______ ______ _______
/ main \/__1___\/___2__\/__3___\/___4___\

Picking 4, closing 4, should switch to 3 then closing 3 should switch to 2 etc.

Currently picking 4, switches to main, picking 3, closing it switches back to main, which is a lot of unnecessary mouse clicks needed. For laptop users with touchpads this gets tiring very fast.
