Recently I was chatting with Steve Rider on the Start.com team about the various gotchas awaiting them as they continue to improve the RSS aggregator at http://www.start.com/1/. I mentioned issues like feeds that don't use titles, such as Dave Winer's, and HTML showing up in titles.

I thought I knew all the major RSS gotchas and that RSS Bandit handled them pretty well. However, I recently got two separate bug reports from users of WordPress about RSS Bandit's inability to handle extension elements in their feeds. The first complaint was about Kevin Devin's RSS feed, which couldn't be read at all. A similar complaint was made by Jason Bock, who was helpful enough to debug the problem himself and provide an answer in his post RSS Bandit problem fixed, where he wrote:

I can't believe what it took to fix my feed such that Rss Bandit could process it.

I'm just dumbfounded.

Basically, this is the way it looked:

<rss xmlns:blog="urn:blog">
   <blog:info directory="JB\Blog" />
   <channel>
      <!--  Channel stuff goes here... -->
   </channel>
</rss>

This is what I did to fix it:

<rss xmlns:blog="urn:blog">
   <channel>
      <!--  Channel stuff goes here... -->
   </channel>
   <blog:info directory="JB\Blog" />
</rss>

After debugging the Rss Bandit code base, I found out what the problem was. Rss Bandit reads the file using an XmlReader. Basically, it goes through the elements sequentially, and since the next node after <rss> wasn't <channel>, it couldn't find any information in the feed, and that's what was causing the choke. Moving <blog:info> to the end of the document solved it.

The assumption I made when developing the RSS parser in RSS Bandit was that the top-level rss element would have a channel element as its first child element. I handle extension elements if they appear as children of the channel or item elements, since those seem like logical places for them, but I never thought anyone would apply an extension to the rss element itself. I took a look at what the RSS 2.0 specification says about where extension elements can appear, and it seems my assumption was wrong, since it states:

RSS originated in 1999, and has strived to be a simple, easy to understand format, with relatively modest goals. After it became a popular format, developers wanted to extend it using modules defined in namespaces, as specified by the W3C.

RSS 2.0 adds that capability, following a simple rule. A RSS feed may contain elements not described on this page, only if those elements are defined in a namespace.

Since there is no explicit restriction on where extension elements can appear, it looks like I'll have to change the parser to expect extension elements anywhere in the feed.
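
To make the idea concrete, here is a minimal sketch of a parser that does not care where the channel element sits among the children of rss. It is written in Python purely for illustration and is not RSS Bandit's actual code; the function name find_channel and the sample feed contents (including the title) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Sample feed modelled on the one quoted above; the <title> value is made up.
FEED = r"""<rss xmlns:blog="urn:blog">
   <blog:info directory="JB\Blog" />
   <channel>
      <title>Example feed</title>
   </channel>
</rss>"""


def find_channel(feed_xml):
    """Return (channel, extensions) for an RSS 2.0 document.

    Hypothetical helper: it scans every child of <rss> instead of
    assuming that <channel> is the first one.
    """
    root = ET.fromstring(feed_xml)
    if root.tag != "rss":
        raise ValueError("root element is not <rss>")

    channel = None
    extensions = []
    for child in root:
        if child.tag == "channel" and channel is None:
            # Core RSS elements live in no namespace, so the tag is bare.
            channel = child
        elif child.tag.startswith("{"):
            # ElementTree spells namespaced tags as "{namespace-uri}local",
            # so anything starting with "{" is an extension element.
            extensions.append(child)
    return channel, extensions


channel, extensions = find_channel(FEED)
print(channel.find("title").text)
print([ext.tag for ext in extensions])
```

The same loop finds the channel whether the extension element comes before or after it, which is the behaviour the specification appears to require.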

My apologies to the folks who've had problems reading feeds because of this oversight on my part. I'll fix the issue today and refresh the installer later this week.

Monday, 28 March 2005 04:31:31 (GMT Daylight Time, UTC+01:00)
Hey, glad I could help out. Once validation engines were telling me my feed was OK I had to start digging ;)
Monday, 28 March 2005 19:23:22 (GMT Daylight Time, UTC+01:00)
Dare:

Will this be available to currently installed users through the Check for Updates function under RSS Bandit Help menu?

Thanks.
gkrallAT NOSPAMverisign dot com
Tuesday, 29 March 2005 12:55:13 (GMT Daylight Time, UTC+01:00)
I haven't yet decided how I'll deal with the refresh. On the one hand, I think everyone needs to get the changes, but on the other, most of the bug fixes have been in edge cases, so there is no pressing need to reinstall.
Monday, 11 April 2005 13:21:20 (GMT Daylight Time, UTC+01:00)
Dare,

I cannot seem to find any updated installer. Has this issue been fixed? If so, where can I find the update?

Thanks

Kevin M.