November 26, 2003
@ 10:48 PM

A few months ago Mark Pilgrim posted a blog entry entitled How to Consume RSS Safely, in which he points out:

RSS, by design, is difficult to consume safely. The RSS specification allows for description elements to contain arbitrary entity-encoded HTML. While this is great for RSS publishers (who can just “throw stuff together” and make an RSS feed), it makes writing a safe and effective RSS consumer application exceedingly difficult. And now that RSS is moving into the mainstream, the design decisions that got it there are becoming more and more of a problem.

HTML is nasty. Arbitrary HTML can carry nasty payloads: scripts, ActiveX objects, remote image “web bugs”, and arbitrary CSS styles that (as you saw with my platypus prank) can take over the entire screen. Browsers protect against the worst of these payloads by having different rules for different “zones”. For example, pages in the general Internet are marked “untrusted” and may not have privileges to run ActiveX objects, but pages on your own machine or within your own intranet can. Unfortunately, the practice of republishing remote HTML locally eliminates even this minimal safeguard.

The workaround Mark proposes is that aggregators strip a set of dangerous tags from the HTML content of a feed before displaying it to the user. The only problem with this approach is that sometimes users do want to be able to view this dynamic content, be it Flash animations or special behaviors triggered via JavaScript when hovering the mouse over an image. So in the next version of RSS Bandit this will be a user-configurable option. Below is what the default setting for the embedded web browser used by RSS Bandit will be.

RSS Bandit browser security settings tab
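
For anyone curious what the tag-stripping workaround looks like in practice, here is a rough Python sketch. It is not Mark's code and not what RSS Bandit does internally. The names `DEFAULT_BLOCKED`, `VOID_BLOCKED`, `FeedSanitizer` and `sanitize_description` are made up for illustration, and the list of blocked tags is my own guess at a reasonable default. A blocklist like this is also easy to evade, so treat it as a demonstration of the idea rather than something to ship.

```python
from html import escape, unescape
from html.parser import HTMLParser

# Hypothetical default policy for this sketch (an assumption, not an official list).
DEFAULT_BLOCKED = frozenset(
    {"script", "style", "object", "embed", "applet",
     "iframe", "frame", "frameset", "link", "meta"}
)
# Blocked tags that never have a closing tag, so they enclose no content to skip.
VOID_BLOCKED = frozenset({"embed", "frame", "link", "meta"})


class FeedSanitizer(HTMLParser):
    """Re-emits HTML while dropping blocked tags and risky attributes."""

    def __init__(self, blocked=DEFAULT_BLOCKED):
        super().__init__(convert_charrefs=True)
        self.blocked = blocked
        self.out = []
        self.skip_depth = 0  # > 0 while inside a blocked element

    def handle_starttag(self, tag, attrs):
        if tag in self.blocked:
            if tag not in VOID_BLOCKED:
                self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = []
        for name, value in attrs:
            # Drop event handlers (onclick, onmouseover, ...) and inline CSS.
            if name.startswith("on") or name == "style":
                continue
            # Drop attributes whose value is a javascript: URL.
            if value is not None and value.strip().lower().startswith("javascript:"):
                continue
            kept.append(name if value is None
                        else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join([tag] + kept) + ">")

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit the opening tag only.
        if tag not in self.blocked:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.blocked:
            if tag not in VOID_BLOCKED and self.skip_depth:
                self.skip_depth -= 1
            return
        if not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text inside a blocked element (e.g. script source) is discarded.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize_description(encoded, blocked=DEFAULT_BLOCKED):
    """Decode an entity-encoded RSS description and strip blocked markup."""
    parser = FeedSanitizer(blocked)
    parser.feed(unescape(encoded))
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

A user who does want Flash content could relax the policy by calling `sanitize_description(text, blocked=DEFAULT_BLOCKED - {"object", "embed"})`, which is roughly the kind of per-user choice the settings tab is meant to expose. Note that an unclosed blocked tag causes everything after it to be dropped, which errs on the side of safety.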