Thursday, 04 March 2004 - Dare Obasanjo's weblog

March 4, 2004

@ 05:51 PM

Different Ways of Dealing with the RSS Information Overload

Two recurring themes have shown up in my development of RSS Bandit and usage of news aggregators in general:

There are feeds I'm subscribed to whose content I never end up reading because there is too much of it (e.g. Weblogs @ ASP.NET), so I miss the good stuff.
There is no easy way to find content that I'd find interesting.

I've noticed more and more people complaining about the information overload that comes with being subscribed to too many feeds and wanting some way to sift through the information. I spoke to someone at work yesterday who said he'd stopped using his aggregator to subscribe to individual feeds but instead just subscribed to RSS feeds of search results on Feedster. Similarly some RSS Bandit users subscribe to a lot of feeds and just use Search Folders to sift through them. Both approaches are slight variations of the same thing. The first person would rather read all information in blogs about a certain topic or keyword while the other would like to read all information about a certain topic or keyword from a select list of feeds.

The goal of RSS Bandit is to encourage both approaches. For the former we provide functionality for viewing Feedster [and other search engines that return RSS feeds] search results in the same manner one would view an RSS feed. In the next version we will provide the functionality to directly subscribe to such search results in two clicks (type search term in address bar, click the search button, results come back as an RSS feed, click subscribe to search results). The last piece is currently missing from RSS Bandit but will be in the next version. For the latter scenario where users subscribe to lots of feeds but only read the ones that match the searches in particular search folders I am considering improving the search capabilities by supporting query-like functionality. Currently you can create a Search Folder that shows all items that match a particular key word or key phrase. However sometimes you want to perform searches over multiple terms (e.g. “Microsoft AND Longhorn”) or fine tune certain searches by ignoring posts that may coincidentally match your keyword but are not of interest (e.g. “Java -coffee -indonesia”).
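As a rough illustration of the query-like matching described above, here is a minimal Python sketch. It is not RSS Bandit code. It assumes a tiny, hypothetical syntax in which every plain term must appear (AND is implicit) and every term prefixed with - must not appear; the function names are invented for this example.

```python
import re


def parse_query(query):
    """Split a query into (required, excluded) lists of lowercase terms."""
    required, excluded = [], []
    for token in query.split():
        if token.upper() == "AND":
            continue  # AND is implicit between terms in this sketch
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            required.append(token.lower())
    return required, excluded


def matches(query, text):
    """Return True if text has every required term and no excluded term."""
    words = set(re.findall(r"\w+", text.lower()))
    required, excluded = parse_query(query)
    has_required = all(term in words for term in required)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded
```

With this sketch, matches("Java -coffee -indonesia", "New Java compiler released") is true, while a post about Java coffee beans is filtered out.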

As for finding new interesting content, RSS Bandit already provides a way to search for feeds by keyword on Syndic8, but much more can be done. I have a number of other ideas for helping users manage the deluge of feeds on the Web and find new interesting content, including:

Only show posts that have been linked to by other feeds you are subscribed to. This would work for news sites like Slashdot or high traffic feeds like Blogs @ MSDN.
Add a way to integrate with Technorati's Interesting Blogs and Interesting Newcomers lists whenever they are implemented.
Only show posts that have a certain threshold of incoming links (e.g. 5 or more) as measured by Technorati. This may be infeasible due to causing high load on Technorati.
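The first and third ideas can be sketched locally, without calling Technorati, by counting links among the feeds a user already subscribes to. The Python below is only an illustration under that assumption: the item dictionaries with 'feed', 'url' and 'html' keys are a hypothetical data model, and the regular expression is a deliberately naive way to find links.

```python
import re
from collections import Counter

# Naive link extraction; a real aggregator would use an HTML parser.
LINK_RE = re.compile(r'href="([^"]+)"')


def count_incoming_links(items):
    """Count, for each item URL, how many *other* feeds link to it."""
    linking_feeds = {}
    for item in items:
        for target in LINK_RE.findall(item["html"]):
            linking_feeds.setdefault(target, set()).add(item["feed"])
    counts = Counter()
    for item in items:
        # Links from an item's own feed do not count as incoming links.
        feeds = linking_feeds.get(item["url"], set()) - {item["feed"]}
        counts[item["url"]] = len(feeds)
    return counts


def popular_items(items, threshold=1):
    """Keep only items linked to by at least `threshold` other feeds."""
    counts = count_incoming_links(items)
    return [item for item in items if counts[item["url"]] >= threshold]
```

A threshold of 1 corresponds to "linked to by other feeds you are subscribed to"; a higher value approximates the incoming-link threshold, measured over your own subscriptions rather than the whole blogosphere.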

I'm supposed to be hanging out with Lili Cheng in the next couple of days. I wonder what she'll think of some of these ideas; perhaps she can set me on the right path.

Categories: RSS Bandit

March 3, 2004

@ 06:48 AM

Comments [1]

RSS Bandit and ATOM (Again)

I just checked in rudimentary support for Mark Nottingham's Atom Syndication Format 0.3 (PRE-DRAFT) and Mark Pilgrim's Atom Feed Autodiscovery (PRE-DRAFT) into the RSS Bandit CVS tree. I expect there to be bugs and incomplete feature support but it is good enough to read the various ATOM feeds being produced by Blogger since I can now subscribe to the ATOM feeds of Evan Williams and Steve Saxon.

The screenshot below shows Mark Pilgrim's ATOM feed.

Categories: RSS Bandit

March 2, 2004

@ 05:17 PM

Comments [0]

How To Opt Out of Credit Card Junk Mail

I just stumbled on a guide on How to Stop Receiving Credit Card Offers on Kuro5hin, which I definitely need given the aggravating amount of junk mail I get from credit card companies. It begins:

Tired of annoying "pre-approved" credit card offers? I sure am. According to the Fair Credit Reporting Act (FCRA) of 1970 as amended in 1996, the four major credit bureaus have the right to sell your information to companies that want to offer you a credit card. Fortunately, the amendment also stipulated that credit bureaus must provide a way for consumers to have their names excluded from pre-approval lists. If you're a United States citizen sick of getting pre-screened credit card offers, this article will show you how to avoid receiving them

Lots of useful information here including a number you can call to opt out of credit card junk mail. According to the article, 1-888-5-OPTOUT is an automated service run jointly by the four main credit bureaus. With one phone call you can opt out of pre-screened mailings from all four bureaus. Sweet.

A quick Googling confirms the information in the Kuro5hin article, which seems to be a summary of the following PDF on the FTC website: Where to Go To "Just Say No".

Categories: Mindless Link Propagation

March 2, 2004

@ 04:21 PM

Comments [14]

Synchronizing RSS Bandit Across Multiple Machines

This recently appeared on the rssbandit-users mailing list:

From: Dare Obasanjo <dareo@mi...>
Synchronizing RSS Bandit Across Multiple Machines
2004-03-02 00:29

 Hello all, 
   One of the big features on the TODO list is to implement a way to
 synchronize the state of RSS Bandit across multiple machines so that if
 you use it on one machine and start it up on another it remembers what
 you've read, what you're subscribed to, etc. I've gone as far as writing
 a spec[0] for a format that does this. The main problem I have now is
 that there needs to be a server where the individual RSS Bandit
 instances can fetch feeds from in the same way your mail reader (e.g.
 Outlook) downloads mail from your mail server (e.g. Exchange). Since it
 is unlikely that users will be able to set up servers for this purpose
 here are a couple of ideas I've thought of supporting. I'd love your
 feedback and added suggestions.
 
 1.) FTP support: This is straightforward, an RSS Bandit instance can
 upload and download a SIAM file containing synchronization information
 via FTP. 
 
 2.) Mail support: This is fairly crafty and some would call it a hack.
 RSS Bandit mails the subscription file either as a zipped attachment or
 as inline text content to the user's email address and can download it
 using POP3 if the user's mail server supports POP3. The appropriate mail
 to use for synchronization is identified by an extension header in the
 mail or some similar identifier. There are a large number of free POP3
 services[1] and users can create a throwaway account specifically for RSS
 Bandit synchronization. 
 
 [0] http://www.25hoursaday.com/draft-obasanjo-siam-01.html
 [1] http://www.emailaddresses.com/email_pop.htm
 
 --
 PITHY WORDS OF WISDOM 
 Please all and you please none.

If you are an RSS Bandit user I would appreciate your thoughts on the issue. Most of the responses I've gotten so far say that I should just implement synchronization support via FTP, which I'm not sure is that accessible to the average RSS Bandit user.
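To make the mail-based option from the message above more concrete, here is a minimal Python sketch of tagging a synchronization mail with an extension header and picking it back out of a downloaded mailbox. It is an assumption-laden illustration, not RSS Bandit code: the header name X-RSSBandit-Sync, the attachment name and both function names are hypothetical, and the actual sending and POP3 download (for example with Python's smtplib and poplib modules) are left out.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical extension header; the SIAM draft does not define this name.
SYNC_HEADER = "X-RSSBandit-Sync"


def build_sync_mail(address, state_xml):
    """Build a mail to oneself carrying the sync state as an attachment."""
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address
    msg["Subject"] = "RSS Bandit synchronization state"
    msg[SYNC_HEADER] = "1"
    msg.set_content("Synchronization state attached.")
    msg.add_attachment(
        state_xml.encode("utf-8"),
        maintype="application",
        subtype="xml",
        filename="state.xml",
    )
    return msg


def find_sync_state(raw_messages):
    """Return the state from the last tagged mail, or None if there is none.

    raw_messages: raw message bytes, oldest first, as a POP3 client
    would hand them over after download.
    """
    found = None
    for raw in raw_messages:
        msg = message_from_bytes(raw, policy=policy.default)
        if msg[SYNC_HEADER] is None:
            continue  # ordinary mail, not a synchronization message
        for part in msg.iter_attachments():
            found = part.get_content().decode("utf-8")
    return found
```

The header check is what lets a throwaway mailbox hold other mail without confusing the client; only messages carrying the marker are treated as synchronization state.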

Categories: RSS Bandit

February 28, 2004

@ 06:14 PM

Comments [0]

A Look at the xml:base attribute and the .NET Framework's XmlReader

The W3C xml:base recommendation describes the xml:base attribute which, when it appears on an XML element, allows one to specify a base URI for the element and its children other than the base URI of the document or external entity. The base URI of a document or entity is the URI from which the document or entity was loaded. For example, the base URI of my RSS feed is http://www.25hoursaday.com/weblog/SyndicationService.asmx/GetRss. The following example, taken from the W3C recommendation, shows how xml:base processing works.

<?xml version="1.0"?>
<doc xml:base="http://example.org/today/"
     xmlns:xlink="http://www.w3.org/1999/xlink">
<head>
    <title>Virtual Library</title>
</head>
<body>
    <paragraph>See <link xlink:type="simple" xlink:href="new.xml">what's
      new</link>!</paragraph>
    <paragraph>Check out the hot picks of the day!</paragraph>
    <olist xml:base="/hotpicks/">
      <item>
        <link xlink:type="simple" xlink:href="pick1.xml">Hot Pick #1</link>
      </item>
      <item>
        <link xlink:type="simple" xlink:href="pick2.xml">Hot Pick #2</link>
      </item>
      <item>
        <link xlink:type="simple" xlink:href="pick3.xml">Hot Pick #3</link>
      </item>
    </olist>
</body>
</doc>

The URIs in the xlink:href attributes in this example resolve to full URIs as follows:

"what's new" resolves to the URI "http://example.org/today/new.xml"

"Hot Pick #1" resolves to the URI "http://example.org/hotpicks/pick1.xml"

"Hot Pick #2" resolves to the URI "http://example.org/hotpicks/pick2.xml"

"Hot Pick #3" resolves to the URI "http://example.org/hotpicks/pick3.xml"
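The resolution rules above can be reproduced with a short Python sketch. It is only an illustration: Python's ElementTree does not apply xml:base on its own, so the sketch walks the tree and carries the base URI down by hand. The function name, the shortened sample document and the starting document URI http://example.org/doc.xml are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree exposes namespaced attributes in {namespace}local form.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

DOC = """<doc xml:base="http://example.org/today/"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <link xlink:type="simple" xlink:href="new.xml">what's new</link>
  <olist xml:base="/hotpicks/">
    <link xlink:type="simple" xlink:href="pick1.xml">Hot Pick #1</link>
  </olist>
</doc>"""


def resolve_links(element, base_uri):
    """Yield (link text, absolute URI) pairs, honoring xml:base on ancestors."""
    # An xml:base value may itself be relative to the inherited base URI.
    base_uri = urljoin(base_uri, element.get(XML_BASE, ""))
    href = element.get(XLINK_HREF)
    if href is not None:
        text = " ".join("".join(element.itertext()).split())
        yield text, urljoin(base_uri, href)
    for child in element:
        yield from resolve_links(child, base_uri)


for text, uri in resolve_links(ET.fromstring(DOC), "http://example.org/doc.xml"):
    print(text, "->", uri)
# what's new -> http://example.org/today/new.xml
# Hot Pick #1 -> http://example.org/hotpicks/pick1.xml
```

Note how the inner xml:base of "/hotpicks/" is resolved against the outer base first, which is why the hot pick lands under http://example.org/hotpicks/ rather than under /today/.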

xml:base exists as a mechanism to mimic HTML's BASE element and bring that functionality to the XML world. This was supposed to be a companion technology to XLink which was supposed to be a generic way to describe links in XML documents. Both XLink and xml:base were expected to be used in XHTML 2.0. However the XHTML working group rejected them and instead proposed HLink which was rejected by the W3C Technical Architecture Group. A lot of this is covered in the XML.com articles Introducing HLink and TAG Rejects HLink by Kendall Clark.

Even though xml:base has been rejected by the designers of the technologies it was primarily intended to be used with, it has still made its way into the core of the XML family of technologies. Specifically, xml:base is used by the XML Infoset recommendation to define base URIs. This elevated xml:base and HTML-style base URI processing from being an application-specific construct to being a core part of XML that should be supported by XML parsers. For example, XQuery and XPath 2.0 will have the base-uri() function, which returns the base URI of a node and takes the xml:base attribute into account.

The next question is whether the .NET Framework supports the xml:base recommendation. At first glance it looks that way, since there is a BaseURI property on both the XmlNode and XmlReader classes. However, these properties report the base URI in the classic sense only (i.e. where the node was loaded from, which is either the URI of the document or the URI of the entity it was expanded from). We were planning to add support for xml:base to the core XML parser as part of implementing XInclude, but given that XInclude recently went from being a W3C candidate recommendation back to being a W3C working draft (partly due to a number of the architectural issues raised by Murata Makoto), the future of the spec is currently uncertain, so we've backed off on our implementation. In the meantime, developers can use XInclude.NET if they need XML Inclusions and its associated support for the xml:base attribute in the .NET Framework.

Categories: XML

February 28, 2004

@ 11:07 AM

Comments [6]

XmlReader and the Factory Design Pattern

Daniel Cazzulino writes in response to Don Demsak's post on Waking Up From A DOM Induced Coma:

So, in this regard, I believe SUN is doing a good job at concentrating on pluggable and standard interfaces and specifications, and letting whoever wants to take the time to implement custom stuff.
I don't want to "new XmlTextReader". I want some app/system-wide factory take care of creating the appropriate parser implementation for me based on declarative configuration, and I want my to code to work against a single unified interface/base class always.
Changing the parser shouldn't mean I have to change my working app code. If MS provides the appropriate abstractions, it wouldn't even be necessary to rely on some implementation-specific feature such as XmlTextReader.GetRemainder that is not part of the abstract contract defined by XmlReader.

I both agree and disagree with Daniel. We do have a single unified interface for processing XML which developers can program against: it is called the XmlReader. Unfortunately, we subclassed this class into the XmlTextReader and XmlValidatingReader, which are what most developers actually program against, including our devs internally. In the next version of the .NET Framework we are moving away from the XmlTextReader and XmlValidatingReader. Instead we will emphasize programming directly to the XmlReader and will provide an implementation of the factory design pattern which returns different XmlReader instances based on which features the user is interested in. More importantly, users will be able to layer different XmlReader implementations on those created by our factory, which was always our intention since v1.0 of the .NET Framework. For example, one could layer XSD validation on top of the XIncludingReader from XInclude.NET to combine third party XInclude support with Microsoft's W3C XML Schema validation technologies.
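The general shape of "a factory that hands back layered readers" can be sketched in a few lines of Python. This is an analogy only and says nothing about how the .NET API will look: the readers here are plain generators over ElementTree parse events, and the names base_reader, required_attribute_reader and create_reader, as well as the toy "required attribute" check standing in for real validation, are all hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def base_reader(xml_bytes):
    """Innermost reader: yields (event, element) pairs from the parser."""
    yield from ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))


def required_attribute_reader(inner, tag, attribute):
    """Layered reader: passes events through, rejecting <tag> without attribute."""
    for event, elem in inner:
        if event == "start" and elem.tag == tag and attribute not in elem.attrib:
            raise ValueError(f"<{tag}> is missing required attribute {attribute!r}")
        yield event, elem


def create_reader(xml_bytes, required=None):
    """Hypothetical factory: builds a reader stack from the requested features."""
    reader = base_reader(xml_bytes)
    for tag, attribute in (required or []):
        # Each feature wraps the reader built so far.
        reader = required_attribute_reader(reader, tag, attribute)
    return reader
```

Calling code only ever iterates over whatever create_reader returns, so adding or removing a layer does not change the consuming loop. That is the property the layering design is after.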

As for whether Sun's approach of just providing interfaces instead of concrete classes for XML parsing was such a great thing in Java, I'd claim that it's been hit and miss. Most XML developers from the Java world despise the DOM for the reasons described in Chapter 33 of Elliotte Rusty Harold's Effective XML. This is the reason for the existence of extensions and alternatives to the DOM API such as Oracle's XDK, dom4j, JDOM, Xerces and XOM. Heck, you can't even get the XML as a string out of a node or save an XML document object to a file without using extensions, since these aren't in the base DOM API. As for SAX, the API just gives you access to regular parsing events, nothing fancy. There isn't much functional difference between programming against the base SAX APIs and programming against XmlReader.

The one point of interest is that Daniel claims that the Java way of not shipping with any XML APIs but just interfaces is somehow better than the .NET way. In Java one can program against interfaces and load the XML parser by passing the class name to a factory method. One could put this name in a config file and change it at runtime. The question is whether anyone in the .NET world actually thinks being able to change your XML parser implementation at runtime is anything more than a geek feature. I consider it as geeky as asking why you can't change the implementation of the System.String class to a user defined class that uses less memory at runtime without having to recompile. An interesting idea, but one primarily of interest to the ultimate of power users.
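For readers who have not seen the load-the-parser-by-name style, Python's standard SAX module works in a comparable way, so it makes a convenient stand-in for a sketch. The CONFIG dictionary below is a hypothetical substitute for a config file, and ElementCounter is an invented handler; xml.sax.make_parser and the xml.sax.expatreader driver are part of the Python standard library.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical configuration value; a real application would read it from a
# config file. "xml.sax.expatreader" is the driver that ships with Python.
CONFIG = {"sax_driver": "xml.sax.expatreader"}


class ElementCounter(ContentHandler):
    """Application code written only against the handler interface."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_elements(xml_bytes, config=CONFIG):
    # make_parser tries the named driver modules first, then its defaults.
    parser = xml.sax.make_parser([config["sax_driver"]])
    handler = ElementCounter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.count
```

Swapping the parser means changing one string in the configuration, which is exactly the capability under discussion; whether many applications ever need it is a separate question.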

The funny thing is that even if we shipped functionality where we looked in the registry or in some config file before figuring out what XML parser to load, it's not as if there is an abundance of third party XML parsers targeting the .NET Framework in the first place. There is definitely no intention to ship any functionality like this in future versions of the .NET Framework.

Categories: XML

February 28, 2004

@ 05:20 AM

Comments [0]

Bloggers at Microsoft

Dylan Greene was at Microsoft last week and shares some observations about blogging and Microsoft in his post My meeting with the Scoblizer:

Some interesting things I picked up while at Microsoft:

None of my friends there blog.
None of them had heard of Scoble. (!)
None of them use RSS readers or read blogs with any frequency.
None of them seemed to understand the draw of blogging.

There are about 300 people blogging at Microsoft, which sounds like a lot until you realize that at last count Microsoft had 55,000 employees. That means less than 1% of the employees at Microsoft are blogging. When you consider that, it isn't surprising that none of his friends blog, given that fewer than one in every hundred Microsoft employees blogs, or that they didn't know some random evangelist on the Windows team by name.

That said I do agree with Cameron Reilly that Microsoft is “still way ahead of the curve in terms of corporate blogging”.

Categories: Life in the B0rg Cube

February 27, 2004

@ 11:36 PM

Comments [13]

The Gay Marriage Debate

I've been watching the online discussions about the proposed constitutional amendment to ban gay marriage with bemusement. It is such a classic sleight of hand trick. If I were a sitting president who'd been discovered to have started a war that cost thousands of lives primarily to enrich my defence contractor buddies, and had the opposition party's presidential candidates polling better than me, I'd want to come up with a way to focus the public discourse away from these issues. Perhaps with controversial proposed legislation that would be a hot button topic but most likely wouldn't get passed anyway? Yeah, probably.

It is unfortunate that such political games end up affecting people's lives and preventing the pursuit of happiness. At least it's not another phony war.

Categories: Ramblings

February 27, 2004

@ 08:20 PM

Comments [6]

RSS Bandit & ATOM

Given that about 15 news aggregators currently support Mark Nottingham's Atom Syndication Format 0.3 (PRE-DRAFT), I'll be adding support for it to RSS Bandit this weekend. This won't be a big deal to implement relative to a number of other features Torsten and I have in mind. As Brent Simmons wrote:

This experience was a reminder for me of how unimportant the underlying syndication formats are, in a way. What percent of time does an aggregator developer spend on RSS and Atom parsing code? 50%? 25%? 10%?

I figure it’s somewhere less than 1%.

The rest of the time is taken up with things like data storage, networking, and user interface. But mostly user interface. Not just implementing—which is often easy—but designing user interface, which is difficult.

In other RSS Bandit news Torsten is almost done with some code that fixes our #2 performance problem in RSS Bandit and Phil Haack has started work on official RSS Bandit documentation. Excellent work.

All of the above should show up in the next RSS Bandit release. Phil's documentation will most likely reside on the RSS Bandit Documentation Page on SourceForge and will be linked to from the RSS Bandit help menu.

Categories: RSS Bandit

February 26, 2004

@ 05:54 PM

Comments [1]

Aaron Swartz on Anti-Truths

Aaron Swartz has lots of interesting ideas about politics and copyright in the age of digital media. I disagree with a lot of his ideas on both but they are often well thought out and interesting. This month he continues his trend of interesting posts about politics with two entries: Up is Down: How Stating the False Hides the True, excerpted below

One of the more interesting Republican strategies is saying things whose opposite is true. They say that the Democratic nominee is bought off by special interests, the Democrats are outspending them, the Democrats are playing dirty, the Democrats don’t care about homeland security, the Democrats hate America, all when this is far more true of the Republicans. They say Joseph McCarthy was a noble man, the media has a liberal bias, affirmative action is bad for equality, Saddam had weapons of mass destruction, and Ronald Reagan was our greatest President, all when the opposite is far more true.

At first glance this seems bizarre — why draw attention to your weaknesses? But it’s actually a very clever use of the media. The media tries hard to be “fair and balanced”, and it generally believes the best way to do this is to present the opinions from both sides and make as few judgement calls as possible (to avoid introducing their own bias). And if there’s a debate on some issue, taking a side is seen as a judgement call.

and Down is Up: What This Stuff Is, where he writes

I got a lot of responses to my previous post, Up is Down, along the lines of “oh, the Democrats lie as much as the Republicans”. But the piece was not about lies. For lack of a better term, it was about anti-truths. Anti-truths have two parts:

They’re completely false.
They’re more accurate when directly reversed.

It’s hard to find a completely unobjectionable one, but take “Ronald Reagan was our greatest President.” As for part one, I have seen no evidence that Reagan actually did anything particularly good on purpose and as for two, “Ronald Reagan was our worst President” seems to be a far more accurate statement, since he did lots of things that were quite bad.

My example of an anti-truth would have been “John Ashcroft respects the US constitution”. :)

Categories: Mindless Link Propagation

Dare Obasanjo's weblog

"You can buy cars but you can't buy respect in the hood" - Curtis Jackson
