March 13, 2005
@ 07:48 PM

This time tomorrow I'll be at the O'Reilly Emerging Technology Conference. Checking out the conference program, I saw that Evan Williams will be hosting a session entitled Odeo -- Podcasting for Everyone. I've noticed the enthusiasm around podcasting among certain bloggers and the media but I am somewhat skeptical of the vision folks like Evan Williams have espoused in posts such as How Odeo Happened.

In thinking about podcasting, it is a good thing to remember the power law and the long tail. In his post Weblogs, Power Laws and Inequality, Clay Shirky wrote

The basic shape is simple - in any system sorted by rank, the value for the Nth position will be 1/N. For whatever is being ranked -- income, links, traffic -- the value of second place will be half that of first place, and tenth place will be one-tenth of first place. (There are other, more complex formulae that make the slope more or less extreme, but they all relate to this curve.) We've seen this shape in many systems. What've we've been lacking, until recently, is a theory to go with these observed patterns.
...
A second counter-intuitive aspect of power laws is that most elements in a power law system are below average, because the curve is so heavily weighted towards the top performers. In Figure #1, the average number of inbound links (cumulative links divided by the number of blogs) is 31. The first blog below 31 links is 142nd on the list, meaning two-thirds of the listed blogs have a below average number of inbound links. We are so used to the evenness of the bell curve, where the median position has the average value, that the idea of two-thirds of a population being below average sounds strange. (The actual median, 217th of 433, has only 15 inbound links.)
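The "most elements are below average" point is easy to check numerically. Here is a minimal sketch that assumes the pure 1/N curve from the first paragraph of the quote; the list size of 433 comes from the quote, while the top value of 1000 is an arbitrary assumption. The real inbound-link data is less extreme than a pure 1/N curve, so the exact fraction will differ from the two-thirds Shirky reports.

```python
from statistics import mean, median


def power_law_values(n, top=1000.0):
    """Value at rank r is top / r, i.e. the simple 1/N curve."""
    return [top / rank for rank in range(1, n + 1)]


# 433 matches the list size in the quote; top=1000 is made up.
values = power_law_values(433)
avg = mean(values)
below = sum(1 for v in values if v < avg)

print(f"mean={avg:.1f} median={median(values):.1f}")
print(f"{below} of {len(values)} items are below the mean")
```

The median lands far below the mean, which is the opposite of what bell-curve intuition suggests.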

The bottom line here is that a majority of weblogs will have small to minuscule readership. However, the focus of the media and the generalizations made about blogging will be on popular blogs with large readership. But the wants and needs of popular bloggers often do not mirror those of the average blogger. There is a lot of opportunity and room for error when trying to figure out where to invest in features for personal publishing tools such as weblog creation tools or RSS reading software. Clay Shirky also mentioned this in his post where he wrote

Meanwhile, the long tail of weblogs with few readers will become conversational. In a world where most bloggers get below average traffic, audience size can't be the only metric for success. LiveJournal had this figured out years ago, by assuming that people would be writing for their friends, rather than some impersonal audience. Publishing an essay and having 3 random people read it is a recipe for disappointment, but publishing an account of your Saturday night and having your 3 closest friends read it feels like a conversation, especially if they follow up with their own accounts. LiveJournal has an edge on most other blogging platforms because it can keep far better track of friend and group relationships, but the rise of general blog tools like Trackback may enable this conversational mode for most blogs.

The value of weblogging to most bloggers (i.e. the millions of people using services like LiveJournal, MSN Spaces and Blogger) is that it allows them to share their experiences with friends, family & strangers on the Web and it reduces the friction of getting content on the Web compared to managing a personal homepage, which was the state of the art in personal publishing on the Web last decade. In addition, there are the readers of weblogs to consider. RSS syndication and aggregators such as RSS Bandit & Bloglines have made it easy for people to read multiple weblogs. According to Bloglines, their average user reads just over 20 feeds.

Before going into my list of issues with podcasting, I will point out that I think the current definition of podcasting, which restricts it to subscribing to feeds of audio files, is too narrow. One could just as easily subscribe to other digital content such as video files using RSS. To me podcasting is about time shifting digital content, not just audio files.

With this setup out of the way, I can list the top three reasons I am not as enthusiastic about podcasting as folks like Evan Williams:

  1. Creating digital content and getting it on the Web isn't easy enough: The lowest friction way I've seen thus far for personal publishing of audio content on the Web is the phone posting feature of LiveJournal, but it is still a suboptimal solution. It gets worse when one considers how to create and share richer digital content such as videos. I suspect mobile phones will have a big part to play in podcast creation if it becomes mainstream. On the other hand, sharing your words with the world doesn't get much easier than using the average blogging tool.
  2. Viewing digital content is more time consuming than reading text content: I believe it takes the average person less time to read an average blog posting than to listen to an average audio podcast. This automatically reduces the size of the podcast market compared to plain old text blogging. As mentioned earlier, the average Bloglines user subscribes to 20 feeds. Over the past two years, I've gone from subscribing to about 20 feeds to subscribing to around 160. However, it would be impossible for me to find the time to listen to 20 podcast feeds a week, let alone scale up to 160.
  3. Digital content tends to be opaque and lack metadata: Another problem with podcasting is that there are no explicit or implicit metadata standards forming around syndicating digital media content. The fact that an RSS feed is structured data that provides a title, author name, blog name, a permalink and so on allows one to build rich applications for processing RSS feeds both globally like Technorati & Feedster or locally like RSS Bandit. As long as digital media content is just an opaque blob of data hanging off an item in a feed, the ecosystem of tools for processing and consuming it will remain limited.
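To make the third point concrete, here is a small sketch of what a tool can extract from a feed. The sample feed, its URLs and the describe_items helper are made up for illustration; the element names (title, author, link, enclosure) are standard RSS 2.0. For the text post a tool gets a title, an author and a permalink to work with, while for the audio post the enclosure only exposes a URL, a byte length and a MIME type.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item feed: one text post, one audio post.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>A text post</title>
      <author>someone@example.com</author>
      <link>http://example.com/posts/1</link>
      <description>Readable, indexable text.</description>
    </item>
    <item>
      <title>An audio post</title>
      <link>http://example.com/posts/2</link>
      <enclosure url="http://example.com/audio/2.mp3" length="12345" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def describe_items(feed_xml):
    """Return the structured fields an aggregator can see for each item."""
    channel = ET.fromstring(feed_xml).find("channel")
    blog = channel.findtext("title")
    rows = []
    for item in channel.findall("item"):
        enclosure = item.find("enclosure")
        rows.append({
            "blog": blog,
            "title": item.findtext("title"),
            "author": item.findtext("author"),  # None when the element is absent
            "permalink": item.findtext("link"),
            "enclosure_type": enclosure.get("type") if enclosure is not None else None,
        })
    return rows


for row in describe_items(SAMPLE_FEED):
    print(row)
```

Nothing in the feed says what is inside the MP3 itself, which is the opacity problem described above.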

This is not to say that podcasting won't go a long way in making it easier for popular publishers to syndicate media content to users. It will; however, it will not be the revolution in personal publishing that the combination of RSS and weblogging has been.

I'll need to remember to bring some of these up during Evan Williams' talk. I'm sure he'll have some interesting answers.


 

Charlene Li has a post entitled Bloghercon conference proposed where she writes

Quick – name me five woman bloggers. You probably came up with Wonkette, and if you’re reading this post, you’ve got me on your list. Can you come up with three more?

This is why Lisa Stone’s suggestion to develop Bloghercon is such a great idea. (Elisa Camahort has a follow-up post with more details here .)

It’s not that there are no women bloggers out there – it’s that we haven’t built up a network comparable to the “blog-boy’s club” that dominates the Technorati 100 . This is not to presume that there’s a conspiracy – just the reality that for a number of reasons, woman bloggers have had difficulty gaining visibility.

 

Interestingly enough, I actually counted 10 women bloggers I know off the top of my head without needing to count Charlene or knowing who this Wonkette person is. My list was Shelley Powers, Julia Lerman, Liz Lawley, Danah Boyd, Rebecca Dias, KC Lemson, Anita Rowland, Megan Anderson, Eve Maler and Lauren Wood. As I finished the list lots more came to mind; in fact, I probably could have hit ten just counting women at MSN I know who blog, but that would have been too easy.

 

I am constantly surprised by the people who read the closed circle of white-male dominated blogs commonly called the A-list who think that this somehow constitutes the entire blogosphere (I do dislike that word) or even a significant part of it.

 

I wonder when the NAACP or Jesse Jackson are going to get in on the act and hold a blaggercon conference for black bloggers. Speaking of which, it's my turn to ask "Quick – name me five black bloggers". Post your answers in the comments.


 

Categories: Ramblings

A bunch of folks at work have been prototyping a server-side RSS reader at http://www.start.com/1/. This isn't a final product but instead is intended to show people some of the ideas we at MSN are exploring around providing a rich experience around Web-based RSS/Atom aggregation.  

The Read/Write Web blog has a post entitled Microsoft's Web-based RSS Aggregator? which has a number of screenshots showing the functionality of the site. The site has been around for a few weeks and I'm pretty surprised it took this long to get widespread attention.

We definitely would love to get feedback from folks about the site. I'm personally interested in where people would like to see this sort of functionality integrated into the existing MSN family of sites and products, if at all.

PS: You may also want to check out http://www.start.com/2/ to test drive a prototype of a Web-based bookmarks manager.


 

Categories: MSN

March 8, 2005
@ 03:39 PM

A couple of days ago I was contacted about writing the foreword for the book Beginning RSS and Atom Programming by Danny Ayers and Andrew Watt. After reading a few chapters from the book I agreed to introduce the book.

When I started writing I wasn't familiar with the format of the typical foreword for a technical book. Looking through my library I ended up with two forewords that gave me some idea of how to proceed. They were Michael Rys's introduction of XQuery: The XML Query Language by Michael Brundage and Jim Miller's introduction of Essential .NET, Volume I: The Common Language Runtime by Don Box. I suspect I selected them because I've worked directly or indirectly with both authors and the folks who wrote the forewords to their books, so I felt familiar with both the subjects and the people involved.

From the various forewords I read it seemed the goal of a foreword is twofold:

  1. Explain why the subject matter is important/relevant to the reader
  2. Explain why the author(s) should be considered an authority in this subject area

I believe I achieved both these goals with the foreword I wrote for the book. The book is definitely a good attempt to distill the important things a programmer should consider when deciding to work with XML syndication formats.

Even though I have written an academic paper, magazine articles and conference presentations, this was a new experience. I keep getting closer and closer to the process of writing a book. Too bad I never will though.


 

Categories: Ramblings

Shelley Powers has written an amusing post about the Google AutoLink saga entitled Guys Dont Link which, like all good satire, is funny because it is true. Usually I'd provide an excerpt of the linked post but this post has to be read in its entirety to get the full effect.

Enjoy.


 

The Wolverine release of RSS Bandit has entered its final stretch; the bug count is under 15 from a high of over 50, the codebase is frozen except for critical fixes, and translations have started to trickle in. It is looking like the final version number for Wolverine will be v1.3.0.25 but don't quote me on that just yet.

Torsten and I have started talking about what we'd like to see in the following release, currently codenamed Nightcrawler. Over the next few weeks I'll be sharing some of our thoughts on where we'd like to see RSS Bandit go and eliciting feedback from our users. The first topic I have in mind is building a richer extensibility model. Torsten and Phil have discussed this issue in their blogs in the posts Fighting Ads and Building a Better Extensibility Model For RSS Bandit respectively. As Phil wrote

Currently, the only plug-in model supported by RSS Bandit is the IBlogExtension interface. This is a very limited interface that allows developers to write a plug-in that adds a menu option to allow the user to manipulate a single feed item.

The ability to interact with the application from such a plug-in is very limited as the interface doesn't define an interface to the application other than a handle. (For info on how to write an IBlogExtension plug-in, see this article.)

Despite the limitations of IBlogExtension, it has led to some interesting plugins such as my plugin for posting links to del.icio.us from RSS Bandit. This was actually a feature request which I fulfilled without the user having to wait for the next version of RSS Bandit. I'd like to be able to fulfill more complex feature requests the same way. Given the small number of IBlogExtension plugins I've seen come from our user base, I suspect a richer plugin model would end up being used mostly by Phil, Torsten and myself. However, I'd still like to get some feedback from our users about where they'd like to see more extensibility. Below are some of the extensibility points we've discussed along with usage scenarios and possible risks. I'd like to know which ones our users are interested in and would consider writing plugins for.

Feed Preprocessing

Plugins will be able to process RSS items just after they have been downloaded but before they are stored in RSS Bandit's in-memory and on-disk caches. This is basically what Torsten describes in his post Fighting Ads.

Use Case/Scenario: A user can write a plugin that assigns scores to news items according to the user's interests (e.g. a Bayesian filter) then annotates each news item with its score. For example, for me posts containing 'XML' or 'MSN Spaces' would be assigned a score of 5 while every other post could be assigned a score of 3. Then I could create a newspaper style that either grouped posts by their ranking or even filtered out posts that didn't have a certain score.

Another potential use case is pre-processing each news item to filter out ads as Torsten did in his example.  
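The scoring scenario above could look roughly like the sketch below. It is written in Python purely for illustration (RSS Bandit itself is a .NET application), and every name in it (NewsItem, KeywordScorer, preprocess) is hypothetical rather than part of any existing RSS Bandit interface. A simple keyword match stands in for a real Bayesian filter.

```python
from dataclasses import dataclass, field


@dataclass
class NewsItem:
    title: str
    content: str
    annotations: dict = field(default_factory=dict)


class KeywordScorer:
    """Hypothetical preprocessing plugin that annotates items with a score."""

    def __init__(self, keywords, hit_score=5, default_score=3):
        self.keywords = [k.lower() for k in keywords]
        self.hit_score = hit_score
        self.default_score = default_score

    def process(self, items):
        for item in items:
            text = f"{item.title} {item.content}".lower()
            hit = any(k in text for k in self.keywords)
            item.annotations["score"] = self.hit_score if hit else self.default_score
        return items


def preprocess(items, plugins):
    """Run each plugin in order after download and before caching."""
    for plugin in plugins:
        items = plugin.process(items)
    return items


downloaded = [
    NewsItem("XML pipelines", "Notes on streaming XML."),
    NewsItem("Weekend trip", "Photos from the coast."),
]
scored = preprocess(downloaded, [KeywordScorer(["XML", "MSN Spaces"])])
# What a newspaper style might do with the annotation: keep high scores only.
interesting = [item.title for item in scored if item.annotations["score"] >= 5]
print(interesting)
```

An ad filter would fit the same shape: its process method would return a shorter list or rewritten items instead of adding an annotation.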

Risks: My main concern with this approach is that badly written plugins could cause problems with the normal functioning of RSS Bandit. For example, if a plugin got stuck in an infinite loop it could hang the entire application since we'd never get back news items from the pre-processing step. Given that this is an instance of the halting problem I know we can't solve it in a general way, so I may just have to accept the risks.
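One partial mitigation is to bound how long the application waits for a plugin instead of trying to decide whether it will ever finish. The sketch below is an assumption about how that could work, not anything RSS Bandit does: it is in Python for illustration, and run_with_timeout and stuck_plugin are hypothetical names. The plugin runs on a daemon thread and the caller falls back to the unprocessed items if it does not return in time.

```python
import threading
import time


def run_with_timeout(plugin_fn, items, timeout_seconds):
    """Run plugin_fn on a copy of items; return (items, ok).

    ok is False when the plugin timed out or raised, in which case the
    original, unprocessed items are returned.
    """
    result = {}

    def worker():
        try:
            result["items"] = plugin_fn(list(items))
        except Exception:
            pass  # a crashed plugin is treated like a timed-out one

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_seconds)  # wait at most timeout_seconds
    if "items" in result:
        return result["items"], True
    return items, False


def stuck_plugin(items):
    time.sleep(0.5)  # stands in for a plugin that never returns
    return items


items, ok = run_with_timeout(stuck_plugin, ["post 1", "post 2"], timeout_seconds=0.05)
print(ok, items)
```

This only keeps the application responsive. The runaway thread is abandoned, not stopped, so it can keep consuming CPU until the process exits; stronger isolation would need a separate process.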

Pluggable protocols

Every once in a while, users ask for RSS Bandit to support other data formats and protocols than just RSS & Atom over HTTP. For example, I'd like us to support USENET newsgroups while I've seen a couple of requests that we should be able to support subscribing to POP3 mail boxes.

Ideally here we'd have a plugin infrastructure that allowed one to plug in both the parser and the protocol handler for a given format. The plugin would also specify the URI scheme used by the newly supported format so that RSS Bandit would know to dispatch requests using that plugin.

Use Case/Scenario: In the case of USENET support the user would provide a plugin that knew how to parse messages in the RFC 822 format and how to fetch messages using NNTP. The USENET plugin would also register itself as the handler for the nntp and news URI schemes. The user could then subscribe to newsgroups by specifying a URL such as news://news.microsoft.com/microsoft.public.xml in the new feed dialog.  
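The dispatch-by-URI-scheme idea can be sketched as a small registry. This is an illustration in Python under stated assumptions, not RSS Bandit code: ProtocolRegistry and StubUsenetHandler are hypothetical names, and the handler is a stub that only parses the URL where a real plugin would talk NNTP and parse message headers.

```python
from urllib.parse import urlsplit


class ProtocolRegistry:
    """Maps URI schemes to the plugin that handles them."""

    def __init__(self):
        self._handlers = {}

    def register(self, schemes, handler):
        for scheme in schemes:
            self._handlers[scheme.lower()] = handler

    def fetch(self, url):
        scheme = urlsplit(url).scheme.lower()
        handler = self._handlers.get(scheme)
        if handler is None:
            raise ValueError(f"no plugin registered for scheme {scheme!r}")
        return handler.fetch(url)


class StubUsenetHandler:
    """Stand-in plugin: works out server and group, fetches nothing."""

    def fetch(self, url):
        parts = urlsplit(url)
        return {
            "server": parts.netloc,
            "group": parts.path.lstrip("/"),
            "messages": [],  # a real plugin would fill this in over NNTP
        }


registry = ProtocolRegistry()
registry.register(["nntp", "news"], StubUsenetHandler())
feed = registry.fetch("news://news.microsoft.com/microsoft.public.xml")
print(feed["server"], feed["group"])
```

The new feed dialog would only need to hand the URL to the registry; which plugin runs is decided entirely by the scheme.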

Risks: Same as with Feed Preprocessing.

Hosted Winforms Applications

A user could add a .NET Winforms application to RSS Bandit. This application would appear as a tab within the main window and all its functionality could be used from RSS Bandit. There would also be some hooks for the application to register itself within the RSS Bandit main menu as well as mechanisms to pass information back and forth between the hosted application and RSS Bandit.

Use Case/Scenario: One would be able to host blog posting clients such as IMHO in RSS Bandit. The blog client would be distributed and updated independently of RSS Bandit.

Risks: This would be a great deal of work for questionable payoff.

Pluggable Storage

RSS Bandit currently caches feeds as RSS files on disk, specifically at the location C:\Documents and Settings\[username]\Application Data\RssBandit\Cache. It should be possible to specify other data storage sources and formats such as a relational database or Atom feeds.

Use Case/Scenario: A user can write a plugin that stores all RSS Bandit's feeds in a local Access database so that data mining can be done on the data.

Another use case is writing them to disk but in a different format than RSS. For example, one could write them to disk using the format used by another application so that the user could use both applications but have them share a single feed cache.
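A pluggable storage layer boils down to a small interface that the application codes against, with interchangeable backends behind it. The sketch below is hypothetical (FeedStore and SqliteFeedStore are invented names) and swaps in SQLite where the scenario above mentions an Access database, simply because SQLite ships with Python and makes the example self-contained.

```python
import sqlite3
from abc import ABC, abstractmethod


class FeedStore(ABC):
    """Hypothetical storage interface the application would code against."""

    @abstractmethod
    def save(self, feed_url, items): ...

    @abstractmethod
    def load(self, feed_url): ...


class SqliteFeedStore(FeedStore):
    """Relational backend; SQLite stands in for Access here."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            "feed_url TEXT, permalink TEXT, title TEXT, "
            "PRIMARY KEY (feed_url, permalink))"
        )

    def save(self, feed_url, items):
        # INSERT OR REPLACE keeps one row per permalink when a feed is re-fetched.
        self.db.executemany(
            "INSERT OR REPLACE INTO items VALUES (?, ?, ?)",
            [(feed_url, i["permalink"], i["title"]) for i in items],
        )
        self.db.commit()

    def load(self, feed_url):
        rows = self.db.execute(
            "SELECT permalink, title FROM items WHERE feed_url = ? ORDER BY permalink",
            (feed_url,),
        )
        return [{"permalink": p, "title": t} for p, t in rows]


store = SqliteFeedStore()
store.save("http://example.com/rss", [
    {"permalink": "http://example.com/posts/1", "title": "First"},
    {"permalink": "http://example.com/posts/2", "title": "Second"},
])
print(store.load("http://example.com/rss"))
```

Once the items are in tables, the data mining mentioned above is just SQL against the same database, and a file-based backend in another application's format would implement the same two methods.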

Risks: The same as with Feed Preprocessing.


 

Categories: RSS Bandit

March 5, 2005
@ 04:49 PM

Yesterday, in a meeting to hash out some of the details of the MSN Spaces API, an interesting question came up. So far I've been focused on the technical details of the API (what methods should we have? what protocol should it use? etc) as well as the scheduling impact but completely overlooked a key aspect of building a developer platform. I hadn't really started thinking about how we planned to support developers using our API. Will we have a website? A mailing list? Or a newsgroup? How will people file bugs? Do we expect them to navigate to http://spaces.msn.com and use the feedback form?

Besides supporting developers, we will need a site to spread awareness of the existence of the API. After noticing the difference in the media response to the ability to get search results as RSS feeds from MSN Search and the announcement of the Yahoo! Search Developer Network, it is clear to me that simply having great functionality and blogging about it isn't enough. To me, getting MSN Search results as RSS feeds has at least two advantages over Yahoo's approach. The first is that developers don't have to register with MSN, whereas Yahoo! requires them to register to get application IDs. The second is that since the search results are an RSS feed, they not only can be consumed programmatically but can be consumed by regular users with off the shelf RSS readers. However, I saw more buzz about YSDN than about the MSN Search feeds from various corners. I suspect that the lack of "oomph" in our announcement is the cause.

Anyway, getting back to how we should support developers who want to use the MSN Spaces APIs, I'd be very interested to hear from developers of blogging tools as to what they'd like to see us do here.

Update: Jeroen van den Bos reminds me that MSN Search RSS feeds are only licensed for personal use. I need to ping the MSN Search folks about that.

Update: Mike Torres points out that both Yahoo (see YSDN FAQ) and Google (see Google API FAQ) have similar restrictions in their terms of use. It would be good to see MSN leading the way here. We've already gone one step better by not requiring developers to register and get application IDs. We should be able to loosen the terms of use as well.


 

Categories: MSN

In his post Maybe a better posting api is needed  James Robertson writes

I've had harsh words to say about Atom in the past, but that was mostly over the feed format. I haven't looked at the posting API yet - maybe I should. The Blogger API and the MetaqWebLog API are simply nightmares. There doesn't seem to be any standard way for client tools to interact with a server - I was debugging the interaction between a client and my server last night via IRC. Even better - the client was set to use the MetaWebLog api, but was sending requests to blogger.apiNameHere names. Sheesh. There was also an interesting difference in api points - I had implemented 'getUserBlogs', and the client was sending 'getUsersBlogs'. A quick Google search turned up references to both. Sigh.

I implemented both names, pointing to the same method. I had to map blogger names over to MetaWebLog entry points, at least for the tool being tested last night - who knows what oddness will turn up next. What a complete mess...
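The workaround Robertson describes, two spellings pointing at the same method, amounts to an alias table in front of the method dispatcher. Here is a minimal sketch in Python; BlogService, build_dispatch_table and dispatch are hypothetical names, the returned blog data is made up, and treating getUsersBlogs as the primary spelling is my assumption.

```python
class BlogService:
    """Hypothetical server-side implementation behind the posting API."""

    def get_users_blogs(self, app_key, username, password):
        # Made-up data; a real server would look up the user's blogs.
        return [{"blogid": "1", "blogName": "Example", "url": "http://example.com/"}]


def build_dispatch_table(service):
    """Map wire-level method names to handlers, including known misspellings."""
    table = {"blogger.getUsersBlogs": service.get_users_blogs}
    aliases = {
        # Variant some clients send, per the quote above.
        "blogger.getUserBlogs": "blogger.getUsersBlogs",
    }
    for alias, canonical in aliases.items():
        table[alias] = table[canonical]
    return table


def dispatch(table, method_name, *params):
    try:
        handler = table[method_name]
    except KeyError:
        raise LookupError(f"unknown method {method_name!r}") from None
    return handler(*params)


table = build_dispatch_table(BlogService())
print(dispatch(table, "blogger.getUserBlogs", "appkey", "user", "secret"))
```

Keeping the aliases in one table at least documents the oddities in a single place as new client quirks turn up.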

I've been similarly stunned by the complete and utter mess the state of weblogging APIs is in. As I mentioned in my post What Blog Posting APIs are supported by MSN Spaces? one of my duties at work is to investigate the options and design the blogging API story for MSN Spaces. In doing this, I have discovered all the issues James Robertson brought up and more. Mark Pilgrim has an ApacheCon presentation entitled The Atom API which highlights some of the various issues. One of the lowlights from his presentation is the fact that the MetaWeblog API spec significantly contradicts itself by stating that the data model of structs passed between client and server is based on RSS 2.0 then includes examples of requests and responses that show that it clearly isn't.

My personal favorite bit of information that can only be discovered by trial and error is the existence of the blogger.deletePost method which isn't listed in the Blogger API documentation but is supported by a number of blog posting clients and weblog servers.

I can't believe that anyone who wants to write a client or server that uses the standard weblogging APIs has to go through this crap. It almost makes me want to go join in the atom-protocol discussions. Almost.


 

March 2, 2005
@ 12:40 PM

In the post Another totally obvious factoid Dave Winer writes

Okay, we don't know for a fact that Google is working on an operating system, but the tea leaves are pretty damned clear. Why else would they have hired Microsoft's operating system architect, Mark Lucovsky? Surely not to write a spreadsheet or word processor.

Considering that after working on operating systems Mark Lucovsky went on to become the central mind behind Hailstorm, it isn't clear cut that Google is interested in him for his operating systems knowledge. It will be interesting to see if, after pulling off what Microsoft couldn't by getting the public to accept AutoLink when they rejected SmartTags, Google will also get the public to accept their version of Hailstorm.

I can already hear the arguments about how Google's Hailstorm would be just like a beloved butler whose job it was to keep an eye on your wallet, rolodex and schedule so you don't have to worry about them. Positive perception in the market place is a wonderful thing.


 

Categories: Technology

In the article entitled Wonder Why MSN Didn't Think of This?  Mary Jo Foley writes

Or, maybe it has but just has yet to announce it…. On Monday, AOL announced a beta of AIM Sync, a tool that effectively turns Microsoft's Outlook e-mail client into a massive AOL Instant Messaging buddy list.

The implication here is that similar integration of instant messaging presence information does not exist between Outlook and MSN Messenger. This is actually incorrect. This feature exists today in Outlook 2003 and is called the Person Names Smart Tag. Below is a screenshot of my email inbox showing the feature in action.

[Screenshot: email inbox showing Mike Champion's online status]

In fact, this feature is a cause of a common annoyance among users of MSN Messenger and Outlook. Many people have complained that they can't close MSN Messenger if Outlook is running. This is the feature responsible for that behavior. Disabling it removes the dependency between the two applications.

It's good to see yet another of our competitors learning from our innovations.


 

Categories: MSN