August 15, 2005
@ 06:20 PM

It seems there is an open call for participation for the 2006 edition of the O'Reilly Emerging Technology Conference (ETech). Although I'm not in the right demographic to be an ETech speaker since I don't work at a sexy Silicon Valley startup, I'm not a VP at a major software company and I don't consider myself a Friend of O'Reilly, I plan to toss my hat in the ring and send in two talk proposals anyway.

  1. What's Next for RSS, Atom and Content Aggregation: Currently the primary usage of content syndication technology like RSS has been consuming news and blog postings in desktop or web-based RSS readers. However, the opportunities created by syndication technologies go much further than enabling us to keep up with Slashdot and Boing Boing in our RSS reader of choice. Podcasting is one manifestation of the new opportunities that arise once the concept of content syndication and aggregation is applied to domains outside of news sites and blogs. This talk will focus on problem areas and scenarios outside of blogs, news sites and traditional RSS readers that can benefit from the application of syndication technologies like RSS and Atom.

  2. Bringing MSN into Web 2.0: The essence of Web 2.0 is moving from a Web consisting of Web pages and Web sites (Web 1.0) to a Web consisting of Web applications based on open data that are built on Web platforms. MSN hosts a number of properties, from social software applications like MSN Spaces, Hotmail, MSN Groups and MSN Messenger, which are used by hundreds of millions of people to communicate, to software that enables people to find information they need, such as MSN Search and MSN Virtual Earth. A number of these web sites are making the transition to becoming web platforms; MSN Virtual Earth has a JavaScript API, MSN Search exposes search results as RSS feeds and MSN Spaces will support the MetaWeblog API, which uses XML-RPC. This talk will focus on the current and future API offerings coming from MSN and give a behind-the-scenes look at how some of these APIs came about, from conception and getting sign-off from the bean counters to technical details on building web services.

These are my first drafts of both proposals; criticisms are welcome. If they don't get accepted, I'll survive. Now that I've actually written the abstracts I can just keep submitting them to various conferences I'm interested in until they get accepted somewhere. In my experience, it usually takes about 2 or 3 submissions to get a talk accepted anyway.


 

Categories: Web Development

August 15, 2005
@ 03:07 AM

The MSN Mobile team dropped two excellent betas last week. The first was http://mobile.spaces.msn.com/ which is mentioned in Mike Torres's post on the Mobile Spaces (Beta) where we learn

you can:

  1. Create a space from a mobile device.  Pocket PCs, Palms, and most popular mobile phones are supported.  Just browse over to http://mobile.spaces.msn.com from your mobile device (or go to http://spaces.msn.com and you will be redirected to the mobile version)
  2. See a list of your contacts' recently updated spaces.  This feature is really useful for a mobile device and great for catching up with people!  Just "click" on a contact to get to their space and start exploring.
  3. Add blog entries, view your archives, email a link to your space, and even change your settings - all from your itty bitty mobile device.
  4. Read and add new comments (my favorite feature!)  You are now able to stay on top of discussions from wherever you happen to be - in school, on a bus, in a meeting, or in line at Starbucks.

The second beta is http://mobile.msn.com/search/ which brings local search to your mobile device. This is mentioned in the blog post Get Local Search with Maps and Directions on your phone!  from the MSN Search blog where we learn

So what does it do? You can search for a restaurant, store, school, dentist, museum – basically, anything listed in the Yellow Pages and White Pages. Just enter your search term (i.e. "coffee" or "Victrola" ) and location (zip code, city/state or full street address) and hit the Search button. Your recently used locations are even stored and easily accessible the next time you use the service. We’ll return the first handful of results, including name, address, distance from your current location and phone number – which you can dial by clicking!  Select the result name and you’ll see a page with more detail, including a color map. Select "get directions" and we’ll provide turn-by-turn driving directions between your starting location and result address (both editable). All of these features have been specially designed to work on your phone, requiring minimal interaction and optimized for speed and ease of use.

The MSN Mobile crew is definitely shipping some good stuff. Props go out to Michael Smuga and the rest of the gang.


 

Categories: MSN

Today I was working on completing the support for Atom 1.0 in the next version of RSS Bandit and decided to make the changes for parsing out enclosure/podcast elements while I was in that part of the code. RSS 2.0 is pretty straightforward: there is an <enclosure> element that is a child of the <item> element.
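As a rough illustration of the RSS 2.0 case, here is a minimal Python sketch using the standard library's ElementTree. It is not RSS Bandit's code (which is written in C#); the sample feed and the function name rss_enclosures are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://www.example.org/episode1.mp3"
                 length="1234" type="audio/mpeg" />
    </item>
    <item>
      <title>Text-only post</title>
    </item>
  </channel>
</rss>"""


def rss_enclosures(rss_xml):
    """Return (item title, url, MIME type, length) for every <enclosure>."""
    root = ET.fromstring(rss_xml)
    found = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        # <enclosure> is a direct child of <item> in RSS 2.0.
        for enc in item.findall("enclosure"):
            found.append((title, enc.get("url"), enc.get("type"), enc.get("length")))
    return found


for row in rss_enclosures(SAMPLE_RSS):
    print(row)
```

Items without an <enclosure> child simply contribute nothing to the result, so the text-only post above is skipped.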

On the other hand, the Atom 1.0 specification has two completely different mechanisms for creating podcasts. Both mechanisms are described in the article by James Snell entitled An overview of the Atom 1.0 Syndication Format. From the article

Support for enclosures

Listing 4. Atom 1.0 podcasting example

<feed xmlns="http://www.w3.org/2005/Atom">
  <id>http://www.example.org/myfeed</id>
  <title>My Podcast Feed</title>
  <updated>2005-07-15T12:00:00Z</updated>
  <author>
    <name>James M Snell</name>
  </author>
  <link href="http://example.org" />
  <link rel="self" href="http://example.org/myfeed" />
  <entry>
    <id>http://www.example.org/entries/1</id>
    <title>Atom 1.0</title>
    <updated>2005-07-15T12:00:00Z</updated>
    <link href="http://www.example.org/entries/1" />
    <summary>An overview of Atom 1.0</summary>
    <link rel="enclosure" 
          type="audio/mpeg"
          title="MP3"
          href="http://www.example.org/myaudiofile.mp3"
          length="1234" />
    <link rel="enclosure"
          type="application/x-bittorrent"
          title="BitTorrent"
          href="http://www.example.org/myaudiofile.torrent"
          length="1234" />
    <content type="xhtml">
      <div xmlns="http://www.w3.org/1999/xhtml">
        <h1>Show Notes</h1>
        <ul>
          <li>00:01:00 -- Introduction</li>
          <li>00:15:00 -- Talking about Atom 1.0</li>
          <li>00:30:00 -- Wrapping up</li>
        </ul>
      </div>
    </content>
  </entry>
</feed>

Atom enclosures allow you to do more than just distribute audio content. Enclosure links can reference any type of resource. Listing 5, for instance, uses multiple enclosures within a single entry to reference translated versions of a single PDF document that's accessible through FTP. The hreflang attribute identifies the language that each PDF document has been translated into.

Content-by-reference

In addition to support for links and enclosures, Atom introduces the ability to reference entry content by URI. Listing 6, for instance, illustrates how an Atom feed for a photo weblog might appear. The content element references each individual photograph in the blog. The summary element provides a caption for the image.


Listing 6. A simple list of images using Atom 1.0

<feed xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://www.example.org/">
  <id>http://www.example.org/pictures</id>
  <title>My Picture Gallery</title>
  <updated>2005-07-15T12:00:00Z</updated>
  <author>
    <name>James M Snell</name>
  </author>
  <entry>
     <id>http://www.example.org/entries/1</id>
     <title>Trip to San Francisco</title>
     <link href="/entries/1" />
     <updated>2005-07-15T12:00:00Z</updated>
     <summary>A picture of my hotel room in San Francisco</summary>
     <content type="image/png" src="/mypng1.png" />
  </entry>
  <entry>
    <id>http://www.example.org/entries/2</id>
    <title>My new car</title>
    <link href="/entries/2" />
    <updated>2005-07-15T12:00:00Z</updated>
    <summary>A picture of my new car</summary>
    <content type="image/png" src="/mypng2.png" />
  </entry>
</feed>

This content-by-reference mechanism provides a very flexible means of expanding the types of content that one can syndicate through Atom.

After looking at this from all angles for about 30 minutes, the only conclusion I can come to is that Atom provides two completely different mechanisms for achieving the same goal. This is a potential gotcha for aggregator authors, who might end up supporting one or the other of the mechanisms instead of both.
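To make the gotcha concrete, here is a minimal Python sketch of an aggregator routine that checks both places where an Atom 1.0 entry can point at attached media: link elements with rel="enclosure" and a content element with a src attribute. This is an illustration under assumptions, not RSS Bandit's implementation; the sample feed is trimmed from the two listings above, the function name atom_attachments is invented, and the sketch does not resolve relative URLs against xml:base.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 namespace in ElementTree form

# Hypothetical sample combining one entry of each style.
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://www.example.org/entries/1</id>
    <title>Atom 1.0</title>
    <link href="http://www.example.org/entries/1" />
    <link rel="enclosure" type="audio/mpeg"
          href="http://www.example.org/myaudiofile.mp3" length="1234" />
  </entry>
  <entry>
    <id>http://www.example.org/entries/2</id>
    <title>Trip to San Francisco</title>
    <content type="image/png" src="http://www.example.org/mypng1.png" />
  </entry>
</feed>"""


def atom_attachments(atom_xml):
    """Return (entry title, mechanism, url, MIME type) for both mechanisms."""
    root = ET.fromstring(atom_xml)
    found = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="")
        # Mechanism 1: <link rel="enclosure" href="..."/>
        for link in entry.findall(ATOM + "link"):
            if link.get("rel") == "enclosure":
                found.append((title, "enclosure", link.get("href"), link.get("type")))
        # Mechanism 2: <content src="..."/> (content-by-reference)
        content = entry.find(ATOM + "content")
        if content is not None and content.get("src"):
            found.append((title, "content-src", content.get("src"), content.get("type")))
    return found


for row in atom_attachments(SAMPLE_ATOM):
    print(row)
```

An aggregator that only implemented the first loop would silently miss the photo entry, and one that only looked at content src would miss the podcast.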

After this, I still have to add some code to also support Yahoo! Media RSS and then track down some feeds that actually use all the various enclosure techniques so I can test my code with actual real world scenarios. I'd appreciate any pointers to test feeds especially for the Yahoo! Media extensions to RSS [which I'm considering not supporting if there aren't that many feeds that use it].

No rest for the wicked. ;)


 

In recent weeks there have been a number of blog postings critical of the Technorati Top 100 List of popular web logs. The criticisms have primarily been of two flavors: some posts have been critical of the idea of blogging as a popularity contest, which such lists encourage, and others have criticized the actual mechanism of calculating popularity used by Technorati. I agree with both criticisms, especially the former. There have been a number of excellent posts arguing both points which I think are worth sharing.

Mary Hodder, in her post Link Love Lost or How Social Gestures within Topic Groups are More Interesting Than Link, argues that more metrics besides link count should be used for calculating popularity and influence. Some of the additional metrics she suggests include comment counts and number of subscribers to the site's RSS feed. She also suggests creating topic specific lists instead of one über list for the entire blogosphere. It seems a primary motivation for encouraging this approach is to increase the pool of bloggers that are targeted by PR agencies and the like. Specifically Mary writes

However, I'm beginning to see many reports prepared by PR people, communications consultants etc. that make assessments of 'influential bloggers' for particular clients. These reports 'score' bloggers by some random number based on something: maybe inbound links or the number of bloglines subscribers or some such single figure called out next to each blog's name.

Shelley Powers has a different perspective in her post Technology is neither good nor evil. In arguing against the popularity contests inherent in creating competing A-lists or even just B-lists to complement the A-lists she writes 

Even if we tried to analyze a person's links to another, we can't derive from this anything other than person A has linked to person B several times. If we use these to define a community to which we belong, and then seek to rank ourselves within these communities, all we've done is create a bunch of little Technorati 100s and communities that are going to form barriers to entry. We see this communal behavior all too often: a small group of people who know each other link to each other frequently and to outsiders infrequently; basically shutting down the discussion outside of the community.
...
I think Mary should stop with "I hate rankism". I understand the motivations behind this work, but ultimately, whatever algorithm is derived will eventually end up replicating the existing patterns of authority rather than replacing them. This pattern repeated itself within the links to Jay Rosen's post; it repeated itself within the speaker list that Mary started for women ("where are the women speakers"), but had its first man within a few hours, and whose purpose was redefined within a day to include both men and women.

Rankings are based on competition. Those who seek to compete will always dominate within a ranking, no matter how carefully we try to 'route' around their own particular form of 'damage'. What we need to challenge is the pattern, not the tools, or the tool results. 

I agree with Shelley that attempts to right the so called "imbalance" created by lists such as the Technorati Top 100 will encourage competition and stratification within certain blogging circles. I also agree that despite whatever algorithms are used, a lot of the same names will still end up on the lists for a variety of reasons. A major one being that a number of the so-called A-list blogs actually work very hard to be "popular" and changing the metrics by which their popularity is judged won't change this fact.

So Shelley has given us some of the social arguments why popularity lists such as the Technorati Top 100 aren't a good idea. But are the technical flaws in Technorati's approach to calculating weblog popularity so bad? Yes, they are.

Danah Boyd has a post entitled The biases of links where she did some research to show exactly why simply counting links on web pages isn't an accurate way to calculate popularity or influence. There are a lot of excellent points in Danah's post and the entire post is worth reading multiple times. Below are some key excerpts from Danah's post

I decided to do the same for non-group blogs in the Technorati Top 100. I hadn't looked at the Top 100 in a while and was floored to realize that most of those blogs are group blogs and/or professional blogs (with "editors" and clear financial backing). Most are covered in advertisements and other things meant to make them money. It's very clear that their creators have worked hard to reach many eyes (for fame, power or money?).
...
Blogrolls:

  • All MSNSpaces users have a list of "Updated Spaces" that looks like a blogroll. It's not. It's a random list of 10 blogs on MSNSpaces that have been recently updated. As a result, without special code (like in Technorati), search engines get to see MSNSpace bloggers as connecting to lots of other blogs. This would create the impression of high network density between MSNSpaces which is inaccurate.
  • Few LiveJournals have a blogroll but almost all have a list of friends one click away. This is not considered by search tools that look only at the front page.
    ...
  • Blogrolls seem to be very common on politically-oriented blogs and always connect to blogs with similar political views (or to mainstream media).
  • Blogrolls by group blogging companies (like Weblogs, Inc.) always link to other blogs in the domain, using collective link power to help all.
    ...
  • Male bloggers who write about technology (particularly social software) seem to be the most likely to keep blogrolls. Their blogrolls tend to be dominantly male, even when few of the blogs they link to are about technology. I haven't found one with >25% female bloggers (and most seem to be closer to 10%).
  • On LJ (even though it doesn't count) and Xanga, there's a gender division in blogrolls whereby female bloggers have mostly female "friends" and vice versa.
  • I was also fascinated that most of the mommy bloggers that i met at Blogher link to Dooce (in Top 100) but Dooce links to no one. This seems to be true of a lot of topical sites - there's a consensus on who is in the "top" and everyone links to them but they link to no one.
    ...

Linking patterns:

  • The Top 100 tend to link to mainstream media, companies or websites (like Wikipedia, IMDB) more than to other blogs (Boing Boing is an exception).
  • Blogs on blogging services rarely link to blogs in the posts (even when they are talking about other friends who are in their blogroll or friends' list). It looks like there's a gender split in tool use; Mena said that LJ is like 75% female, while Typepad and Moveable Type have far fewer women.
  • Bloggers often talk about other people without linking to their blog (as though the audience would know the blog based on the person). For example, a blogger might talk about Halley Suitt's presence or comments at Blogher but never link to her. This is much rarer in the Top 100 who tend to link to people when they reference them.
  • Content type is correlated with link structure (personal blogs contain few links, politics blogs contain lots of links). There's a gender split in content type.
  • When bloggers link to another blog, it is more likely to be same gender.

I began this investigation curious about gender differences. There are a few things that we know in social networks. First, our social networks are frequently split by gender (from childhood on). Second, men tend to have large numbers of weak ties and women tend to have fewer, but stronger ties. This means that in traditional social networks, men tend to know far more people but not nearly as intimately as those women know. (This is a huge advantage for men in professional spheres but tends to wreak havoc when social support becomes more necessary and is often attributed to depression later in life.)

While blog linking tends to be gender-dependent, the number of links seems to be primarily correlated with content type and service. Of course, since content type and service are correlated by gender, gender is likely a secondary effect.
...
These services are definitely measuring something but what they're measuring is what their algorithms are designed to do, not necessarily influence or prestige or anything else. They're very effectively measuring the available link structure. The difficulty is that there is nothing consistent whatsoever with that link structure. There are disparate norms, varied uses of links and linking artifacts controlled by external sources (like the hosting company). There is power in defining the norms, but one should question whether or companies or collectives should define them. By squishing everyone into the same rule set so that something can be measured, the people behind an algorithm are exerting authority and power, not of the collective, but of their biased view of what should be. This is inherently why there's nothing neutral about an algorithm.

There is a lot of good stuff in the excerpts above and it would take an entire post or maybe a full article to go over all the gems in Danah's entry. One random but interesting point is that LiveJournal bloggers are penalized by systems such as the Technorati Top 100. For example, Jamie Zawinski has over 1900 people who link to him from their Friend's page in LiveJournal but he somehow doesn't make the cut for the Technorati Top 100. Maybe the fact that most of his popularity is within the LiveJournal community makes his "authority" less valid than others with fewer incoming links that are in the Technorati Top 100 list.

Yeah, right.


 

August 13, 2005
@ 05:11 AM

Robert Scoble has a blog post entitled Filtering Out MSN's Filter which seems like a good enough opportunity to state what I think of the newest addition to MSN's family of offerings. Robert wrote

MSN Filter sure is getting some people upset (hi Ross Mayfield).

Personally I wanted to give MSN Filter a few weeks before giving my opinion, but Ross goaded me into it.

Boring. Boring. Boring.

First, what is it? MSN hired five people to do a blog each. There's one on sports. Another on tech. Music. TV. Lifestyle.

I have to agree with Robert; I think the MSN Filter sites are pretty boring. More importantly, as a MSFT shareholder and someone that works at MSN, I think it is a bad business investment in its current incarnation. Precedents for professional blogging such as the Gawker Media (e.g. Gizmodo) and Weblogs Inc. (e.g. Engadget) families of sites are supported by topic specific ads including some from Google's AdSense program. On the other hand, if you look at MSN's Technology Filter you don't see any such ads.

I think it is pretty cool that MSN is allowing folks to experiment with ventures like MSN Filter. However, my personal opinion is that in its current incarnation it's a lame knock off of the stuff coming out of folks like Nick Denton and Jason Calacanis and it doesn't have a chance of making much [if any] money for us since they are eschewing targeted ads.

Lame. Lame. Lame.


 

Categories: MSN

In every release of RSS Bandit, I spend some time working on performance. Making the application use less memory and less CPU time while adding more features is a challenge. Recently I read a post by Mitch Denny entitled RSS Bandit Performance Problems where after some investigation he found a key culprit for some of our memory consumption issues in the last release. Mitch wrote

Last weekend I was subscribed to over about 1000 RSS feeds and coincidentally last weekend RSSBandit also became unusable. Obviously I had reached some kind of threshold that the architecture of RSSBandit wasn’t designed to cope with.

My first instinct was to ditch and go and find something a bit faster – after all it is a .NET application and we know how much of a memory hog those things are! Errr – hang on. Don’t I encourage our customers to go out and use .NET to build mission critical enterprise applications every day? I really needed to take a closer look at what was going on.

In idle RSSBandit takes up around 120–170MB of RAM on my laptop. Thats more than Outlook and SQL Server, and often more than Visual Studio (except when its in full flight) but to be honest I’m not that surprised because in order for it to give me the unread items count it has to process quite a few files containing lots of unique strings – that means relatively large chunks of being allocated just for data.

I decided to look a bit deeper and run the CLR Allocation Profiler over the code to see where all the memory (and by extension good performance was going). I remembered this article by Rico Mariani which included the sage words that “space is king” and while I waited for the profiler to download tried to guess what the problem would be based on my previous experience.

What I imagined was buckets of string allocations to store posts in their entirety and a significant number of XML related object allocations but when I looked at the allocation graph I saw something interesting.

... [see http://notgartner.com/Downloads/AllocationGraph3.GIF]

As you can see there is a huge amount of traffic between this native function and the NativeWindow class. It was at this point that I started to suspect what the actual problem was and had to giggle at how many times this same problem pops up in smart client applications.

From what I can tell the problem is an excessive amount of marshalling to the UI thread is going on. This is causing threads to synchronise (tell tale DynamicInvoke calls are in there) and quite a bit of short term memory to be allocated over the lifetime of the application. Notice that there is 610MB of traffic between the native function and NativeWindow so obviously that memory isn’t hanging around.

The fix? I don’t know - but I suspect if I went in to the RSSBandit source and unplugged the UI updates from the UpdatedFeed event the UI responsiveness would increase significantly (the background thread isn’t continually breaking into the main loop to update an unread count on a tree node).

It seems most of the memory usage while running RSS Bandit on Mitch's computer came from callbacks in the RSS parsing code that update the unread item count in the tree view within the GUI whenever a request to fetch an RSS feed is completed. Wow!!!

This is the last place I would have considered looking for somewhere to optimize our code and yet another example of why "measurement is king" when it comes to figuring out how to improve the performance of an application. Given that, a lot of the time, a feed is requested and no new items are fetched, there is often no need to update the UI at all, so there is a lot of improvement that we can gain here.
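The idea of skipping redundant UI updates can be sketched in a few lines. The Python below is only an illustration of the approach, not the actual RSS Bandit fix (which would live in C# code); UnreadCountNotifier, feed_fetched and post_to_ui are hypothetical names, with post_to_ui standing in for whatever marshals a call onto the UI thread.

```python
from typing import Callable, Dict


class UnreadCountNotifier:
    """Forwards unread counts to the UI only when they actually change."""

    def __init__(self, post_to_ui: Callable[[str, int], None]) -> None:
        # post_to_ui is a hypothetical stand-in for the expensive step:
        # marshalling a call from the background thread to the UI thread.
        self._post_to_ui = post_to_ui
        self._last_counts: Dict[str, int] = {}

    def feed_fetched(self, feed_url: str, unread_count: int) -> bool:
        """Call after every completed fetch.

        Returns True if a UI update was posted, False if it was skipped.
        """
        if self._last_counts.get(feed_url) == unread_count:
            return False  # nothing new, so leave the UI thread alone
        self._last_counts[feed_url] = unread_count
        self._post_to_ui(feed_url, unread_count)
        return True


# Usage: record what would have been sent to the UI thread.
posted = []
notifier = UnreadCountNotifier(lambda url, count: posted.append((url, count)))
notifier.feed_fetched("http://example.org/feed", 3)  # first result, posted
notifier.feed_fetched("http://example.org/feed", 3)  # unchanged, skipped
notifier.feed_fetched("http://example.org/feed", 5)  # changed, posted
print(posted)
```

With a thousand subscribed feeds where most refreshes return nothing new, a check like this would remove the bulk of the cross-thread calls, assuming the unread count is the only thing the tree view needs from the event.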

Yet again I am reminded that writing a rich client application like RSS Bandit using the .NET Framework means that you have to be familiar with a bunch of Win32/COM interop crap even if all you want to do is code in C#. I guess programming wouldn't be so challenging if not for gotchas like this lying around all over the place. :)


 

Categories: RSS Bandit | Technology

The Mini-Microsoft blog has an entry entitled 6% raise? I want to work for Dilbert's company! where he writes

Holy whatsa, Alice got a 6% raise ? I'd seriously consider hanging out in the bushes near Google-Kirkland with my aluminum bat to totally Tonya-Hardin-up some delicate competitive fingers just to get a 6% raise.

If you're a lead, you can bring up the manager review tool and check-in on how your reports are doing within The Model. Maybe some bits and pieces will move around, but the review model is pretty much done now and set to go into effect the 1st of September, with the mid-September paycheck showing any benefits.

One thing I've noticed kvetching with other managers is that once again, pay raises are minimal. I'm talking 2%-ish for a 3.5 review. That's barely keeping up with cost-of-living / inflation for doing more than is expected up of you. And of course, 3.0s, for the most part, get nothing. That's right: you're losing buying power for making a 3.0 - doing what's expected of you.

The poor quality of yearly raises was one of the reasons I decided to leave the XML team last year. I realized that no matter how hard I worked, I wouldn't be significantly financially compensated for going above and beyond what I was required to do. Given that I'm an "above and beyond" kinda guy I saw two choices; I could be underpaid or I could be underpaid and work on stuff I was passionate about. So I moved to MSN.

After doing some thinking during my vacation I concluded that MSFT isn't the kind of place I can see myself working at in 5 years. One of the repercussions of this conclusion is that I'm going to start working on getting an MBA so I can broaden my options whenever I decide that the time has come for me to leave the B0rg cube. Of course, the nagging from my folks about when I'm going to grow up (get married, finish my education, etc) helped in coming to this decision.

Bah! Going back to school is going to suck.


 

Categories: Life in the B0rg Cube

August 13, 2005
@ 03:30 AM

I'm back in Seattle and may have already beaten jet lag by having never switched my watch from west coast time. It feels good to be back in my apartment. The five flights back were pretty uneventful. The only noteworthy event was that I saw Forest Whitaker in the upper class lounge of Virgin Atlantic at Heathrow airport. I was going to walk up to him and tell him how much I loved The Crying Game and Waiting To Exhale until I realized that would have made me sound like a jerk. I doubt that people in the movie business like being told their stuff rocked...a decade ago.

PS: If you are ever in the UK and you hear someone described as being Asian, it means they are from India not East Asia as is the case in the US.


 

Categories: Ramblings

I'm on the way back from my trip and this is the part of the vacation that sucks. It's going to take a total of 5 flights to get from Abuja back to Seattle, plus about half a day of sitting around in airports. Below are a bunch of last minute impressions about Nigeria and London (where I'm currently posting this from).
  • All the male restrooms in Heathrow airport have condom dispensers. This really has me scratching my head since the only place I usually see them is in night club restrooms which makes sense since a bunch of hooking up goes on at night clubs. So now I have this impression that somewhere in Heathrow there is a bunch of debauchery going on and I'm not a part of it. It must be the first class lounges...

  • If you ask a British bartender for a 'Long Island Iced Tea', don't be surprised if he responds "We don't serve tea at the bar, twit!"

  • It seems I've picked up homophobia by osmosis while in the United States. I kept finding it weird that men could be seen holding hands either for emphasis in a conversation or while walking without being seen as 'gay' in Nigeria. Having guys sleep in the same bed gave me a similar vibe. I can't believe I'm getting culture shock from my home country.

  • Do you know who cleans the streets of Lagos & Abuja? The street sweepers, literally. I was freaked out to see people with brooms sweeping the sides of the roads in both Abuja and Lagos without the luxury of safety cones. My memory fails me as to whether this is an improvement from not having street sweepers from a few years ago or this was just the status quo.

  • Soft drinks sold in plastic bottles seem to be gaining popularity in Lagos & Abuja. Back in the day it was all about the glass bottles, which were always redeemed by people. In fact, the price of a bottle of beer or a soft drink always assumed you'd be returning a bottle as well. It took me a while to get used to the 'wastefulness' in the United States where people just threw away the bottles. Of course, there were other places where the wastefulness surprised me as well when I first got here such as using paper towels instead of wash rags or styrofoam silverware & plates instead of reusable plastic ones at fast food places. Now it's the other way around. After doing the dishes at my mom's I was confused to not find paper towels nearby. I am becoming so American...

  • Thanks to a ban on external imports of various consumer goods we now get Heineken and Five Alive brewed locally.  Awesome!!!


 

Categories: Trip Report

In his post entitled Google News and RSS Dave Winer writes

It's the same reason I'm not giddy with delight that Microsoft decided to call their support of RSS "web feeds"

Considering that the support for XML syndication technologies in IE 7 includes both flavors of RSS (1.0 & 0.91/2.0) and Atom, I personally don't think it is a good idea to call the feature 'RSS'.

Then there's the fact that RSS does sound a bit geeky; after all, most people call them web pages and web sites, not HTML documents and domains.

Internet Explorer is used by hundreds of millions of regular folks not just geeks. The IE team is simply trying to make the feature approachable to end users.


 

Categories: Life in the B0rg Cube