In recent weeks there have been a number of blog postings critical of the Technorati Top 100 list of popular weblogs. The criticisms have primarily come in two flavors: some posts have been critical of the idea of blogging as a popularity contest, which such lists encourage, and others have criticized the actual mechanism Technorati uses to calculate popularity. I agree with both criticisms, especially the former. There have been a number of excellent posts arguing both points which I think are worth sharing.

Mary Hodder, in her post Link Love Lost or How Social Gestures within Topic Groups are More Interesting Than Link, argues that more metrics besides link count should be used for calculating popularity and influence. Some of the additional metrics she suggests include comment counts and the number of subscribers to the site's RSS feed. She also suggests creating topic-specific lists instead of one über list for the entire blogosphere. It seems a primary motivation for encouraging this approach is to increase the pool of bloggers that are targeted by PR agencies and the like. Specifically, Mary writes

However, I'm beginning to see many reports prepared by PR people, communications consultants etc. that make assessments of 'influential bloggers' for particular clients. These reports 'score' bloggers by some random number based on something: maybe inbound links or the number of bloglines subscribers or some such single figure called out next to each blog's name.

Shelley Powers has a different perspective in her post Technology is neither good nor evil. In arguing against the popularity contests inherent in creating competing A-lists or even just B-lists to complement the A-lists she writes 

Even if we tried to analyze a person's links to another, we can't derive from this anything other than person A has linked to person B several times. If we use these to define a community to which we belong, and then seek to rank ourselves within these communities, all we've done is create a bunch of little Technorati 100s and communities that are going to form barriers to entry. We see this communal behavior all too often: a small group of people who know each other link to each other frequently and to outsiders infrequently; basically shutting down the discussion outside of the community.
...
I think Mary should stop with I hate rankism. I understand the motivations behind this work, but ultimately, whatever algorithm is derived will eventually end up replicating the existing patterns of authority rather than replacing them. This pattern repeated itself within the links to Jay Rosen's post; it repeated itself within the speaker list that Mary started for women ("where are the women speakers"), but had its first man within a few hours, and whose purpose was redefined within a day to include both men and women.

Rankings are based on competition. Those who seek to compete will always dominate within a ranking, no matter how carefully we try to 'route' around their own particular form of 'damage'. What we need to challenge is the pattern, not the tools, or the tool results. 

I agree with Shelley that attempts to right the so called "imbalance" created by lists such as the Technorati Top 100 will encourage competition and stratification within certain blogging circles. I also agree that despite whatever algorithms are used, a lot of the same names will still end up on the lists for a variety of reasons. A major one being that a number of the so-called A-list blogs actually work very hard to be "popular" and changing the metrics by which their popularity is judged won't change this fact.

So Shelley has given us some of the social arguments for why popularity lists such as the Technorati Top 100 aren't a good idea. But are the technical flaws in Technorati's approach to calculating weblog popularity really so bad? Yes, they are.

Danah Boyd has a post entitled The biases of links where she did some research to show exactly why simply counting links on web pages isn't an accurate way to calculate popularity or influence. There are a lot of excellent points in Danah's post and the entire post is worth reading multiple times. Below are some key excerpts from Danah's post

I decided to do the same for non-group blogs in the Technorati Top 100. I hadn't looked at the Top 100 in a while and was floored to realize that most of those blogs are group blogs and/or professional blogs (with "editors" and clear financial backing). Most are covered in advertisements and other things meant to make them money. It's very clear that their creators have worked hard to reach many eyes (for fame, power or money?).
...
Blogrolls:

  • All MSNSpaces users have a list of "Updated Spaces" that looks like a blogroll. It's not. It's a random list of 10 blogs on MSNSpaces that have been recently updated. As a result, without special code (like in Technorati), search engines get to see MSNSpace bloggers as connecting to lots of other blogs. This would create the impression of high network density between MSNSpaces which is inaccurate.
  • Few LiveJournals have a blogroll but almost all have a list of friends one click away. This is not considered by search tools that look only at the front page.
    ...
  • Blogrolls seem to be very common on politically-oriented blogs and always connect to blogs with similar political views (or to mainstream media).
  • Blogrolls by group blogging companies (like Weblogs, Inc.) always link to other blogs in the domain, using collective link power to help all.
    ...
  • Male bloggers who write about technology (particularly social software) seem to be the most likely to keep blogrolls. Their blogrolls tend be be dominantly male, even when few of the blogs they link to are about technology. I haven't found one with >25% female bloggers (and most seem to be closer to 10%).
  • On LJ (even though it doesn't count) and Xanga, there's a gender division in blogrolls whereby female bloggers have mostly female "friends" and vice versa.
  • I was also fascinated that most of the mommy bloggers that i met at Blogher link to Dooce (in Top 100) but Dooce links to no one. This seems to be true of a lot of topical sites - there's a consensus on who is in the "top" and everyone links to them but they link to no one.
    ...

Linking patterns:

  • The Top 100 tend to link to mainstream media, companies or websites (like Wikipedia, IMDB) more than to other blogs (Boing Boing is an exception).
  • Blogs on blogging services rarely link to blogs in the posts (even when they are talking about other friends who are in their blogroll or friends' list). It looks like there's a gender split in tool use; Mena said that LJ is like 75% female, while Typepad and Moveable Type have far fewer women.
  • Bloggers often talk about other people without linking to their blog (as though the audience would know the blog based on the person). For example, a blogger might talk about Halley Suitt's presence or comments at Blogher but never link to her. This is much rarer in the Top 100 who tend to link to people when they reference them.
  • Content type is correlated with link structure (personal blogs contain few links, politics blogs contain lots of links). There's a gender split in content type.
  • When bloggers link to another blog, it is more likely to be same gender.

I began this investigation curious about gender differences. There are a few things that we know in social networks. First, our social networks are frequently split by gender (from childhood on). Second, men tend to have large numbers of weak ties and women tend to have fewer, but stronger ties. This means that in traditional social networks, men tend to know far more people but not nearly as intimately as those women know. (This is a huge advantage for men in professional spheres but tends to wreak havoc when social support becomes more necessary and is often attributed to depression later in life.)

While blog linking tends to be gender-dependent, the number of links seems to be primarily correlated with content type and service. Of course, since content type and service are correlated by gender, gender is likely a secondary effect.
...
These services are definitely measuring something but what they're measuring is what their algorithms are designed to do, not necessarily influence or prestige or anything else. They're very effectively measuring the available link structure. The difficulty is that there is nothing consistent whatsoever with that link structure. There are disparate norms, varied uses of links and linking artifacts controlled by external sources (like the hosting company). There is power in defining the norms, but one should question whether or companies or collectives should define them. By squishing everyone into the same rule set so that something can be measured, the people behind an algorithm are exerting authority and power, not of the collective, but of their biased view of what should be. This is inherently why there's nothing neutral about an algorithm.

There is a lot of good stuff in the excerpts above and it would take an entire post or maybe a full article to go over all the gems in Danah's entry. One random but interesting point is that LiveJournal bloggers are penalized by systems such as the Technorati Top 100. For example, Jamie Zawinski has over 1900 people who link to him from their Friends page in LiveJournal but he somehow doesn't make the cut for the Technorati Top 100. Maybe the fact that most of his popularity is within the LiveJournal community makes his "authority" less valid than that of others in the Technorati Top 100 list who have fewer incoming links.

Yeah, right.
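To make the LiveJournal point concrete, here is a minimal Python sketch of the kind of front-page link counting being criticized above. The blog names, the link data and the inbound_link_counts function are all made up for illustration; I don't know how Technorati actually computes its numbers, so treat this as a toy model rather than a description of their algorithm.

```python
from collections import Counter

def inbound_link_counts(pages):
    """Count, for each target blog, how many distinct source blogs link to it.

    `pages` maps a source blog to the set of blogs linked from whatever
    pages the (hypothetical) crawler actually looked at.
    """
    counts = Counter()
    for source, targets in pages.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts

# Hypothetical data: what a front-page-only crawler sees.
front_pages = {
    "alice.example": {"biglist.example"},
    "bob.example": {"biglist.example"},
    "lj-user-1": set(),  # LiveJournal front page: no blogroll
    "lj-user-2": set(),
}

# The same LiveJournal users' friends pages, one click away.
friends_pages = {
    "lj-user-1": {"lj-star"},
    "lj-user-2": {"lj-star"},
}

front_only = inbound_link_counts(front_pages)

# Merge the friends pages in and count again.
merged = {blog: set(links) for blog, links in front_pages.items()}
for blog, links in friends_pages.items():
    merged.setdefault(blog, set()).update(links)
with_friends = inbound_link_counts(merged)
```

In this toy data set the "lj-star" blog is invisible to the front-page count and only shows up once the friends pages are included, which is the shape of the bias Danah describes.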


 

August 13, 2005
@ 05:11 AM

Robert Scoble has a blog post entitled Filtering Out MSN's Filter which seems like a good enough opportunity to state what I think of the newest addition to MSN's family of offerings. Robert wrote

MSN Filter sure is getting some people upset (hi Ross Mayfield).

Personally I wanted to give MSN Filter a few weeks before giving my opinion, but Ross goaded me into it.

Boring. Boring. Boring.

First, what is it? MSN hired five people to do a blog each. There's one on sports. Another on tech. Music. TV. Lifestyle.

I have to agree with Robert, I think the MSN Filter sites are pretty boring. More importantly, as an MSFT shareholder and someone who works at MSN, I think it is a bad business investment in its current incarnation. Precedents for professional blogging, such as the Gawker Media (e.g. Gizmodo) and Weblogs, Inc. (e.g. Engadget) families of sites, are supported by topic-specific ads, including some from Google's AdSense program. On the other hand, if you look at MSN's Technology Filter you don't see any such ads.

I think it is pretty cool that MSN is allowing folks to experiment with ventures like MSN Filter. However my personal opinion is that in its current incarnation it's a lame knock-off of the stuff coming out of folks like Nick Denton and Jason Calacanis and it doesn't have a chance of making much [if any] money for us since they are eschewing targeted ads.

Lame. Lame. Lame.


 

Categories: MSN

In every release of RSS Bandit, I spend some time working on performance. Making the application use less memory and less CPU time while adding more features is a challenge. Recently I read a post by Mitch Denny entitled RSS Bandit Performance Problems where after some investigation he found a key culprit for some of our memory consumption issues in the last release. Mitch wrote

Last weekend I was subscribed to over about 1000 RSS feeds and conicidentally last weekend RSSBandit also became unusable. Obviously I had reached some kind of threshold that the architecture of RSSBandit wasn’t designed to cope with.

My first instinct was to ditch and go and find something a bit faster – after all it is a .NET application and we know how much of a memory hog those things are! Errr – hang on. Don’t I encourage our customers to go out and use .NET to build mission critical enterprise applications every day? I really needed to take a closer look at what was going on.

In idle RSSBandit takes up around 120–170MB of RAM on my laptop. Thats more than Outlook and SQL Server, and often more than Visual Studio (except when its in full flight) but to be honest I’m not that surprised because in order for it to give me the unread items count it has to process quite a few files containing lots of unique strings – that means relatively large chunks of being allocated just for data.

I decided to look a bit deeper and run the CLR Allocation Profiler over the code to see where all the memory (and by extension good performance was going). I remembered this article by Rico Mariani which included the sage words that “space is king” and while I waited for the profiler to download tried to guess what the problem would be based on my previous experience.

What I imagined was buckets of string allocations to store posts in their entirety and a significant number of XML related object allocations but when I looked at the allocation graph I saw something interesting.

... [see http://notgartner.com/Downloads/AllocationGraph3.GIF]

As you can see there is a huge amount of traffic between this native function and the NativeWindow class. It was at this point that I started to suspect what the actual problem was and had to giggle at how many times this same problem pops up in smart client applications.

From what I can tell the problem is an excessive amount of marshalling to the UI thread is going on. This is causing threads to synchronise (tell tale DynamicInvoke calls are in there) and quite a bit of short term memory to be allocated over the lifetime of the application. Notice that there is 610MB of traffic between the native function and NativeWindow so obviously that memory isn’t hanging around.

The fix? I don’t know - but I suspect if I went in to the RSSBandit source and unplugged the UI udpates from the UpdatedFeed event the UI responsiveness would increase significantly (the background thread isn’t continually breaking into the main loop to update an unread count on a tree node).

It seems most of the memory usage while running RSS Bandit on Mitch's computer came from callbacks in the RSS parsing code that update the unread item count in the tree view within the GUI whenever a request to fetch an RSS feed is completed. Wow!!!

This is the last place I would have considered looking for somewhere to optimize our code and yet another example of why  "measurement is king" when it comes to figuring out how to improve the performance of an application. Given that a lot of the time a feed is requested there is no need to update the UI since no new items are fetched, there is a lot of improvement that we can gain here.
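As a rough illustration of the improvement I have in mind, here is a small Python sketch (not RSS Bandit's actual C# code) of a notifier that skips the hop to the UI thread when a fetch brings back nothing new and coalesces the updates that remain. The UnreadCountNotifier class, its method names and the example URLs are hypothetical, and the queue is only a stand-in for marshalling calls to the UI thread the way Control.Invoke does in Windows Forms.

```python
import queue

class UnreadCountNotifier:
    """Forwards feed refresh results to the UI thread only when needed."""

    def __init__(self):
        # Stand-in for marshalling a call onto the UI thread.
        self.ui_queue = queue.Queue()
        self.skipped = 0

    def on_feed_updated(self, feed_url, new_item_count):
        # Called on a background thread whenever a feed request completes.
        if new_item_count <= 0:
            # Nothing new was fetched, so there is nothing to redraw.
            self.skipped += 1
            return
        self.ui_queue.put((feed_url, new_item_count))

    def drain(self):
        # Called on the UI thread: merge pending updates so each tree node
        # is touched once, however many fetches completed in the meantime.
        pending = {}
        while True:
            try:
                feed_url, count = self.ui_queue.get_nowait()
            except queue.Empty:
                break
            pending[feed_url] = pending.get(feed_url, 0) + count
        return pending

notifier = UnreadCountNotifier()
notifier.on_feed_updated("http://example.com/a.xml", 0)  # no new items
notifier.on_feed_updated("http://example.com/b.xml", 3)
notifier.on_feed_updated("http://example.com/b.xml", 2)
updates = notifier.drain()
```

Whether the real fix looks anything like this depends on what the profiler says after the change; the sketch only shows the idea of not crossing threads when there is nothing to display.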

Yet again I am reminded that writing a rich client application like RSS Bandit using the .NET Framework means that you have to be familiar with a bunch of Win32/COM interop crap even if all you want to do is code in C#. I guess programming wouldn't be so challenging if not for gotchas like this lying around all over the place. :)


 

Categories: RSS Bandit | Technology

The Mini-Microsoft blog has an entry entitled 6% raise? I want to work for Dilbert's company! where he writes

Holy whatsa, Alice got a 6% raise ? I'd seriously consider hanging out in the bushes near Google-Kirkland with my aluminum bat to totally Tonya-Hardin-up some delicate competitive fingers just to get a 6% raise.

If you're a lead, you can bring up the manager review tool and check-in on how your reports are doing within The Model. Maybe some bits and pieces will move around, but the review model is pretty much done now and set to go into effect the 1st of September, with the mid-September paycheck showing any benefits.

One thing I've noticed kvetching with other managers is that once again, pay raises are minimal. I'm talking 2%-ish for a 3.5 review. That's barely keeping up with cost-of-living / inflation for doing more than is expected up of you. And of course, 3.0s, for the most part, get nothing. That's right: you're losing buying power for making a 3.0 - doing what's expected of you.

The poor quality of yearly raises was one of the reasons I decided to leave the XML team last year. I realized that no matter how hard I worked, I wouldn't be significantly financially compensated for going above and beyond what I was required to do. Given that I'm an "above and beyond" kinda guy I saw two choices; I could be underpaid or I could be underpaid and work on stuff I was passionate about. So I moved to MSN.

After doing some thinking during my vacation I concluded that MSFT isn't the kind of place I can see myself working at in 5 years. One of the repercussions of this conclusion is that I'm going to start working on getting an MBA so I can broaden my options whenever I decide that the time has come for me to leave the B0rg cube. Of course, the nagging from my folks about when I'm going to grow up (get married, finish my education, etc.) helped in coming to this decision.

Bah! Going back to school is going to suck.


 

Categories: Life in the B0rg Cube

August 13, 2005
@ 03:30 AM

I'm back in Seattle and may have already beaten jet lag by having never switched my watch from west coast time. It feels good to be back in my apartment. The five flights back were pretty uneventful. The only noteworthy event was that I saw Forest Whitaker in the upper class lounge of Virgin Atlantic at Heathrow airport. I was going to walk up to him and tell him how much I loved The Crying Game and Waiting To Exhale until I realized that would have made me sound like a jerk. I doubt that people in the movie business like being told their stuff rocked...a decade ago.

PS: If you are ever in the UK and you hear someone described as being Asian, it means they are from India not East Asia as is the case in the US.


 

Categories: Ramblings

I'm on the way back from my trip and this is the part of the vacation that sucks. It's going to take a total of 5 flights to get from Abuja back to Seattle as well as about half a day of sitting around in airports. Below are a bunch of last minute impressions about Nigeria and London (where I'm currently posting this from).
  • All the male restrooms in Heathrow airport have condom dispensers. This really has me scratching my head since the only place I usually see them is in night club restrooms, which makes sense since a bunch of hooking up goes on at night clubs. So now I have this impression that somewhere in Heathrow there is a bunch of debauchery going on and I'm not a part of it. It must be the first class lounges...

  • If you ask a British bartender for a 'Long Island Iced Tea', don't be surprised if he responds "We don't serve tea at the bar, twit!"

  • It seems I've picked up homophobia by osmosis while in the United States. I kept finding it weird that men could be seen holding hands, either for emphasis in a conversation or while walking, without being seen as 'gay' in Nigeria. Having guys sleep in the same bed gave me a similar vibe. I can't believe I'm getting culture shock from my home country.

  • Do you know who cleans the streets of Lagos & Abuja? The street sweepers, literally. I was freaked out to see people with brooms sweeping the sides of the roads in both Abuja and Lagos without the luxury of safety cones. My memory fails me as to whether this is an improvement from not having street sweepers from a few years ago or this was just the status quo.

  • Soft drinks sold in plastic bottles seem to be gaining popularity in Lagos & Abuja. Back in the day it was all about the glass bottles, which were always redeemed by people. In fact, the price of a bottle of beer or a soft drink always assumed you'd be returning a bottle as well. It took me a while to get used to the 'wastefulness' in the United States where people just threw away the bottles. Of course, there were other places where the wastefulness surprised me as well when I first got here, such as using paper towels instead of wash rags or styrofoam silverware & plates instead of reusable plastic ones at fast food places. Now it's the other way around. After doing the dishes at my mom's I was confused to not find paper towels nearby. I am becoming so American...

  • Thanks to a ban on external imports of various consumer goods we now get Heineken and Five Alive brewed locally.  Awesome!!!


 

Categories: Trip Report

In his post entitled Google News and RSS Dave Winer writes

It's the same reason I'm not giddy with delight that Microsoft decided to call their support of RSS "web feeds"

Considering that the support for XML syndication technologies in IE 7 includes both flavors of RSS (1.0 & 0.91/2.0) and Atom, I personally don't think it is a good idea to call the feature 'RSS'.

Then there's the fact that RSS does sound a bit geeky; after all, most people call them web pages and web sites, not HTML documents and domains.

Internet Explorer is used by hundreds of millions of regular folks not just geeks. The IE team is simply trying to make the feature approachable to end users.
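On the point that the feature covers more than one format, here is a small Python sketch of how a reader might tell the flavors apart by looking at the root element of the document. The feed_flavor function is my own illustration and says nothing about how IE 7 actually does it; it also assumes that any rdf:RDF root is an RSS 1.0 feed, which a real reader would want to check more carefully.

```python
import xml.etree.ElementTree as ET

RDF_ROOT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"
ATOM_ROOTS = {
    "{http://www.w3.org/2005/Atom}feed",  # Atom 1.0
    "{http://purl.org/atom/ns#}feed",     # Atom 0.3
}

def feed_flavor(feed_xml):
    """Guess the syndication format from the document's root element."""
    root = ET.fromstring(feed_xml)
    if root.tag == RDF_ROOT:
        return "RSS 1.0"  # assumption: an RDF root means an RSS 1.0 feed
    if root.tag == "rss":
        return "RSS " + root.get("version", "0.91/2.0")
    if root.tag in ATOM_ROOTS:
        return "Atom"
    return "unknown"

rss1 = ('<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
        'xmlns="http://purl.org/rss/1.0/"/>')
rss2 = '<rss version="2.0"><channel/></rss>'
atom = '<feed xmlns="http://www.w3.org/2005/Atom"/>'

flavors = [feed_flavor(doc) for doc in (rss1, rss2, atom)]
```

Three different root elements in three different vocabularies all end up behind the same button, which is why a single format's name is an awkward label for the feature.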


 

Categories: Life in the B0rg Cube

Over the past couple of months the MSN Spaces team has gotten a bunch of feedback about features users would like to see in the service. Common requests include more flexibility in customizing the look of the space, the ability to play videos or music in a module and the ability to add modules containing custom HTML.

The team has been listening and all of those features were released yesterday as Powertoys. As Powertoys they aren't fully supported features and are only available in English. They are basically cool hacks by some of the developers on the Spaces team which are a prelude to what this functionality might look like in a future release of Spaces.

If you are an MSN Spaces user you should read Mike Torres's posts about how to enable the HTML Module, Windows Media Player module and the Tweak UI Powertoy. They totally jazz up your Space.

Great work by Ryan for being the man with the plan on getting these out.

 


 

Categories: MSN

August 8, 2005
@ 01:47 PM

In response to my post Using XML on the Web is Evil, Since When? Tantek updated his post Avoiding Plain XML and Presentational Markup. Since I'm the kind of person who can't avoid a good debate even when I'm on vacation I've decided to post a response to Tantek's response. Tantek wrote

The sad thing is that while namespaces theoretically addressed one of the problems I pointed out (calling different things by the same name), it actually WORSENED the other problem: calling the same thing by different names. XML Namespaces encouraged document/data silos, with little or no reuse, probably because every person/political body defining their elements wanted "control" over the definition of any particular thing in their documents. The <svg:a> tag is the perfect example of needless duplication.

And if something was theoretically supposed to have solved something but effectively hasn't 6-7 years later, then in our internet-time-frame, it has failed.

This is a valid problem in the real world. For example, for all intents and purposes an <atom:entry> element in an Atom feed is semantically equivalent to an <item> element in an RSS feed to every feed reader that supports both. However we have two names for what is effectively the same thing as far as an aggregator developer or end user is concerned.
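Here is a minimal Python sketch of what that equivalence looks like from an aggregator developer's side: two element names, one internal item model. The parse_items function and the sample feeds are made up for illustration, it only handles RSS 2.0 and Atom 1.0, and it naively takes the first link element of an Atom entry, which a real reader would not get away with.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def parse_items(feed_xml):
    """Return a list of {'title', 'link'} dicts from an RSS 2.0 or Atom 1.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    if root.tag == "rss":
        for item in root.iter("item"):
            items.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
            })
    elif root.tag == "{%s}feed" % ATOM_NS:
        for entry in root.iter("{%s}entry" % ATOM_NS):
            # Simplification: use the first link element, whatever its rel.
            link = entry.find("{%s}link" % ATOM_NS)
            items.append({
                "title": entry.findtext("{%s}title" % ATOM_NS, default=""),
                "link": link.get("href", "") if link is not None else "",
            })
    return items

rss_feed = """<rss version="2.0"><channel><title>Example</title>
<item><title>Hello</title><link>http://example.com/hello</link></item>
</channel></rss>"""

atom_feed = """<feed xmlns="http://www.w3.org/2005/Atom"><title>Example</title>
<entry><title>Hello</title><link href="http://example.com/hello"/></entry>
</feed>"""

rss_items = parse_items(rss_feed)
atom_items = parse_items(atom_feed)
```

Both sample feeds come out as the same list of items, so past the parsing layer the rest of the application never needs to know which name the format used.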

The XML solution to this problem has been that it is OK to have myriad formats as long as we have technologies, such as XSLT, for performing syntactic translations between XML vocabularies. The RDF solution is for us to agree on the semantics of the data in the format (i.e. a canonical data model for that problem space), in which case alternative syntaxes are fine and we perform translations using RDF-based mapping technologies like DAML+OIL or OWL. The microformat solution which Tantek espouses is that we all agree on a canonical data model and a canonical syntax (typically some subset of [X]HTML).

So far the approach that has gotten the most traction in the real world is XML. From my perspective, the reason for this is obvious; it doesn't require that everyone has to agree on a single data model or a single format for that problem space.  

Microformats don't solve the problem of different entities coming up with different names for the same concept. Instead their proponents are ignoring the reasons why the problem exists in the first place and then offering microformats as a panacea when they are not.

I personally haven't seen a good explanation of why <strong> is better than <b>...

A statement like that begs some homework. The accessibility, media independence, alternative devices, and web design communities have all figured this out years ago. This is Semantic (X)HTML 101. Please read any modern web design book like those on my SXSW Required Reading List, and we'll continue the discussion afterwards.

I can see the reasons for a number of the semantic markup guidelines in the case of HTML. What I don't agree with is jumping to the conclusion that markup languages should never have presentational markup. This is basically arguing that every markup language that may be used as a presentation format should use CSS or invent a CSS equivalent. I think that is a stretch.

Finally, one has to seriously cast doubt on XML opinions on a page that is INVALID markup. I suppose following the XML-way, I should have simply stopped reading Dare's post as soon as I ran into the first well-formedness error. Only 1/2 ;)

The original permalink to Tantek's article was broken after he made the edit. I guess since I couldn't find it, it doesn't exist. ;)


 

Categories: Web Development | XML

August 7, 2005
@ 05:16 PM

I've been doing a bit more travelling around the country this week. The travel high point was a trip by helicopter today to a number of places including a local chief's palace and the village where my dad was born. I took a couple of pics from the helicopter as well as on the ground and hope at least a few of them come out OK.

Below are a couple more random thoughts that have crossed my mind during this trip since my previous post

  • The proliferation of mobile phones is even more significant than I thought. I had assumed it was a city thing since the phones I saw folks with were in Abuja (current capital) and Lagos (former capital). However, visits to less developed areas have also shown a high proliferation of mobile technology. In my dad's village I saw both a pay-as-you-go booth for MTN, a local mobile service provider, and a kiosk where enterprising local entrepreneurs were renting out the use of their phones at 20 naira a call (about $0.15).

  • When I was growing up it was common practice for local businessmen to sell products that had been unsafe for public use in developed countries. It seems we now have a new government body called NAFDAC whose job is to act as the Nigerian version of the FDA. NAFDAC has been so effective that there have been multiple attempts on the life of the head of the organization by pissed off business owners whose products she's taken off the market.

  • The only thing scarier than being in a speeding car in typical Lagos or Abuja traffic is being driven in a speeding car in Lagos or Abuja traffic with an in-dashboard DVD player which is showing hip hop videos with half naked chicks dancing seductively. I kept wondering if the driver could keep his eyes on the road. That's it. Next time I come here, I'm walking everywhere.  

  • As I expected the common questions from family and extended family were when I'm going to show up with a future spouse and when I'm going back to school. What I didn't expect was so many people asking when I became such a fat ass. In hindsight, I should have expected it given that I haven't seen some of these folks in almost a decade and I've put on dozens of pounds since then. I definitely need to get back in shape. 

 


 

Categories: Trip Report