There have been a number of recent posts on message queues and publish/subscribe systems in a couple of the blogs I read. This has been rather fortuitous since I recently started taking a look at the message queuing and publish/subscribe infrastructure that underlies some of our services at Windows Live.  It’s been interesting just comparing how different my perspective and those of the folks I’ve been working with are from the SOA/REST blogging set.

Here are three blog posts that caught my eye all around the same time with relevant bits excerpted.

Tim Bray in his post Atom and Pushing and Pulling and Buffering writes

Start with Mike Herrick’s Pub/Sub vs. Atom & AtomPub?, which has lots of useful links. I thought Bill de hÓra’s Rates of decay: XMPP Push, HTTP Pull was especially interesting on the “Push” side of the equation. ¶

But they’re both missing an important point, I think. There are two problems with Push that don’t arise in Pull: First, what happens when the other end gets overrun, and you don’t want to lose things? And second, what happens when all of a sudden you have a huge number of clients wanting to be pushed to? This is important, because Push is what you’d like in a lot of apps. So it doesn’t seem that important to me whether you push with XMPP or something else.

The obvious solution is a buffer, which would probably take the form of a message queue, for example AMQP.

The problems Tim points out are a restating of the same problems you encounter in the Push or HTTP-based polling model favored by RSS and Atom; First, what happens when the client goes down and doesn’t poll the server for a while but you don’t want to lose things? Secondly, what happens when a large number of clients are pilling and server can’t handle the rate of requests? 

The same way there are answers to these questions in the Pull model, so also are there answers in the Push model. The answer to the Tim’s first question is for the server to use a message queue to persist messages that haven’t been delivered and also note the error rate of the subscriber so it can determine whether to quit or wait longer before retrying. The are no magic answers for the second problem. In a Push model, the server gets more of a chance of staying alive because it controls the rate at which it delivers messages. It may deliver them later than expected but it can still do so. In the Pull model, the server just gets overwhelmed.  

In a response to Tim’s post,  Bill de hÓra writes in his post entitled XMPP matters 

I'd take XMPP as a messaging backbone over AMQP, which Tim mentions because he's thinking about buffering message backlog, something that AMQP calls out specifically. And as he said - "I’ve been expecting message-queuing to heat up since late 2003." AMQP is a complicated, single purpose protocol, and history suggests that simple general purpose protocols get bent to fit, and win out. Tim knows this trend. So here's my current thinking, which hasn't changed in a long while - long haul, over-Internet MQ will happen over XMPP or Atompub. That said, AMQP is probably better than the WS-* canon of RM specs; and you can probably bind it to XMPP or adopt its best ideas. Plus, I'm all for using bugtrackers :)

So I think XMPP matters insofar as these issues are touched on, either in the core design or in supporting XEPs, or reading protocol tea leaves. XMPP's potential is huge; I suspect it will crush everything in sight just like HTTP did.

What I’ve found interesting about these posts is the focus on the protocol used by the publish/subscribe or message queue system instead of the features and scenarios.  Even though I’m somewhat of an XML protocol geek, when it comes to this space I focus on features. It’s like source control, people don’t argue about the relative merits of the CVS, Subversion or Git wire protocols. They argue about the actual features and usage models of these systems.

In this case, message buffering or message queuing is one of the most important features I want in a publish/subscribe system. Of course, I’m probably cheating a little here because sometimes I think message queues are pretty important even if you aren’t really publishing or subscribing. For example, the ultimate architecture is one where you have a few hyper denormalized objects in memory which are updated quickly and when updated you place the database write operations in a message queue [to preserve order, etc] to be performed asynchronously or during off peak hours to smooth out the spikes in activity graphs.

Stefan Tilkov has a rather dreadful post entitled Scaling Messaging contains the following excerpt

Patrick may well be right — I don’t know enough about either AMQP or XMPP to credibly defend my gut feeling. One of my motivations, though, was that XMPP is based on XML, while AMQP (AFAIK) is binary. This suggests to me that AMQP will probably outperform XMPP for any given scenario — at the cost of interoperability (e.g. with regard to i18n). So AMQP might be a better choice if you control both ends of the wire (in this case, both ends of the message queue), while XMPP might be better to support looser coupling.

This calls to mind the quote, it is better to remain silent..yadda, yadda, yadda. You’d have thought that the REST vs. SOAP debates would have put to rest the pointlessness that is dividing up protocols based on “enterprise” vs. “Web” or even worse binary vs. XML. I expect better from Stefan.  

As for me, I’ve been learning more about SQL Service Broker (Microsoft’s message queuing infrastructure built into SQL Server 2005) from articles like Architecting Service Broker Applications which gives a good idea of the kind of concerns I’ve grown to find fascinating. For example 


One of the fundamental features of Service Broker is a queue as a native database object. Most large database applications I have worked with use one or more tables as queues. An application puts something that it doesn't want to deal with right now into a table, and at some point either the original application or another application reads the queue and handles what's in it. A good example of this is a stock-trading application. The trades have to happen with subsecond response time, or money can be lost and SEC rules violated, but all the work to finish the trade—transferring shares, exchanging money, billing clients, paying commissions, and so on—can happen later. This "back office" work can be processed by putting the required information in a queue. The trading application is then free to handle the next trade, and the settlement will happen when the system has some spare cycles. It's critical that once the trade transaction is committed, the settlement information is not lost, because the trade isn't complete until settlement is complete. That's why the queue has to be in a database. If the settlement information were put into a memory queue or a file, a system crash could result in a lost trade. The queue also must be transactional, because the settlement must be either completed or rolled back and started over. A real settlement system would probably use several queues, so that each part of the settlement activity could proceed at its own pace and in parallel.

The chapter on denormalizing your database in “Scaling Your Website 101” should have a pointer to the chapter on message queuing so you can learn about one of the options when formulating a strategy for dealing with the update anomalies that come with denormalization.

Now playing: Neptunes - Light Your Ass On Fire (feat. Busta Rhymes)


Categories: Web Development

In a recent comment to one of my blog post Daniele Muscetta lists some of the reasons he no longer uses RSS Bandit as much as he used to...

I have been using it less and less this past year:
- at the beginning because of the troubles having it run on Vista,
- then because I really like the possibility to have my feeds everywhere instead than on one PC (I use multiple computers)
- and finally because with a web based aggregator I can immediately SHARE and MARK what I read and make it available again for other people to see. This one is a killer feature of the various google reader or bloglines... letting people make their own link blog republishing stuff they read and like. Of course I am not complaining... to do such a thing you would need an infrastructure - as opposed to just releasing a program that runs on the PC...

I agree that the ability to access your feeds from anywhere is a great advantage of Web-based feed readers. On the other hand, I haven't found a Web-based feed reader that has all of the features I've come to take for granted in RSS Bandit especially when you consider some of the extra features like downloading podcasts and being able to access newsgroups. I've always wanted to be able to provide the best of both worlds to our users which is why a few years, I added support for the NewsGator API with the intent of enabling our users get to no longer have to choose between a desktop feed reader and a Web-based one, and can use whichever they want whenever they want (i.e. the Outlook and Outlook Web Access model).

However my implementation was incomplete and I incorrectly blamed the API when in truth I didn't do a great job in understanding how the API should be used. Yesterday, I revisited the problem and fixed a number of brain dead misuses of the API in our implementation (e.g. ReplaceSubscriptions isn't a shortcut that can be used in place of multiple calls to AddSubscription and DeleteSbscription) . Now the feature works like a charm and I've created two screen casts showing  how you can get the world's best RSS reading experience for FREE, today.

RSS Bandit & NewsGator (part 1)

RSS Bandit & NewsGator (part 2)

To learn more about using this feature, you can also read the RSS Bandit documentation on using NewsGator and RSS Bandit to synchronize your feeds across multiple computers.

Categories: RSS Bandit

September 17, 2007
@ 04:29 AM

The final version of the ShadowCat release of RSS Bandit is now available. This release isn't about new features but instead about making the existing features work a lot better and fixing a ton of stability issues related to our usage of Lucene for search engine for our full-text search.

This release is available in the following languages; English, German, Polish, French, Simplified Chinese, Russian, Brazilian Portuguese, Turkish, Dutch, Italian, Serbian and Bulgarian.

Download the installer from . A snapshot of the source code will be available later in the week as a source code release.

New Features

  • Newspaper view can be configured to show unread items in a feed as multiple pages instead of a single page of items
  • Feed list and read/unread status of news items can be synchronized with NewsGator Online (and thus FeedDemon and NetNewsWire as well).

Major Bug Fixes

  • Images show up as broken links when viewing the feed for the Facebook blog
  • Application gets lost on second screen in multi-monitor setup (bug 1288729) and incorrect dual-display handling on startup (bug 1479148)
  • Marking search engines in options doesn't register as change (bug 1782946)
  • Application crashes with "InvalidOperationException" or "Index is closed" error message. (bug 1777810)
  • Application crashes with "docs out of order" error message. (bug 1783688)
  • Synchronization with Newsgator Online does not recognize items marked as read from within RSS Bandit (bug 1368567).
  • Automatic Mark as Read behavior while scrolling sometimes stops working.
  • Deleted items show up as unread after upgrading from v1.3.0.42
  • Application displays error dialog with "UnauthorizedAccessException" or "access denied" error at random times during the day. (bug 1777411)
  • Javascript errors on Web pages result in error dialogs being displayed or the application hanging.
  • Search indexing thread takes 100% of CPU and freezes the computer.
  • Crashes related to Lucene search indexing (e.g. IO exceptions, access violations, file in use errors, etc)
  • Crash when a feed has an IRI (i.e. URL with unicode characters such as http://www.bü instead of issuing an error message since they are not supported.
  • Context menus no longer work on category nodes after OPML import or remote sync
  • Crash on deleting a feed which still has enclosures being downloaded
  • Podcasts downloaded from the feed are named "..mp3" instead of the actual file name.
  • Items marked as read in a search folder not marked as read in original feed
  • No news shown when subscribed to
  • Can't subscribe to feeds on your local hard drive (regression since it worked in previous versions)
  • Random crashes due to error renaming file "" to "deletable" or "" to "segments" in search index folder.
  • Items in Atom 0.3 feeds that have a <created> date but no <issued> date show their date as the last modified date of the feed instead of the created date.
  • Images don't show up on certain items when clicking on feed or category view if the feed uses relative links such as in Tim Bray’s feed at
  • Empty pages displayed in newspaper view when browsing multiple feeds under a category node.
  • Newly added feeds do not inherit the feed refresh rate specified in the Options dialog.
  • In certain cases, the following error message is displayed when attempting feed upload via FTP; "Feedlist upload failed with error: Passive mode not allowed on this server.."
  • Application crashes on startup with the COMException "unknown error"
  • None of the options when right-clicking on "This Feed" in feed properties is valid for newsgroups.
  • Crash because the application cannot modify the .treestate.xml configuration file
  • Crash when clicking on enclosure link in toast window
After the final release of ShadowCat, the following release of RSS Bandit (codenamed Phoenix) will contain our next major user interface revamp and will be the version where we move to version 2.0 of the .NET Framework. ShadowCat will be the last version of RSS Bandit that will run on version 1.1 of the .NET Framwork. You can find some of our early Phoenix screen shots on Flickr


Categories: RSS Bandit

September 14, 2007
@ 03:30 PM

Torsten and I just got an email asking whether RSS Bandit is an abandoned project. This is a fair question given how little activity there has been from us over the past couple of months.

The fact is that we're both pretty busy in our personal lives right now. I'm getting married a week from tomorrow and Torsten just had a baby. Since RSS Bandit is a project we maintain in our free time, it has suffered while we go through transitions on the home front. I'm about a month away from being able to dedicate any serious time to the project and Torsten is in a similar boat.

However the currently checked in code is a lot more stable than the current release since we have fixed or worked around a number of issues in Lucene.NET which we use for building the search index.  For that reason, I'll be releasing a new version of RSS Bandit this weekend which should be fix a lot of issues a number of our regular users have had with the last release.

Next month, I'll start thinking about the features I'd like to add in the next release and should be able to start writing new code again. Thanks for your patience and support.


Categories: RSS Bandit

Recently I took a look at CouchDB because I saw it favorably mentioned by Sam Ruby and when Sam says some technology is interesting, he’s always right. You get the gist of CouchDB by reading the CouchDB Quick Overview and the CouchDB technical overview.  

CouchDB is a distributed document-oriented database which means it is designed to be a massively scalable way to store, query and manage documents. Two things that are interesting right off the bat are that the primary interface to CouchDB is a RESTful JSON API and that queries are performed by creating the equivalent of stored procedures in Javascript which are then applied on each document in parallel. One thing that not so interesting is that editing documents is lockless and utilizes optimistic concurrency which means more work for clients.

As someone who designed and implemented an XML Database query language back in the day, this all seems strangely familiar.

So far, I like what I’ve seen but there seems to already be a bunch of incorrect hype about the project which may damage it’s chances of success if it isn’t checked. Specifically I’m talking about Assaf Arkin’s post CouchDB: Thinking beyond the RDBMS which seems chock full of incorrect assertions and misleading information.

Assaf writes

This day, it happens to be CouchDB. And CouchDB on first look seems like the future of database without the weight that is SQL and write consistency.

CouchDB is a document oriented database which is nothing new [although focusing on JSON instead of XML makes it buzzword compliant] and is definitely not a replacement/evolution of relational databases. In fact, the CouchDB folks assert as much in their overview document.

Document oriented database work well for semi-structured data where each item is mostly independent and is often processed or retrieved in isolation. This describes a large category of Web applications which are primarily about documents which may link to each other but aren’t processed or requested often based on those links (e.g. blog posts, email inboxes, RSS feeds, etc). However there are also lots of Web applications that are about managing heavily structured, highly interrelated data (e.g. sites that heavily utilize tagging or social networking) where the document-centric model doesn’t quite fit.

Here’s where it gets interesting. There are no indexes. So your first option is knowing the name of the document you want to retrieve. The second is referencing it from another document. And remember, it’s JSON in/JSON out, with REST access all around, so relative URLs and you’re fine.

But that still doesn’t explain the lack of indexes. CouchDB has something better. It calls them views, but in fact those are computed tables. Computed using JavaScript. So you feed (reminder: JSON over REST) it a set of functions, and you get a set of queries for computed results coming out of these functions.

Again, this is a claim that is refuted by the actual CouchDB documentation. There are indexes, otherwise the system would be ridiculously slow since you would have to run the function and evaluate every single document in the database each time you ran one of these views (i.e. the equivalent of a full table scan). Assaf probably meant to say that there aren’t any relational database style indexes but…it isn’t a relational database so that isn’t a useful distinction to make.

I’m personally convinced that write consistency is the reason RDBMS are imploding under their own weight. Features like referential integrity, constraints and atomic updates are really important in the client-server world, but irrelevant in a world of services.

You can do all of that in the service. And you can do better if you replace write consistency with read consistency, making allowances for asynchronous updates, and using functional programming in your code instead of delegating to SQL.

I read these two paragraph five or six times and they still seem like gibberish to me. Specifically, it seems silly to say that maintaining data consistency is important in the client-server world but irrelevant in the world of services. Secondly, “Read consistency” and “write consistency” are not an either-or choice. They are both techniques used by database management systems, like Oracle, to present a deterministic and consistent experience when modifying, retrieving and manipulating large amounts of data.

In the world of online services, people are very aware of the CAP conjecture and often choose availability over data consistency but it is a conscious decision. For example, it is more important for Amazon that their system is always available to users than it is that they never get an order wrong every once in a while. See Pat Helland’s (ex-Amazon architect) example of a how a business-centric approach to data consistency may shape one’s views from his post Memories, Guesses, and Apologies where he writes

#1 - The application has only a single replica and makes a "decision" to ship the widget on Wednesday.  This "decision" is sent to the user.

#2 - The forklift pummels the widget to smithereens.

#3 - The application has no recourse but to apologize, informing the customer they can have another widget in one month (after the incoming shipment arrives).

#4 - Consider an alternate example with two replicas working independently.  Replica-1 "decides" to ship the widget and sends that "decision" to User-1.

#5 - Independently, Replica-2 makes a "decision" to ship the last remaining widget to User-2.

#6 - Replica-2 informs Replica-1 of its "decision" to ship the last remaining widget to User-2.

#7 - Replica-1 realizes that they are in trouble...  Bummer.

#8 - Replica-1 tells User-1 that he guessed wrong.

#9 - Note that the behavior experienced by the user in the first example is indistinguishable from the experience of user-1 in the second example.

Eventual Consistency and Crappy Computers

Business realities force apologies.  To cope with these difficult realities, we need code and, frequently, we need human beings to apologize.  It is essential that businesses have both code and people to manage these apologies.

We try too hard as an industry.  Frequently, we build big and expensive datacenters and deploy big and expensive computers.   

In many cases, comparable behavior can be achieved with a lot of crappy machines which cost less than the big expensive one.

The problem described by Pat isn’t a failure of relational databases vs. document oriented ones as Assaf’s implication would have us believe. It is the business reality that availability is more important than data consistency for certain classes of applications. A lot of the culture and technologies of the relational database world are about preserving data consistency [which is a good thing because I don’t want money going missing from my bank account because someone thought the importance of write consistency is overstated] while the culture around Web applications is about reaching scale cheaply while maintaining high availability in situations where the occurence of data loss is unfortunate but not catastrophic (e.g. lost blog comments, mistagged photos, undelivered friend requests, etc).

Even then most large scale Web applications that don’t utilize the relational database features that are meant to enforce data consistency (triggers, foreign keys, transactions, etc) still end up rolling their own app-specific solutions to handle data consistency problems. However since these are tailored to their application they are more performant than generic features which may exist in a relational database. 

For further reading, see an Overview of the Flickr Architecture.

Now playing: Raekwon - Guillotine (Swordz) (feat. Ghostface Killah, Inspectah Deck & GZA/Genius)


Categories: Platforms | Web Development

I attended the Data Sharing Summit last Friday and it was definitely the best value (with regards to time and money invested) that I've ever gotten out of a conference. You can get an overview of that day’s activities in Marc Canter’s post Live blogging from the DataSharingSummit. I won’t bother with a summary of my impressions of the day since Marc’s post does a good job of capturing the various topics and ideas that were discussed.

What I will discuss is a technology initiative called OAuth which is being cooked up by the Web platform folks at Yahoo!, Google, Six Apart, Pownce, Twitter and a couple of other startups.

It all started when Leah Culver was showing off the profile aggregation feature of Pownce on her Pownce profile. If you look at the bottom left of that page, you’ll notice that it links to her profiles on Digg, Facebook, Upcoming, Twitter as well as her weblog. Being the paranoid sort, I asked whether she wasn’t worried that her users could fall victim to imposters such as this fake profile of Robert Scoble?  She replied that her users should be savvy enough to realize that just because there are links to profiles in a side bar doesn’t prove that the user is actually that person. She likened the feature to blog *bling* as opposed to something that should be taken seriously.   

I pressed further and asked whether she didn’t think that it would be interesting if a user of Pownce could prove their identity on Twitter via OpenID (see my proposal for social network interoperability based on OpenID for details) and then she could use the Twitter API to post a user’s content from Pownce onto Twitter and show their Twitter “tweets” within Pownce. This would get rid of all the discussions about having to choose between Pownce and Twitter because of your friends or even worse using both because your social circle spans both services. It should be less about building walled gardens and more about increasing the size of the entire market for everyone via interoperability. Leah pointed out something I’d overlooked. OpenID gives you a way to answer the question “Is leahculver @ Pownce also leahculver @ Twitter?” but it doesn’t tell you how Pownce can then use this information to perform actions on Leah’s behalf on Twitter. Duh. I had implicitly assumed that whatever authentication ticket returned from the OpenID validation request could be used as an authorization ticket when calling the OpenID provider’s API, but there’s nothing that actually says this has to be the case in any of the specs.

Not only was none of this thinking new to Leah, she informed me that she had been working with folks from Yahoo!, Google, Six Apart, Twitter, and other companies on a technology specification called OAuth. The purpose of which was to solve the problem I had just highlighted. There is no spec draft on the official site at the current time but you can read version 0.9 of the specification. The introduction of the specification reads

The OAuth protocol enables interaction between a Web Service Provider(SP) and Consumer website or application. An example use case would be allowing, the OAuth Consumer, to access private photos stored on, the OAuth Service Provider. This allows access to protected resources (Protected Resources) via an API without requiring the User to provide their (Service Provider) credentials to (Consumer). More generically, OAuth creates a freely implementable and generic methodology for API authentication creating benefit to developers wishing to have their Consumer interact with various Service Providers.

While OAuth does not require a certain form of user interface or interaction with a User, recommendations and emerging best practices are described below. OAuth does not specify how the Service Provider should authenticate the User which makes the protocol ideal in cases where authentication credentials are not available to the Consumer, such as with OpenID.

This is an interesting example of collaboration between competitors in the software industry and a giant step towards actual interoperability between social networking sites and social graph applications as opposed to mere social network portability.  

The goal of a OAuth is to move from a world where sites collect users credentials (i.e. username/passwords) for other Web sites so they can screen scrape the user’s information to one where users authorize Web sites and applications to act on their behalf on the target sites in a way that puts the user in control and doesn’t require giving up their usernames and passwords to potentially untrustworthy sites (Quechup flavored spam anyone?). The way it does so is by standardizing the following interaction/usage flow


  1. The Consumer Developer obtains a Consumer Key and a Consumer Secret from the Service Provider.


  1. The Consumer attempts to obtain a Multi-Use Token and Secret on behalf of the end user.
    • Web-based Consumers redirect the User to the Authorization Endpoint URL.
    • Desktop-based Consumers first obtain a Single-Use Token by making a request to the API Endpoint URL then direct the User to the Authorization Endpoint URL.
    • If a Service Provider is expecting Consumer that run on mobile devices or set top boxes, the Service Provider should ensure that the Authorization Endpoint URL and the Single Use Token are short and simple enough to remember for entry into a web browser.
  2. The User authenticates with the Service Provider.
  3. The User grants or declines permission for the Service Provider to give the Consumer a Multi-Use Token.
  4. The Service Provider provides a Multi-Use Token and Multi-Use Token Secret or indicates that the User declined to authorize the Consumer’s request.
    • For Web-based Consumers, the Service Provider redirects to a pre-established Callback Endpoint URL with the Single Use Token and Single-Use Authentication Secret as arguments.
    • Mobile and Set Top box clients wait for the User to enter their Single-Use Token and Single-Use Secret.
    • Desktop-based Consumers wait for the User to assert that Authorization has completed.
  5. The Consumer exchanges the Single-Use Token and Secret for a Multi-User Token and Secret.
  6. The Consumer uses the Multi-Use Token, Multi-Use Secret, Consumer Key, and Consumer Secret to make authenticated requests to the Service Provider.

This standardizes the kind of user-centric API model that is utilized by Web services such as the Windows Live Contacts API, Google AuthSub and the Flickr API to authenticate and authorize applications to access a user’s data. I suspect that the reason OAuth does not mandate a particular authentication mechanism is as a way to grandfather in the various authentication mechanisms used by all of these APIs today. It’s one thing to ask vendors to add new parameters and return types to the information exchanged during user authentication between Web sites and another to request that they completely replace one authentication technology with another.

I’d love to see us get behind an effort like this at Windows Live*. I’m not really sure if there is an open way to participate. There seems to be a Google Group but it is private and all the members seem to be folks that know each other personally. I guess I’ll have to shoot some mail to the folks I met at the Data Sharing Summit or maybe one of them will see this post and will respond in the comments.

PS: More details about OAuth can be found in the blog post by Eran Hammer-Lahav entitled Explaining OAuth.

*Disclaimer: This does not represent the intentions, strategies, wishes or product direction of my employer. It is merely wishful thinking on my part.

Now playing: Sean Kingston - Beautiful Girls (remix) (feat. Fabolous & Lil Boosie)


We attended Justin Timberlake FutureSex/LoveShow on Saturday and it was my best concert going experience in the Seattle area to date. A lot had to do with the location. The Tacoma Dome being an enclosed building and a sports arena has the right combination of acoustics and availability of vending services. I attended  Eminem’s Anger Management 3 and Kenny Chesney’s The Road & Radio at the White River Amphitheatre and Qwest Field respectively and both locations were suffered from lousy acoustics because they were open air locations. 

We missed the opening band which didn’t bother me once it was confirmed that it was not going to be Timbaland. JT’s performance was top notch, he not only performed the hits from both albums but also threw in some unexpected surprises like his verses from Gone and Dick in a Box. After the audience was done singing along to the latter, he joked “You guys watch too much YouTube”. I was amused that he didn’t say “You guys watch too much SNL”. He also mentioned that it had just won an Emmy that night. “Only in America,” he laughed. 

I was surprised to notice that audience seemed to have been an older crowd than the crowd from Bumbershoot. That said, all those JT fans in their 20s and 30s still scream like teenage girls so I’d suggest some ear plugs if you’re ever thinking of attending one of his concerts. 

Now playing: Justin Timberlake - What Goes Around.../...Comes Around Interlude


Categories: Music

From the documentation for Directory.GetFiles Method (String, String)

When using the asterisk wildcard character in a searchPattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A searchPattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the searchPattern. A searchPattern with a file extension of one, two, or more than three characters returns only files having extensions of exactly that length that match the file extension specified in the searchPattern.

I realize this behavior is probably well known to Windows developers, but seriously, WTF? How do I match on three character file extensions (i.e. the majority of file extensions) without getting cruft as well (e.g. matching on .cpp files without getting Emacs backups with .cpp~ file extensions)?

Now playing: Mystikal - I Smell Smoke


Categories: Programming

Since tomorrow is the Data Sharing Summit, I was familiarizing myself with the OpenID specification because I was wondering how it dealt with people making false claims about their identity.

Scenario: I go to which supports OpenID and claim that my URL is (i.e. I am Brad Fitzpatrick). The site redirects me to along with a query string that has certain parameters specified (e.g. “?openid.mode=checkid_immediate&openid.identity=brad&openid.return_to=”) which is a long winded way of saying “LiveJournal can you please confirm that this user is brad then redirect them back to our site when you’re done?” At this point, I could make an HTTP request to, specify the Referrer header value as and claim that I’ve been validated by LiveJournal as brad.  

This is a pretty rookie example but it gets the idea across. OpenID handles this spoofing problem by requiring an OpenID consumer (e.g. to first make an association request to the target OpenID provider (e.g. LiveJournal) before making performing any identity validation. The purpose of the request is to get back an association handle which is in actuality a shared secret between both services which should be specified by the Consumer as part of each checkid_immediate request made to the Identity Provider.  

There is also a notion of a dumb mode where instead of making the aforementioned association request, the consumer asks the Identity Provider whether the assoc_handle returned by the redirected user is a valid one via a check_authentication request. This is a somewhat chattier way to handle the problem but it leads to the same results.

So far, I think I like OpenID. Good stuff.

Now playing: Gangsta Boo - Who We Be


Categories: Technology

September 6, 2007
@ 07:29 PM

Chris Jones has a blog post on the Windows Live Wire blog entitled Test drive the new Windows Live suite where he writes

You’ve probably already read about some changes we’re making to Windows Live, and have seen some of your services change over the past few weeks. Starting now, you can test out the new suite of Windows Live software at

Windows Live makes it easy to store and manage your communications and information, and share what’s going on in your life with the people who mean the most to you. Many of you have already tried out new versions of our web services – Windows Live Hotmail, Windows Live Spaces, Windows Live SkyDrive beta, and the new Windows Live Home page beta. These have been designed to work together with a common navigation, so it is easy to switch between your e-mail, your space, your files, and your photos—from any browser.

Today we’re releasing beta versions of a new generation of Windows Live software designed for your Windows PC that makes it easier than ever to get connected to Windows Live or other services. This suite of software includes e-mail (Windows Live Mail), photo sharing (Windows Live Photo Gallery), a great publishing tool that lets you post directly to your blog (Windows Live Writer), parental controls (Windows Live OneCare Family Safety), a new version of Windows Live Messenger (8.5), and more.

As you can tell, Windows Live is coming together and there is growing clarity around the brand. All the talk of being a “suite” and unified installers struck me as anachronisms from an executive team that came from the world of Office and Windows but now that I’ve begun to see some of the fruits of their labor it looks like a good thing. An integrated set of desktop and Web apps that play well together makes a lot of sense.

I’ve also surprised myself by liking the more consistent UI across [some of] the various Windows Live sites but would like to see us do more integration of the Web applications. For example, the integration between SkyDrive and Windows Live Spaces is cool  but it seems we are last out of the gate with integration between IM and email (unlike Yahoo! Mail and GMail). I’d also like to see a couple more Windows Live services being available from and sharing the same consistent UI such as Windows Live Expo and Live QnA. I guess I’m hard to please.  :)  Kudos to all the folks that worked on the current releases.  

The unified installer is one of those things that seems weird but after using it I wonder why we didn’t provide one sooner. It’s pretty convenient to be able to grab the latest Windows Live apps at a single go. It’s definitely worth trying out especially if you haven’t tried out Windows Live Photo Gallery or Windows Live Mail yet. So what are you waiting for? Get it now.

Now playing: Three 6 Mafia - Most Known Unknown Hits


Categories: Windows Live