A few years ago Joel Spolsky wrote a widely quoted blog post which stated:

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. "Google uses Bayesian filtering the way Microsoft uses the if statement," he said. That's true. Google also uses full-text-search-of-the-entire-Internet the way Microsoft uses little tables that list what error IDs correspond to which help text. Look at how Google does spell checking: it's not based on dictionaries; it's based on word usage statistics of the entire Internet, which is why Google knows how to correct my name, misspelled, and Microsoft Word doesn't.

This morning I fired up my favorite RSS reader and saw danah boyd's entry entitled "quality of Google searches?", where she pointed out:

It's also annoying that they've stopped correcting my atrocious spelling. I mean, it's all fine and well that lots of people in the blogosphere can't spell in exactly the same way that i can't spell, but the #1 type of search i do everyday is spell check. I throw something god-awful like Cziskentmihalyi into the engine knowing that it'll return Csikszentmihalyi. This still works quite well for names but it's stopped working for lots of regular words that i just can't spell to save my life. How pathetic is it that i've started opening up Word for the little red squigglies instead of relying on search? Or maybe both practices are weird...

This seems like a predictable problem. There are lots of commonly misspelled words, and in many online communities people have simply given up on correct spelling (heck, I've now grown used to the fact that computer geeks have decided that the correct spelling of ridiculous is rediculous). Thus it is quite likely that a frequently misspelled word eventually occurs so often in the wild that a purely statistical spell checker considers it a valid word. Maybe Google needs some if statements in their code after all, instead of blindly trusting the popularity contest that is Bayesian analysis. :)
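
To make the failure mode concrete, here is a minimal sketch in Python of a toy frequency-based corrector. It is not how Google's spell checker actually works; the function names (`edits1`, `correct`), the word counts, and the optional `dictionary` guard are all invented for illustration. The corrector simply picks the most frequently seen word within one edit of the input, so once the misspelling outnumbers the real word it "wins" unless a dictionary check (the humble if statement) overrules it.

```python
def edits1(word):
    """Return every string one edit (delete, transpose, replace, insert) away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct(word, counts, dictionary=None):
    """Pick the most frequent known word within one edit of `word`.

    counts:     hypothetical word -> frequency table gathered "in the wild"
    dictionary: optional set of accepted spellings; when given, candidates
                outside it are discarded (the "if statement" guard)
    """
    candidates = {w for w in edits1(word) | {word} if w in counts}
    if dictionary is not None:
        candidates = {w for w in candidates if w in dictionary}
    if not candidates:
        return word  # nothing better known; leave the input alone
    return max(candidates, key=lambda w: counts[w])


# Made-up counts in which the misspelling has become more popular.
counts = {"ridiculous": 40, "rediculous": 60}

print(correct("rediculous", counts))                             # rediculous
print(correct("rediculous", counts, dictionary={"ridiculous"}))  # ridiculous
```

With popularity alone, the misspelling is treated as correct; adding the dictionary check restores the conventional spelling, at the cost of rejecting any new word the dictionary has not caught up with.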

The interesting thing is that one could argue that if a particular spelling of a word becomes popular, then it is automatically "correct", since that is how the English language has evolved over time anyway.


