A few months ago, in How Microsoft Lost the API War, Joel Spolsky wrote

There are two opposing forces inside Microsoft, which I will refer to, somewhat tongue-in-cheek, as The Raymond Chen Camp and The MSDN Magazine Camp.
...
The Raymond Chen Camp believes in making things easy for developers by making it easy to write once and run anywhere (well, on any Windows box). The MSDN Magazine Camp believes in making things easy for developers by giving them really powerful chunks of code which they can leverage, if they are willing to pay the price of incredibly complicated deployment and installation headaches, not to mention the huge learning curve. The Raymond Chen camp is all about consolidation. Please, don't make things any worse, let's just keep making what we already have still work. The MSDN Magazine Camp needs to keep churning out new gigantic pieces of technology that nobody can keep up with.

When I first read the above paragraphs I disagreed with them because I was in denial. But as the months have passed and I've looked at various decisions my team has made in recent years, I see the pattern. The pattern repeats itself in the actions of other product teams and divisions at Microsoft. I now realize this is an unfortunate and poisonous aspect of Microsoft's culture which doesn't work in the best interest of our customers. A few months ago I found some advice given to Ward Cunningham on joining Microsoft which read

 Take a running start and don't look back

  1. Recognize that your wonderful inventiveness is the most valuable thing you will own in a culture that values its employees solely by their latest contributions. In a spartan culture like this, you will rise quickly.

  2. Keep spewing ideas, even when those ideas are repeatedly misunderstood, implemented poorly, and excised from products for reasons that have nothing to do with the quality of the idea. When you give up on communicating your new ideas, you will just go insane waiting to vest.

The Microsoft culture is about creating the newest, latest, greatest thing that 'changes the world', not improving what is already out there and working for customers. I read various Microsoft blogs and MSDN headlines about how, even though we've made paradigm shifts in developer technologies in recent years, we aren't satisfied and want to introduce radically new and different technologies all over again. This bothers me. I hate the fact that 'you have to rewrite a lot of your code' is a common answer to questions a customer might ask about how to leverage new or upcoming functionality in a developer technology.

Our team [and myself directly] has gone through a process of rethinking a number of decisions we made in this light. Up until very recently we were planning to ship the System.Xml.XPath.XPathDocument class as a replacement for the System.Xml.XmlDocument class. One of the driving reasons for doing this was XPath and XSLT performance. The mismatch between the DOM data model and that of XPath meant that XPath queries or XSLT transformations over the XmlDocument would never be as fast as over the XPathDocument. Another reason we were doing this was that, since the XmlDocument is not an interface-based design, there isn't a way for people who implement their own XML document-like classes to plug in to our world. So we decided to de-emphasize (but not deprecate) the XmlDocument by not adding any new functionality or performance improvements to it, and focused all our energy on XPathDocument.

The problem was that the XPathDocument had a radically different programming model than the XmlDocument, meaning that anyone who'd written code using the XmlDocument against our v1.0/v1.1 bits would have to radically rewrite their code to get performance improvements and new features. Additionally, any developers migrating to the .NET Framework from native code (MSXML) or from the Java world would already be familiar with the XML DOM API but not the cursor-based model used by the XPathDocument. This was really an untenable situation. For this reason we've reverted the XPathDocument to what it was in v1.1, while new functionality and perf improvements will be made to the XmlDocument. Similarly, we will keep the new and improved XPathNavigator class, which will be the API for programming against XML data sources where one wants to abstract away what the underlying store actually is. We've shown the power of this model with examples such as the ObjectXPathNavigator and the DataSetNavigator.
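
For readers who haven't used both styles, the difference between a DOM model and a cursor model can be sketched in a few lines. The example below is in Python rather than C#, and the `Cursor` class is a hypothetical stand-in for the navigator idea, not the .NET XPathNavigator API. The DOM version hands back node objects; the cursor version moves a single position over the same tree.

```python
from xml.dom import minidom

XML = (
    "<books>"
    "<book id='1'><title>DOM</title></book>"
    "<book id='2'><title>Cursor</title></book>"
    "</books>"
)


def titles_dom(xml_text):
    """DOM style: query the tree and work with the node objects it returns."""
    doc = minidom.parseString(xml_text)
    return [t.firstChild.data for t in doc.getElementsByTagName("title")]


class Cursor:
    """Hypothetical cursor: one movable position instead of many node references."""

    def __init__(self, node):
        self._node = node

    @property
    def value(self):
        # Text of the current element's direct text-node children.
        return "".join(
            c.data for c in self._node.childNodes if c.nodeType == c.TEXT_NODE
        )

    def move_to_first_child(self):
        return self._move(self._first_element(self._node.firstChild))

    def move_to_next(self):
        return self._move(self._first_element(self._node.nextSibling))

    def move_to_parent(self):
        parent = self._node.parentNode
        if parent is None or parent.nodeType != parent.ELEMENT_NODE:
            return False
        self._node = parent
        return True

    def _move(self, target):
        # Only move when the target exists; report success like a navigator would.
        if target is None:
            return False
        self._node = target
        return True

    @staticmethod
    def _first_element(node):
        # Skip text and comment nodes until an element (or the end) is reached.
        while node is not None and node.nodeType != node.ELEMENT_NODE:
            node = node.nextSibling
        return node


def titles_cursor(xml_text):
    """Cursor style: step through the tree by moving one position around."""
    cur = Cursor(minidom.parseString(xml_text).documentElement)
    titles = []
    if cur.move_to_first_child():            # first <book>
        while True:
            if cur.move_to_first_child():    # first child element, here <title>
                titles.append(cur.value)
                cur.move_to_parent()         # back up to <book>
            if not cur.move_to_next():       # next <book>, if any
                break
    return titles
```

Both functions return the same titles for the sample document. The cursor version never exposes the underlying nodes, which is what lets a navigator sit on top of stores that aren't XML documents at all.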

It's good to be back in the Raymond Chen camp.


 

Categories: Life in the B0rg Cube | XML

From a Sports Illustrated article entitled Iraqi soccer players angered by Bush campaign ads

Afterward, Sadir had a message for U.S. president George W. Bush, who is using the Iraqi Olympic team in his latest re-election campaign advertisements. In those spots, the flags of Iraq and Afghanistan appear as a narrator says, "At this Olympics there will be two more free nations -- and two fewer terrorist regimes."


"Iraq as a team does not want Mr. Bush to use us for the presidential campaign," Sadir told SI.com through a translator, speaking calmly and directly. "He can find another way to advertise himself."

Ahmed Manajid, who played as a midfielder on Wednesday, had an even stronger response when asked about Bush's TV advertisement. "How will he meet his god having slaughtered so many men and women?" Manajid told me. "He has committed so many crimes."

"The ad simply talks about President Bush's optimism and how democracy has triumphed over terror," said Scott Stanzel, a spokesperson for Bush's campaign. "Twenty-five million people in Iraq are free as a result of the actions of the coalition."

To a man, members of the Iraqi Olympic delegation say they are glad that former Olympic committee head Uday Hussein, who was responsible for the serial torture of Iraqi athletes and was killed four months after the U.S.-led coalition invaded Iraq in March 2003, is no longer in power.

But they also find it offensive that Bush is using Iraq for his own gain when they do not support his administration's actions. "My problems are not with the American people," says Iraqi soccer coach Adnan Hamad. "They are with what America has done in Iraq: destroy everything. The American army has killed so many people in Iraq. What is freedom when I go to the [national] stadium and there are shootings on the road?"

I find the "we had to destroy the village to save it" mentality an interesting and perhaps uniquely American perspective. Based on some of the comments in my post So Why Are You Voting For George W Bush? it seems there are still people who feel the financial and human cost of the war in Iraq was worth it both for selfish reasons (it distracts Islamic terrorists who'd probably be focusing on attacking the American mainland) and for the fact that the war has made the Iraqi people "better off" despite the fact that many of them have gone from living in a relatively stable country to living in a bombed out warzone. This is a very interesting peek into the American psyche.

UPDATE: Based on some of the comments to this entry I decided to further clarify what I find so interesting. After it became clear that there were no WMDs in Iraq or strong ties to Al-Qaeda and 9/11, the reaction to this from the Bush Administration and American people could have been regret and sorrow ("I can't believe we killed thousands of civilians and spent billions of dollars without just cause"). Instead the Bush administration and certain portions of the American public have reacted by stating that this invasion is actually good for Iraq and America should be commended for rescuing the Iraqi people. The reaction and the mentality behind it aren't what I'd expect after such a significant folly. Of course, history will be the judge of whether the current belief that America's interventions in Afghanistan and Iraq will improve the lives of the people of these nations for future generations is correct.


 

August 22, 2004
@ 03:30 AM

Every once in a while I like to post a list of articles I'm either in the process of writing or considering writing to get feedback from people on what they'd like to see or whether the topics are even worthwhile. Below is a list of the next couple of articles I'm either in the process of writing or plan to write over the next few months.

  1. An Introduction to Validating XML Documents with Schematron (MSDN) : An introduction to Schematron including examples showing how one can augment a W3C XML Schema document using Schematron thus creating an extremely powerful XML schema language.  Code samples will use Schematron.NET

  2. Designing XML Formats: Versioning vs. Extensibility (XML 2004 Conference) : This is the presentation and paper for my XML 2004 talk. It will basically be the ideas in my article On Designing Extensible, Versionable XML Formats with more examples and less fluff.

  3. The XML Litmus Test - Deciding When and Why To Use XML (MSDN) : After seeing more and more people at work who seem to not understand what XML is good for or what the decision making process should be for adopting XML I decided to put this article together.  This will basically be an amalgamation of my XML Litmus Test blog post and my Understanding XML article on MSDN.  

  4. XML in Cω (XML.com) : An overview of the XML based features of Cω. The Cω type system contains several constructs that reduce the impedance mismatch between OO and XSD by introducing concepts such as anonymous types, choices [aka union types], nullable types and constructing classes from XML literals into the .NET world. The ability to process such strongly typed XML objects using rich query constructs based on SQL's select operator will also be covered.

  5. A Comparison of Microsoft's C# Programming Language to Sun Microsystems' Java Programming Language 2nd edition : About 3 years ago I wrote a C# vs. Java comparison while I was still in school which has become the most popular comparison of both languages on the Web. I still get mail on a semi-regular basis from people who've been able to transition between both languages due to the information in my comparison document. I plan to update this article to reflect the proposed changes announced in Java 1.5 and C# 2.0

On top of this I've been approached twice in the past few months about writing a technology book. Based on watching the experiences of others my gut feel is that it isn't worth the effort. I'd be interested in any feedback on the above article ideas or even suggestions for new articles that you'd be interested in seeing on MSDN or XML.com from me.


 

Categories: Technology | XML

August 22, 2004
@ 02:53 AM

It seems a number of people are now using RSS Bandit v1.2.0.114 SP1 release candidate 1 and so far we haven't gotten any reports of the crashes related to opening new browser tabs which plagued v1.2.0.114. However, there are some bugs: in changing the feed parsing code to use less memory I ended up introducing a bug where Atom 0.3 feeds are no longer updated. This bug has since been fixed along with a few others. Torsten and I will likely produce a release candidate 2 next weekend.

The progress on the Wolverine release is slow and steady. Because of my pending addition of NNTP support, we recently had to reorganize the code to make the names and behaviors of certain internal classes less focused on RSS and more on processing news items, be they RSS items, Atom entries or USENET posts. I should be checking in the infrastructure for NNTP support by the middle of next week. Once that is done the next major piece of work will be adding the ability to delete posts, which is made trickier by the fact that we have to make sure that information is passed along when synchronizing instances of RSS Bandit.

Now that Hotmail has begun to deliver on the announced 250MB of space for free users it is seeming more and more attractive to add Hotmail as a synchronization source for RSS Bandit users. It would be very cool if all you had to do was enter your Hotmail username and password, and then RSS Bandit simply used one of your Hotmail folders for synchronizing between instances without requiring you to set up an FTP or WebDAV server as is currently needed. I've talked to some people who work in the MSN/Passport part of MSFT such as Omar Shahine, Joshua Allen and Julien Couvreur about the feasibility of using Hotmail in this way from a .NET Framework application. They all implied it won't be easy. Since I spoke to them I've found an article about talking to Hotmail from C# and a spec for the HTTPMail protocol which Outlook Express uses to talk to Hotmail. It probably won't be easy but it doesn't look like it'll be too hard either.

Back to coding.


 

Categories: RSS Bandit

In his post I Want RELAX NG! Tim Ewald writes

This recent post on Mark Nottingham's site pushed me over the edge. I agree with Sean's comment: I want Relax NG. Can I make systems work with XSD? Yes, sort of. But it adds a ludicrous amount of complexity. First you have to know how it works, then what not to do because it's too complicated (like complicated type or element substitution models), then figure out how to contort your schema to do what you want (like extensibility and versioning). Relax NG is much simpler and much closer to how XML actually works. And yes, you can still map it to /from objects if you want to.

I can't help but wonder why, if WS-* and SOAP 1.2 keep XSD at arms length (referencing simple types only and providing non-normative schema definitions) and WSDL 2.0 defines its own simple types, everyone assumes I want to use XSD to define my Web service interface. Pretty much everyone I know who works in this space agrees that Relax NG is a better choice. What is stopping us from making this change?

This is one of those times where I both agree and disagree with Tim. To explain why, I first need to list the three reasons people tend to write schemas.

  1. To provide a way to annotate an XML document with type information and thus create a type annotated infoset.
  2. To provide a means to ensure that an XML document satisfies the constraints of a given message contract.
  3. To provide terse, human readable documentation of an XML format.

In most developer scenarios [including XML Web Services] the most popular use case is the first from the list above. An XML Schema is used primarily for mapping the contents of an XML document either into relational tables (e.g. SQLXML, ADO.NET DataSet) or into a set of programming language objects (e.g. System.Xml.Serialization.XmlSerializer). Every XML Web Service toolkit I have encountered emphasizes this scenario, and in fact most customers do not use XML schemas for validation of business documents, either for performance reasons or because their business rules cannot be adequately described using an XML schema. The main problem with XSD for this use case is that it is actually too expressive and has a richer type system than either the relational model or traditional object oriented programming languages. This leads to impedance mismatches which make it hard for XML Web Service stacks to map schema declarations to objects, which in turn has led folks like the WS-I to propose creating a subset or profile of XSD.

On the other hand, XSD is notoriously bad at dealing with the second use case described above. The language either makes it hard to describe common XML idioms (see the hoops I have to jump through in my Designing Extensible, Versionable XML Formats article) or makes it impossible (e.g. if an attribute has a certain value then the element should have a certain content model, or providing a choice of attributes). This is where RELAX NG shines. Of course, being more expressive than XSD means that the impedance mismatch between it and the relational and OO models is even more significant.
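
To make the first "impossible" case concrete, here is a minimal sketch of an attribute-dependent content model checked by hand. The `payment` vocabulary, the `REQUIRED_CHILDREN` table and the `check_payment` function are all invented for illustration; this is ordinary Python over ElementTree, not RELAX NG or XSD.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: the children allowed inside <payment> depend on the
# value of its "method" attribute.
REQUIRED_CHILDREN = {
    "card": ["cardNumber", "expiry"],
    "invoice": ["invoiceId"],
}


def check_payment(xml_text):
    """Return a list of error strings; an empty list means the rule holds."""
    elem = ET.fromstring(xml_text)
    method = elem.get("method")
    if method not in REQUIRED_CHILDREN:
        return [f"unknown method: {method!r}"]
    actual = [child.tag for child in elem]
    expected = REQUIRED_CHILDREN[method]
    if actual != expected:
        return [f"method={method!r} requires children {expected}, found {actual}"]
    return []
```

In RELAX NG a rule of this shape can be written directly as a choice between patterns, each pairing an attribute value with its content. Since XSD can't state it, code like the above is what usually ends up being written alongside the schema.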

In practice today, most XML Web Services need an XML schema language for creating type annotated infosets, not for validating message structure. This means that for their use cases XSD is preferable to RELAX NG. Ideally, a simple language that just allowed creating named structures and primitive types, such as Microsoft's now-obsolete XML Data Reduced (XDR), would be even better.

Of course, the XML Web Services world could one day evolve to the point where being able to validate incoming messages against a schema is deemed more important than being able to deserialize the XML into objects and vice versa. In which case, Aaron Skonnard's statement in his post Could RelaxNG Replace XSD?, which describes the existing industry inertia around XSD, is also a point to consider.


 

Categories: XML

August 18, 2004
@ 10:16 AM

I saw the following excerpt in Shelley Powers's post entitled Differences of Humor where she wrote

Sam Ruby has posted a note about the upcoming Applied XML Conference put on by Chris Sells. When I looked at the agenda and realized that the conference managed to put together two days worth of presentations without one woman speaker,

Knowing the nature of Chris Sells's conferences this is unsurprising. They seem to mostly be an opportunity for Chris's DevelopMentor clique and their buddies to hang out. However Shelley's post did make me start thinking about how many women I knew who worked with XML and, just like the time I started to keep a list of Seinfeld episodes in which at least one African or African American appeared (don't ask), I started tracking down the number of women I knew of who worked on XML technologies whose work I'd rather see presented than at least one of the presentations currently on the roster. Here is my list:

Non-Microsoft

  • Eve Maler - Sun's most notable XML geek after Jon Bosak and Tim Bray. She's worked on SAML and UBL. I met her at XML 2003 where we chatted about versioning in UBL and what is truly meant by polymorphic XML processing.

  • Jeni Tennison - the most knowledgeable person on the planet about W3C XML Schema. I've lost count of the number of times I've seen her school members of the W3C XML Schema working group about the technology on various mailing lists. Also an XSLT and XPath guru. She's always pushing boundaries in the XML world such as with her work on layered hierarchies in markup vocabularies with LMNL.

  • Priscilla Walmsley - the author of Definitive XML Schema which is probably the best book on W3C XML Schema on the market. She's also co-written a book on XML in Office 2003 which I haven't read but would love to get a presentation on especially with regard to some finer details on how Office uses XML schemas. 

  • Amelia Lewis - a co-author of the WS-ReliableMessaging specification and the author of an excellent critique of the W3C XML Schema primitive types in her article Not My Type: Sizing Up W3C XML Schema Primitives

Microsoft

  • Elena - the Microsoft XML Web Service stack rests on her shoulders. What makes Visual Studio.NET an awesome XML Web Service environment is that there is functionality that lets you point at a WSDL and automatically you get handy dandy .NET classes generated for you. Elena owns the meat of this code, a lot of which resides in the XmlSerializer class

  • Denise Draper - an architect on our team who in a past life has been a member of the XQuery working group, worked on an XML data integration suite for Nimble Technology and worked in the AI field.

  • Priya Lakshminarayanan - the developer for the W3C XML Schema validation technology in the .NET Framework. She's the most knowledgeable about the technology at Microsoft; I'm a distant second to her breadth of knowledge about this somewhat arcane and cryptic technology. She's the first person I've seen implement a tool for generating sample XML documents from XML schemas that didn't suck.

  • Helena Kupkova - the developer for the XML parser in the .NET Framework. She completely gutted our old implementation and doubled the perf in some scenarios. A totally impressive developer. More impressive is that she ships stuff like the XML Diff and Patch demo on GotDotNet in her spare time.

  • Nithya Sampathkumar - the developer on the XML schema inference technology in the .NET Framework. Once I took over as the program manager for this technology I grew to understand the subtleties involved in trying to infer a schema for arbitrary XML documents. A presentation on the techniques used in her implementation and the limitations of XML schema inference would be quite interesting.

  • Neetu Rajpal - the program manager for XML tools in Visual Studio. I've overheard some interesting conversations involving her discussing some of the trickiness involved in implementing an XSLT debugger. An in-depth presentation about what the XML tools team is planning to ship and the issues they encountered would be killer.

  • Vinita - the program manager for MSXML which is the most widely deployed XML library on the planet. Even without shipping in Internet Explorer, Windows and Office they still get millions of downloads a year.

  • Tejal Joshi - works on the XML tools in Visual Studio. At last year's XML 2003 conference I enjoyed hearing James Clark discuss implementation strategies for his nxml-mode in Emacs. I'm sure Tejal would have similarly interesting stories to tell.

  • Lanqing Dai - used to be the developer for the XmlDocument class but has moved on to WinFS. I'd love to hear her thoughts on how working in an XML-centric world compares to living in the item-centric world of WinFS.

There are more women I know of in the XML field both within and outside Microsoft but these are the ones whose presentations I'd rather see than something like XML as a Better COM (for example). Maybe next time Chris Sells should look around the usual XML hangouts, both online (like the xml-dev mailing list) and within Microsoft internally, for conference speakers instead of announcing them in his blog. It may lead to a more diverse list of topics and speakers.

I need to go watch Berserk. Talk to you guys later.


 

Categories: XML

My issue of Playboy came in the mail so I got to read the infamous Google interview. If you don't have a Playboy subscription or balk at buying the magazine from the newsstands you can get the interview from Google's amended SEC filings. I didn't read the entire interview but there were no surprises in what I read.

I was recently talking to a coworker who's on the fence about whether to go to Google or stay at Microsoft and it was interesting talking about the pros and cons of both companies. As we talked Google began to remind me of Netscape in its heyday. A company full of bright, young guys who've built a killer application for the World Wide Web and is headed for a monster IPO. The question is whether Google will squander their lead like Netscape did (Yes, I realize my current employer may have had something to do with that) or whether they'll be the next Yahoo!

There are a couple of things Google has done over the past few years that have made me wonder whether the company has enough adult supervision and business acumen to rise above being a one trick pony in the constantly changing Internet landscape. Some of them are touched on by Larry and Sergey in their interview

  1. http://www.google.com is non-sticky: Nothing on the main Google site encourages the user to hang around the site or even return to the website besides the quality of the search results. According to the company's founders this is by design. The problem with this reasoning is that if and when its competitors such as MSN Search and Yahoo! Search get good enough there isn't anything keeping people tied to the site. It seems unfathomable now but there was a time when it seemed unfathomable that anyone would use anything besides AltaVista or Excite to search the Web. It's happened before and it can happen again. Google seems ill-prepared for this occurrence.

  2. Inability to tie together disparate offerings: The one thing that has separated Yahoo! from all the Web portals that were all the rage a couple of years ago is that it managed to tie its many offerings into a single cohesive package with multiple revenue streams. The Yahoo! experience seamlessly ties in My Yahoo!, Yahoo! Groups (formerly eGroups), Yahoo! Calendar, Yahoo! Maps, Yahoo! Shopping, Yahoo! Finance, Yahoo! News, Yahoo! Movies, Yahoo! Messenger and the Yahoo! Companion. I use most of these Yahoo sites and tools on a daily basis and use all of them at least once a month. Besides advertising related to search there are several entry points for Yahoo! to get revenue from me.

    Compare this to Google, which has a number of other offerings available from the Google website but hasn't figured out how to make them synergistic, such as its purchase of Blogger or sites like Orkut. Yahoo! would have gotten a lot more mileage out of either site than Google has so far. Another aspect of this issue can be gleaned from this excerpt from a post by Dave Winer entitled Contact with Google

    Another note, I now have four different logins at Google: Orkut, AdSense, Blogger and Gmail. Each with a different username and password. Now here's an area where Google could be a leader, provide an alternative to Passport, something we really need, a Google-size problem.

    Yahoo! has a significantly larger number of distinct offerings yet I access all of them through a single login. This lack of cohesiveness indicates that either there isn't a unified vision as to how to unite these properties under a single banner or Google has been unable to figure out how to do so.

  3. GMail announced too quickly: Google announced GMail with its strongest selling point being that it gave you 100 times more space than competing free email services. However GMail is still in beta and not available to the general public, while its competitors such as Hotmail and Yahoo! Mail have announced upping their limits to 250MB and 100MB respectively, with gigabytes of storage and other features available to users for additional fees. This has basically stolen Google's thunder and halted a potential exodus of users from competing services while GMail isn't even out of beta yet.

  4. Heavy handed tactics in the Web syndication standards world: Recently Google decided to use an interim draft of a technology specification instead of a de facto industry standard for syndicating content from their Blogger website, thus forcing users to upgrade or change their news aggregators as well as ensuring that there would be at least two versions of the Atom syndication format in the wild (the final version and the interim version supported by Google). This behavior upset a lot of users and aggregator developers. In fact, the author of the draft specification of the Atom syndication format that Google supported over RSS has also expressed dismay at the choice Google made and is encouraging others not to repeat their actions.

All of these are examples of less than stellar decision making at Google. Even though in previous entries such as What Is Google Building? and What is Google Building II: Thin Client vs. Rich Client vs. Smart Client I've implied that Google may be on the verge of a software move so bold it could upstage Microsoft the same way Netscape planned to with the browser upstaging the operating system as a development and user platform, it isn't a slam dunk that they have what it takes to get there.

It will be interesting watching the Google saga unfold.


 

Categories: Technology

According to the current version of the Chris Sells XML DevCon page (don't bother bookmarking it, Chris Sells doesn't believe in permalinks so all the content on that page is transient) I noticed that Chris Anderson is presenting the following

Developers Hate XML

Chris Anderson

While everyone is currently infatuated with XML, developers are constantly doing battle with trying to rationalize and leverage XML in their applications. I'll talk about having to balance correct XML-isms vs. usability in XAML, about the preponderance of XML reader/writer/DOM/serialization APIs, and about how all of this throws you into a horrible programming experience of loosely typed runtime errors. This reveals XML for what it is: a data encoding. XML is the ASCII text file of the 2000s. While web services are often called "XML Web Services," the reality is that every web service API abstracts the developer from the XML view.

Nothing says vote of confidence like when one of the chief architects of one of the teams you work closely with says your technology sucks. :)

Seriously though, I am curious as to what his presentation actually will be about. Reading the abstract, it seems like it is another iteration of a data-centric user of XML coming to the realization that for their scenarios XML is just CSV on steroids. People's behavior when they realize this usually follows a pattern similar to the five stages of grief. First there is denial, which usually takes the form of an initial disbelief that after all the hype they've heard about XML it isn't working out fantastically for them. Then there is bargaining, which usually manifests itself as attempts to not use XML but still use it. Often you hear phrases like "binary XML", "XML subset" or "XML profile" at this point. Then there is anger at XML for being more complex and verbose than they need. At this point you get to read a rant-filled email, blog post, conference paper or, in this case, conference presentation about how badly the technology is suited for its purpose. Then there is either despair or acceptance. One doesn't follow the other. If the next stage is despair, the person ends up not using XML to solve that particular problem. On the other hand, if it is acceptance, XML is still used but in some cases it is in one of the forms that were mentioned in the bargaining stage, such as a binary representation of an XML stream or using some subset of XML.

Hopefully Chris Anderson will post his slides online.


 

Categories: XML

Anders Norås has an interesting blog entry entitled A JavaScript XmlSerializer where he shows how to build a class equivalent to the .NET Framework's System.Xml.Serialization.XmlSerializer class in JavaScript. He writes

In ASP.NET 2.0 it is possible to invoke server events from client side script without posting back the page. This is supported through a new mechanism called script callbacks. For more information on this technology, read Dino Esposito’s excellent “Cutting Edge” article on the subject.

Mostly out-of-band calls have been used with fairly simple return values such as strings and numerals. The “advanced” uses have typically been passing arrays as comma separated lists. This has greatly limited the applicability of the technology and created a wide functionality gap between the object oriented programming environment of the server world and the more primitive environment in the browser...

A JavaScript XmlSerializer

One of the most celebrated classes in the .NET framework is the XmlSerializer class. This class enables you to serialize objects into XML documents and deserialize XML documents into objects. As we all know, XML documents are represented as strings, so it is simple to pass an XML document as either a parameter or a return value on an out-of-band call.

By implementing a client side XML based serialization and deserialization it would be possible to pass an object from a client script to a server method and vice versa. There are of course huge differences between the powerful .NET platform and the simple JavaScript language, but these have little impact on a client to server communications channel as it would only make sense to pass data transfer objects.

Definitely an interesting bit of code. What is also very interesting is that he has a previous article entitled Declarative JavaScript programming where he implements metadata annotations (akin to .NET Framework attributes) for JavaScript. Excellent stuff.
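
The round trip described in the quoted passage, object to XML string and back again, can be sketched briefly. This is a Python illustration of the idea only, not Anders's JavaScript code and not the .NET XmlSerializer. The `Customer` type and the `to_xml` and `from_xml` helpers are hypothetical, and they only handle flat objects whose fields are strings or integers.

```python
from dataclasses import dataclass, fields
import xml.etree.ElementTree as ET


@dataclass
class Customer:
    """Hypothetical data transfer object."""
    name: str
    age: int


def to_xml(obj):
    """Serialize a flat dataclass instance: one child element per field."""
    root = ET.Element(type(obj).__name__)
    for f in fields(obj):
        ET.SubElement(root, f.name).text = str(getattr(obj, f.name))
    return ET.tostring(root, encoding="unicode")


def from_xml(cls, xml_text):
    """Rebuild an instance, converting element text with each field's declared type."""
    root = ET.fromstring(xml_text)
    kwargs = {f.name: f.type(root.findtext(f.name)) for f in fields(cls)}
    return cls(**kwargs)
```

Because the serialized form is just a string, it can be passed as a parameter or return value on an out-of-band call, which is the point the quoted article makes.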


 

One important lesson I've learned about designing software is that sometimes it pays to smother one's perfectionist engineer instincts and be less ambitious about the problems one is trying to solve. Put more succinctly, a technology doesn't have to solve every problem, just enough problems to be useful. Two examples come to mind which hammered this home to me: Tim Berners-Lee's World Wide Web and the collaborative filtering which sites like Amazon use.

  1. The World Wide Web: Almost every history of the World Wide Web you find online mentions how Tim Berners-Lee was inspired by Ted Nelson's Xanadu. The current Web is a pale imitation of what Ted Nelson described over forty years ago as what a rich hypertext system should be capable of doing. However you're reading these words of mine over Tim Berners-Lee's Web, not Ted Nelson's. Why is this?

    If you read the descriptions of the Xanadu model you'll notice it has certain lofty goals. Some of these include the ability to create bi-directional links, links that do not break, and built-in version management. To me it doesn't seem feasible to implement all these features without ending up building a closed system. It seems Tim Berners-Lee came to a similar conclusion and greatly simplified Ted Nelson's dream, thus making it feasible to implement and adopt on a global scale. Tim Berners-Lee's Web punts on all the hard problems. How does the system ensure that documents once placed on the Web are always retrievable? It doesn't. Instead you get 404 pages and broken links. How does the Web ensure that I can find all the pages that link to another page? It doesn't. Does the Web enable me to view old versions of a Web page and compare revisions of it side by side? Nope.

    Despite these limitations Tim Berners-Lee's Web sparked a global information revolution. Even more interestingly over time various services have shown up online that have attempted to add the missing functionality of the Web such as The Internet Archive, Technorati and the Google Cache.

  2. Collaborative Filtering on Amazon: The first place I ever bought CDs online was CDNow.com (now owned by Amazon). One feature of the site that blew my mind was the ability to get a list of recommended CDs to buy based on your purchase history and the ratings you gave various albums. The suggestions were always quite accurate and many times it suggested CDs I already owned and liked a lot.

    This feature always seemed like magic to me. I imagined how difficult it must have been to come up with a categorization and ranking system for music CDs that could accurately match people up with music based on their tastes. It wasn't until Amazon debuted this feature that I realized the magic algorithms were simply 'people who purchased X also purchased Y'. My magic algorithms were just a bunch of not very interesting SQL queries.

    There are limitations to this approach (you need a large enough user base and enough purchases of certain albums to make them statistically significant), but the system works for the most part.
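
The 'people who purchased X also purchased Y' idea can be sketched as a simple co-occurrence count. The code below is a Python illustration under stated assumptions: the order data, the function names and the `min_count` threshold are all made up, and it is not Amazon's or CDNow's implementation. The same counting could equally be written as a SQL self-join with a GROUP BY.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase history: one set of album titles per customer.
ORDERS = [
    {"Illmatic", "Ready to Die", "The Chronic"},
    {"Illmatic", "Ready to Die"},
    {"Illmatic", "The Chronic"},
    {"Ready to Die", "Illmatic", "Doggystyle"},
]


def build_co_purchases(orders):
    """Count, for every ordered pair (x, y), how many customers bought both."""
    pairs = Counter()
    for basket in orders:
        pairs.update(permutations(sorted(basket), 2))
    return pairs


def also_purchased(pairs, item, min_count=2):
    """Items bought alongside `item` by at least `min_count` customers,
    most frequent first (ties broken alphabetically)."""
    matches = [
        (other, n) for (x, other), n in pairs.items() if x == item and n >= min_count
    ]
    return [other for other, _ in sorted(matches, key=lambda m: (-m[1], m[0]))]
```

With this sample data, `also_purchased(build_co_purchases(ORDERS), "Illmatic")` returns `["Ready to Die", "The Chronic"]`. The `min_count` cutoff is a crude stand-in for the limitation mentioned above: an album bought by only one customer produces no recommendations.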

Every once in a while I am part of endless discussions about how we need to complicate a technology to satisfy every use case when in truth we don't have to solve every problem. Edge cases should not dictate a software system's design but too often they do.


 

Categories: Technology