Via a post on Don Box's weblog, I noticed that quotes from my weblog have been used to further an incorrect assumption about Microsoft's technological direction with regard to XML technologies in future versions of Windows (aka Longhorn) and other products.

Steve Gillmor writes:

A key inducement for migrating to Longhorn is WinFS. FS means future storage, and the scheme is a new file storage system that will make it easier to store and find data. Instead of leveraging the XSD standard, Microsoft designers rolled a new schema language to handle WinFS' new capabilities
...

Clearly, Microsoft wants developers to create tomorrow's applications on Longhorn and WinFS. Right?  So why did Dare Obasanjo, program manager for .Net Framework XML schema technologies, have this to say: "The W3C XML Schema Definition language is far from being targeted for elimination from Microsoft's actively developed portfolio." Obasanjo listed a dozen Microsoft products using XSD, including "Yukon," Visual Studio .Net, "Indigo," Word, Excel and InfoPath

The last three form the core of Office System 2003, which Bill Gates touted as the strategic development platform for the near future at the New York launch. With Longhorn still far away, Microsoft is asking developers to invest in XSD for now—only to have to unlearn and migrate when Longhorn appears in 2006.

As several people have pointed out, WinFS schema and XSD do completely different things. A few people have suggested that Microsoft "embrace and extend" XSD to make it suitable for describing WinFS types, but bitter experience has shown that this course of action usually leads to confusion amongst our customers and recrimination from industry watchers. In the words of Chris Rock, "You could drive a car with your feet if ya want to, that doesn't make it a good idea!".

However, Steve Gillmor's piece does point out that the next couple of Microsoft releases targeted at developers will bring a number of new technologies for developers to learn, and there will be pushback from those who don't see why they have to adjust to the changing landscape. Just today, I got an email from someone who pointed out that users of data access technologies in the .NET Framework will now have almost half a dozen distinct query languages to choose from when retrieving data, including OPath, XPath, XQuery, and SQL. There are reasons why each one exists:

  • OPath is an object query language
  • SQL is a relational query language
  • XPath is a dynamically typed language for addressing parts of an XML document
  • XQuery is a statically typed language for performing sophisticated queries on one or more XML documents.
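
To make the difference between data domains a little more concrete, here is a rough sketch in Python that asks the same question ("which titles were published in a given year?") once with SQL against a relational table and once with an XPath-style expression against an XML document. The book data, the table and element names, and the two helper functions are all made up for illustration, and Python's ElementTree only supports a limited subset of XPath, so treat this as a sketch rather than a statement about how any Microsoft product works.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for this illustration.
BOOKS = [("XML Basics", 2003), ("Relational Design", 2001), ("Querying Data", 2003)]


def titles_via_sql(year):
    """Relational view: rows in a table, queried with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")
    conn.executemany("INSERT INTO books VALUES (?, ?)", BOOKS)
    rows = conn.execute(
        "SELECT title FROM books WHERE year = ? ORDER BY title", (year,)
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def titles_via_xpath(year):
    """XML view: elements in a tree, addressed with a path expression."""
    root = ET.Element("books")
    for title, book_year in BOOKS:
        book = ET.SubElement(root, "book", year=str(book_year))
        ET.SubElement(book, "title").text = title
    # ElementTree understands only a subset of XPath; an attribute
    # predicate followed by a child step is within that subset.
    matches = root.findall(f"./book[@year='{year}']/title")
    return sorted(node.text for node in matches)


if __name__ == "__main__":
    print(titles_via_sql(2003))
    print(titles_via_xpath(2003))
```

Both functions return the same titles, but each query is written in terms of its own data model: tables and columns in one case, elements and attributes in the other.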

Stated bluntly, however, there will be twice as many query languages when the next versions of SQL Server and Visual Studio ship as there were in the last version (OPath and XQuery are the newcomers). Steve Gillmor is writing "the sky is falling" style articles about the fact that there will be a schema language for describing WinFS types separate from the one for describing XML documents (yet, as Mike Deem points out, no one seems to be asking why we don't use SQL 'CREATE TABLE' statements to define WinFS types). I suspect that in much the same way there will be similar complaints about the amount of choice we are giving developers with regard to data access technologies and query languages.

Sometimes I wonder whether developers would prefer an Über-language with everything and the kitchen sink integrated into it. Would developers really prefer that, instead of having divergent query languages, we had just one (i.e. SQL) with proprietary extensions for the different data domains, used ubiquitously to query XML documents, in-memory objects, relational databases, text files, etc.? If reporters like Jon Udell and Steve Gillmor are to be believed, then this is the preferred approach to building software, since on the surface people get to reuse their skills, except that things will work differently than they expect. I'm actually curious to hear from developers who read my weblog which approach they think is preferable. For example, should one use SQL to query relational databases and XPath/XQuery for XML, or should SQL be the universal query language used by all, with any additions needed for XML querying grafted on to it, most likely in a proprietary manner?

This inquiring mind would like to know.