June 15, 2004
@ 04:17 PM

The ongoing conversation between Jeremy Mazner and Jon Udell about the capabilities of WinFS deepened this morning with Jeremy's post Did I misunderstand Udell's argument against WinFS?, which was followed up by Jon's post When a journalist blogs. In his post Jon asks

We have standard query languages (XPath, XQuery), and standard ways of writing schemas (XSD, Relax), and applications (Office 2003) that with herculean effort have been adapted to work with these query and schema languages, and free-text search further enhancing all this goodness. Strategically, why not build directly on top of these foundations?

Tactically, why do I want to write code like this:

public class Person
{
  [XmlAttribute()] public string Title;
  [XmlAttribute()] public string FirstName;
  [XmlAttribute()] public string MiddleName;
  [XmlAttribute()] public string LastName;
}

in order to consume data like this?

    DisplayName="Woodgrove Bank"
    UserTile=".\user_tiles\Adventure Works.jpg">

I believe two things to be true. First, we have some great XML-oriented data management technologies. Second, the ambitious goals of WinFS cannot be met solely with those technologies. I'm trying to spell out where the line is being drawn between interop and functionality, and why, and what that will mean for users, developers, and enterprises.

Jon asks several questions and I'll try to answer all the ones I can. The first question about why WinFS doesn't build on XML, XQuery and XSD instead of items, OPath and the WinFS schema language is something that the WinFS folks will have to answer. Of course, Jon could also ask why it doesn't build on RDF, RDQL [or any of the other RDF query languages] and RDF Schema which is a related question that naturally follows from the answer to Jon's question.

The second question is why one would want to program against a Person object when one has a <Person> element. This question has an easy answer which unfortunately doesn't sit well with me. The fact is that developers prefer programming against objects to programming with XML APIs. No XML API in the .NET Framework (XmlReader, XPathNavigator, XmlDocument, etc.) comes close to the ease of use of programming against strongly typed objects in the general case. Addressing this failing [and it is a failing] is directly my responsibility since I'm responsible for core XML APIs in the .NET Framework. Coincidentally, we just had a review with our new general manager yesterday and this same issue came up and he asked what we plan to do about this in future releases. I have some ideas. The main problem with using objects to program against XML is that although objects work well for programming against data-centric XML (rigidly structured tabular data such as the data in an Excel spreadsheet, a database dump or serialized objects) there is a significant impedance mismatch when trying to use strongly typed objects to program against document-centric XML (semi-structured data such as a Word document). However the primary scenarios the WinFS folks want to tackle are about rigidly structured data, which works fine with objects as the primary programming model.
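The contrast between the two programming styles can be sketched as follows. This is only an illustration in Python rather than C#: the standard library's ElementTree stands in for a DOM-style XML API and a dataclass stands in for a strongly typed object. The sample <Person> document, the Person class and the person_from_xml helper are all made up for the example and are not taken from WinFS or the .NET Framework.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

# Hypothetical data-centric XML, modeled on the attribute names in Jon's example.
PERSON_XML = '<Person Title="Dr." FirstName="Ada" MiddleName="King" LastName="Lovelace"/>'


@dataclass
class Person:
    title: str
    first_name: str
    middle_name: str
    last_name: str


def person_from_xml(xml_text: str) -> Person:
    """Map the attributes of a <Person> element onto a typed object."""
    node = ET.fromstring(xml_text)
    return Person(
        title=node.get("Title", ""),
        first_name=node.get("FirstName", ""),
        middle_name=node.get("MiddleName", ""),
        last_name=node.get("LastName", ""),
    )


# XML API style: names are strings, so a misspelled attribute name is only
# noticed at run time, when the lookup quietly returns None.
node = ET.fromstring(PERSON_XML)
full_name_via_xml = f'{node.get("FirstName")} {node.get("LastName")}'

# Object style: names are members that editors and type checkers can verify.
person = person_from_xml(PERSON_XML)
full_name_via_object = f"{person.first_name} {person.last_name}"
```

The mapping is trivial here because the data is rigidly structured. A document with mixed content, such as a paragraph containing inline markup, has no equally obvious set of fields to map onto, which is the impedance mismatch described above.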

Jon says that he is trying to draw the line between interop and functionality. I'm curious as to what he means by interop in this case. The fact that WinFS is based on items, OPath and WinFS schema doesn't mean that WinFS data cannot be exchanged in an interoperable manner (e.g. some form of XML export and import) nor does it mean that non-Microsoft applications cannot interact with WinFS. I should clarify that I have no idea what the WinFS folks consider their primary interop scenarios but I don't think the way WinFS is designed today means it cannot interoperate with other platforms or data models.
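As a rough illustration of interop through XML export and import, here is a minimal Python sketch in which a store keeps typed items internally and only turns them into XML at the boundary. The Contact type, its fields and the export_item and import_item helpers are hypothetical; nothing here reflects the actual WinFS schema or API.

```python
from dataclasses import dataclass, asdict
import xml.etree.ElementTree as ET


@dataclass
class Contact:
    # Hypothetical item type; not an actual WinFS schema.
    display_name: str
    email: str


def export_item(item: Contact) -> str:
    """Serialize a stored item to an XML string for exchange."""
    element = ET.Element("Contact")
    for field_name, value in asdict(item).items():
        ET.SubElement(element, field_name).text = value
    return ET.tostring(element, encoding="unicode")


def import_item(xml_text: str) -> Contact:
    """Rebuild an item from XML produced by export_item or by another producer."""
    element = ET.fromstring(xml_text)
    return Contact(
        display_name=element.findtext("display_name", default=""),
        email=element.findtext("email", default=""),
    )


original = Contact(display_name="Woodgrove Bank", email="info@woodgrove.example")
wire_format = export_item(original)      # what another platform would receive
round_tripped = import_item(wire_format)  # what comes back into the store
```

The internal data model and the exchange format are independent choices in this sketch, which is the sense of interop used in the paragraph above.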

I suspect that Jon doesn't really mean interop when he says so. I believe he is using the word the same way Java people use it, where it really means 'One Language, One Programming Model, One Platform' everywhere instead of being able to communicate between disparate end points. In this case the language is XML and the platform is the XML family of technologies.


Categories: Life in the B0rg Cube | XML

In a post entitled My comments on the Infoworld article "Databases flex their XML" Michael Rys writes

Sean McCown wrote this analysis (PDF version) in Apr 2004. In the article, he compares the XML capabilities of the 4 major relational database systems (comparing publicly available versions) both in terms of functionality, ease, flexibility and speed, and adds a sidebar on Yukon. Before I start giving my comments on the article, let me disclose that I talked to Sean during his research for the article and answered his questions on SQL Server 2000 and Yukon. Thus, some of the comments below are just my attempts to make Sean's translation of my answers clearer, because I was not answering his questions clear enough :-).

Michael then goes on to clarify various points around the terminology used in the article, XQuery and SQL Server. Both Sean's article and Michael's followup are excellent reading for anyone interested in the growing trend of XML-enabled relational databases and how the big 3 relational database vendors stack up.


Categories: XML

I was pleasantly surprised today when I logged in to my Yahoo! Mail account and found out they've made good on their promise and now my mailbox size has gone up to 100MB from 6MB. I hope the folks at Hotmail are paying attention and upgrade the measly 2MB of space that they currently allocate to their free users.


Categories: Ramblings

June 14, 2004
@ 09:53 AM

For the past few weeks my friend Chris has had an open invitation for me to play Settlers of Catan with him and a couple of other guys in the U-district. Today I finally accepted and when I got there it turned out that one of the guys was Evan Martin, one of the devs for LiveJournal. It turns out Evan just graduated from college and this was his last weekend in Seattle before moving to the Bay Area to start work at Google. Once we were introduced he mentioned that he knew of me and in fact that I was the reason he unsubscribed from the atom-syntax mailing list. It seems in one of the early discussions about Atom I wrote something which he felt was a technically valid point but was delivered in a scathing manner (i.e. punctuated with a flame) so he decided to bow out of further discussions about "RSS with different tag names". This reminded me of a comment by Robert Sayre in Joshua's weblog

OTOH, your post was free of insults, hyperbole, and condescension. Dare is usually right when there is an actual technical issue, but we're talking politics

My level of exasperation with a lot of what was going on with the Atom effort made me more scathing than I tend to be in usual email discourse. This is one of the reasons I unsubscribed from the list but it seems I hurt a couple of people's feelings along the way. Sometimes it is easy to forget that the people on the other end of an email thread aren't former denizens of git.talk.flame who relish technical arguments spiced with flame. My apologies to any others that were as significantly affected by my comments.

Anyway, we all (Jag, Chris, Evan and I) played a three hour game of Catan while partaking of some of the nice Bourbon thoughtfully provided by Chris. Evan seems like he would have been a decent guy to talk to about blogging and syndication related technologies. I hope he enjoys his new job at Google.

TiVo calls...


Categories: Ramblings

June 13, 2004
@ 04:10 PM

I just found out that Lloyd Banks is about to drop an album, Hunger For More, all I can say is G-G-G-G-G-Unit. Cop that shit.

By the way if you haven't copped Twista's Kamikaze, you should. It's not as gangsta as Adrenaline Rush, instead it's more radio friendly, but still off the chain. Almost every track sounds good enough to be a single, definitely all killer no filler.


Categories: Ramblings

I recently read a post by Jeff Dillon (a Sun employee) entitled .NET and Mono: The libraries where he criticizes the fact that the .NET Framework has Windows specific APIs. Specifically he writes

Where this starts to fall apart is with the .NET and Mono libraries. The Java API writers have always been very careful not to introduce an API which does not make sense on all platforms. This makes Java extremely portable at the cost of not being able to do native system programming in pure Java. With .NET, Microsoft went ahead and wrote all kinds of APIs for accessing the registry, accessing COM objects, changing NTFS file permissions, and other very windows specific tasks. In my mind, this immediately eliminates .NET or Mono from ever being a purely system independent platform.

While I was still digesting his comments and considering a response I read an excellent followup by Miguel De Icaza in his post On .NET and portability where he writes

First lets state the obvious: you can write portable code with C# and .NET (duh). Our C# compiler uses plenty of .NET APIs and works just fine across Linux, Solaris, MacOS and Windows. Scott also pointed to nGallery 1.6.1 Mono-compliance post which has some nice portability rules.
It is also a matter of how much your application needs to integrate with the OS. Some applications needs this functionality, and some others do not.

If my choice is between a system that does not let me integrate with the OS easily or a system that does, I personally rather use the later and be responsible for any portability issues myself. That being said, I personally love to write software that takes advantage of the native platform am on, specially on the desktop.

At first I was confused by Jeff's post given that it assumes that the primary goal of the .NET Framework is to create a Write Once Run Anywhere platform. It's been fairly obvious from all the noise coming out of Redmond about WinFX that the primary goal of the .NET Framework is to be the next generation Windows programming API which replaces Win32. By the way check out the WinFX overview API as JPG or WinFX API Overview as PDF. Of course, this isn't to say that Microsoft isn't interested in creating an interoperable managed platform, which is why there has been ECMA standardization of C#, the Common Language Infrastructure (CLI) and the Base Class Library (BCL). The parts of the .NET Framework that are explicitly intended to be interoperable across platforms are all parts of the ECMA standardization process. That way developers can have their cake and eat it too: a managed API that takes full advantage of their target platform and a subset of this API which is intended to be interoperable and is standardized through the ECMA process.
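One common way to combine native integration with portability is to keep the platform-specific choice behind a small portable function. The sketch below shows that pattern in Python rather than C#; the settings_dir helper and the application name are invented for the example, and the directory conventions it encodes are assumptions about typical desktop setups, not anything prescribed by .NET or Mono.

```python
import os
import sys
from pathlib import Path


def settings_dir(app_name: str) -> Path:
    """Return a per-user settings directory, using native conventions where known."""
    home = Path(os.path.expanduser("~"))
    if sys.platform == "win32":
        # Windows-specific branch: prefer the per-user roaming profile.
        base = Path(os.environ.get("APPDATA", str(home)))
    elif sys.platform == "darwin":
        # macOS convention for per-user application data.
        base = home / "Library" / "Application Support"
    else:
        # Fallback that follows the XDG convention used on Linux desktops.
        base = Path(os.environ.get("XDG_CONFIG_HOME", str(home / ".config")))
    return base / app_name


# Callers depend only on the portable signature, not on the branch taken.
example_dir = settings_dir("rss-bandit-sketch")
```

Code that needs deeper integration with one operating system can live in the platform-specific branch, while everything that calls settings_dir stays portable.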

Now that I think about it I realize that folks like Jeff probably have no idea what is going on in .NET developer circles and assume that the goals of Microsoft with the .NET Framework are the same as that of Sun with Java. That explains why he positions what many see as a flaw of the Java platform as a benefit that Microsoft has erred in not repeating. I guess one man's meat is another man's poison.  


Categories: Technology

In a recent post entitled 15 Science Street Tim Bray, one of the inventors of XML, writes

Microsoft’s main talking point (I’m guessing here from the public documents) was that their software and format had the advantage that in WordML you can edit documents from arbitrary schemas.

Our pushback on that was that editing arbitrary-schema documents is damn hard and damn expensive and has never been anything more than a niche business.

which seems not to jibe with my experiences. Many businesses have XML formats specific to their target industry (LegalXML, HR-XML, FpML, etc.) and many businesses use office productivity suites to create and edit documents. It seems very logical to expect that people would like to use their existing spreadsheet and word processing applications to edit their business documents instead of using XML editors or specialized tools. More interestingly, Tim Bray contradicts his position that editing user-defined schemas is a niche scenario when he writes

As we were winding up, a couple of really smart people (don’t know who they were) put up their hands and asked real good questions. The best was essentially “What would you like to see happen?” After some back and forth, I ended up with “You should have the right to own your own information. It’s your intellectual capital and you worked hard to produce it for your citizens. Sun doesn’t own it, Microsoft doesn’t own it, you own it, and that means it should be living in a nice, long-lived, non-proprietary data format that isn’t anyone’s competitive weapon.”

He took the words right out of my mouth. This is exactly what Microsoft has done with Office 2003 by allowing users to edit documents in XML formats of their choosing. In the letter Bringing the XML Vision to the Desktop with Office 2003 written by Jean Paoli of Microsoft (also a co-inventor of XML) he writes

an even greater and more innovative benefit is the fact that companies can now create their own XML schemas specific to their business, define the structure and type of data that each data element in a document contains and exchange information with customers and business partners more easily. This capability opens up a whole new realm of possibilities, not only for end users, but also for the business itself because now organizations can capture and reuse critical information that in the past has been lost or gone unused. 

Office 2003 is a great step forward in enabling businesses and end users to harness the power of XML in typical document interchange scenarios. Arguments about whether you should use Sun's XML format or Microsoft's XML format aren't the point. The point is which tools allow you to use your XML format with the most ease.




Categories: XML

I recently wrote that I want to make RSS Bandit compete more with commercial aggregators, which elicited a comment asking what exactly this means. Primarily it means that it is my intention that we should support what I consider to be the three primary differentiating features of the commercial desktop aggregators I've seen (NetNewsWire, FeedDemon and NewzCrawler). The features are

  1. Newspaper Views: FeedDemon has the ability to display news items in a newspaper view, which is a feature that Torsten batted around a few months ago but decided not to do because we didn't think it was that useful. However now that I read a number of feeds that tend to publish 30 - 50 items a day, being able to view the entries in a single page actually would be useful. My goal is for this feature to be 100% compatible with FeedDemon newspaper views, meaning that you can use existing FeedDemon newspapers such as Radek's newspaper views for FeedDemon with RSS Bandit.

  2. WYSIWYG Weblog Editor: This feature was on my old RSS Bandit wishlist but I never got around to implementing it because of my displeasure with the MetaWeblog API. I've been waiting for the Atom project to produce a SOAP based API with built in authentication that would be widely supported by blogging tools before implementing this feature but it is now clear that such a specification won't be finalized anytime soon.  Since I don't do much GUI work I'll definitely need help from either Torsten or Phil with getting this done.

  3. NNTP Support: The promise of providing a uniform interface to various discussion forums whether they are Web based discussions exposed via RSS or in USENET is too attractive to pass up.
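The uniform interface in item 3 could look roughly like the following Python sketch, which maps an RSS item and a USENET article onto one common shape. This is not RSS Bandit code: the Post type, the two converter functions and the sample data are hypothetical, and fetching over HTTP or NNTP is left out so that only the mapping is shown.

```python
from dataclasses import dataclass
from email import message_from_string
import xml.etree.ElementTree as ET


@dataclass
class Post:
    """Common shape for an entry, whichever protocol it arrived over."""
    source: str
    title: str
    author: str
    body: str


def posts_from_rss(feed_xml: str) -> list[Post]:
    """Convert the <item> elements of an RSS 2.0 document into posts."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        Post(
            source="rss",
            title=item.findtext("title", default=""),
            author=item.findtext("author", default=""),
            body=item.findtext("description", default=""),
        )
        for item in channel.findall("item")
    ]


def post_from_article(article_text: str) -> Post:
    """Convert one USENET article (headers, blank line, body) into a post."""
    message = message_from_string(article_text)
    return Post(
        source="nntp",
        title=message.get("Subject", ""),
        author=message.get("From", ""),
        body=message.get_payload().strip(),
    )


SAMPLE_FEED = """<rss version="2.0"><channel><title>Example</title>
<item><title>Hello</title><author>a@example.com</author>
<description>First item</description></item></channel></rss>"""

SAMPLE_ARTICLE = "From: b@example.com\nSubject: Re: Hello\n\nA reply from USENET.\n"

# Both sources end up in one list that a single UI could render.
timeline = posts_from_rss(SAMPLE_FEED) + [post_from_article(SAMPLE_ARTICLE)]
```

Once both kinds of entry share a shape, the list view, search and read/unread tracking only have to be written once.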

Of course, we will also fix the various bugs and respond to the various feature requests we've gotten from a number of our users. Torsten is currently on vacation and I'll most likely be gone for a week later on this month so development probably won't start in earnest until next month. Until then keep your feedback coming and thanks a lot for using RSS Bandit.


Categories: RSS Bandit

Chris Sells has announced the call for speakers for the Applied XML Developers Conference 5. From his post

Are you interested in presenting a 45-minute talk on some applied XML or Web Services topic? It doesn't matter which platform or OS you're targeting. It also doesn't matter whether you're an author or vendor or professional speaker or a developer in the trenches (in fact, I tend to be biased towards the latter). We're after interesting and unique applications of XML and Web Services technology and if you're doing good work in that area, then I need you to send me a session topic and 2-4 sentence abstract along with a little bit about yourself. I'll be taking submissions 'til the end of June, but don't delay...

...the conference itself is likely to be in Oregon during the 2nd or 3rd week of September, 2004, but we're still working the details out. One of the fun things that we're thinking about this year is to have the Dev.Conf. in Sunriver, Oregon, a resort and spa town in central Oregon where sun is plentiful and rain is scarce.

Previous XML DevCons have had a wide variety of interesting speakers. Unfortunately, the XML DevCon webpage doesn't provide any information on previous conferences. If you are interested in reports on last year's conference just type "XML DevCon" in your favorite Web search engine to locate blog postings from some of the attendees.

I probably won't be at this conference since the focus is usually XML Web Services while my professional interests are in core XML technologies, and working with XML syndication formats is a hobby. However there should be lots of interesting presentations on XML Web Services and other leading edge applications of XML from industry experts if last year's conference is anything to go by.


Categories: XML

June 8, 2004
@ 09:22 AM

Jon Udell has started a series of blog posts about the pillars of Longhorn.  So far he has written Questions about Longhorn, part 1: WinFS and Questions about Longhorn, part 2: WinFS and semantics which ask the key question "If the software industry and significant parts of Microsoft such as Office and Indigo have decided on XML as the data interchange format, why is the next generation file system for Windows basically an object oriented database instead of an XML-centric database?" 

I'd be very interested in what the WinFS folks like Mike Deem would say in response to Jon if they read his blog. Personally, I worry less about how well WinFS supports XML and more about whether it will be fast, secure and failure resistant. After all, at worst WinFS will support XML as well as a regular file system does today which is good enough for me to locate and query documents with my favorite XML query language today. On the other hand, if WinFS doesn't perform well or shows the same good-idea-but-poorly-implemented nature of the Windows registry then it'll be a non-starter or much worse a widely used but often cursed aspect of Windows development (just like the Windows registry).
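For a sense of what locating and querying XML documents on a regular file system involves, here is a minimal Python sketch. It walks a directory tree and runs a query over every XML file it finds. The find_matches helper and the two sample documents are made up, and ElementTree only implements a subset of XPath, so this stands in for a full XPath or XQuery engine rather than replacing one.

```python
from pathlib import Path
import tempfile
import xml.etree.ElementTree as ET


def find_matches(root_dir: Path, xpath: str) -> list[tuple[str, str]]:
    """Run an XPath-subset query over every .xml file under root_dir."""
    hits = []
    for path in sorted(root_dir.rglob("*.xml")):
        try:
            tree = ET.parse(path)
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        for element in tree.getroot().iterfind(xpath):
            hits.append((path.name, element.text or ""))
    return hits


# Demo with two made-up documents in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.xml").write_text("<doc><author>Jon</author><title>WinFS</title></doc>")
    (folder / "b.xml").write_text("<doc><author>Dare</author><title>XML APIs</title></doc>")
    results = find_matches(folder, "./author")
```

Every query in this approach reparses every file, so it works for a handful of documents but says nothing about the indexing and performance that a dedicated store would have to provide.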

As Jon Udell points out, the core scenarios touted for encouraging the creation of WinFS (i.e. search and adding metadata to files) don't really need a solution as complex or as intrusive to the operating system as WinFS. The only justification for something as radical and complex as WinFS is if Windows application developers end up utilizing it to meet their needs. However as an application developer on the Windows platform I primarily worry about three major aspects of WinFS. The first is performance. I definitely think having a query language over an optimized store in the file system is all good but I wouldn't use it if the performance wasn't up to snuff. Secondly I worry about security. Longhorn evangelists like talking up what a wonderful world it would be if all my apps could share their data but ignore the fact that in reality this can lead to disasters. Having multiple applications share the same data store where one badly written application can corrupt the entire store is worrisome. This is the fundamental problem with the Windows registry and to a lesser extent the cause of DLL hell in Windows. The third thing I worry about is that the programming model will suck. An easy to use programming model often trumps almost any problem. Developers prefer building distributed applications using XML Web Services in .NET to the alternatives even though in some cases this choice leads to lower performance. The same developers would rather store information in the registry than come up with a robust alternative on their own because the programming model for the registry is fairly straightforward.

All things said, I think WinFS is an interesting idea. I'm still not sure it is a good idea but it is definitely interesting. Then again given that WinFS assimilated and thus delayed a very good idea from shipping, I may just be a biased SOB.

PS: I just saw that Jeremy Mazner posted a followup to Jon Udell's post entitled Jon Udell questions the value and direction of WinFS where he wrote

XML formats with well-defined, licensed schemas, are certainly a great step towards a world of open data interchange.  But XML files alone don't make it easier for users to find, relate and act on their information. Jon's contention is that full text search over XML files is good enough, but is it really?  I did a series of blog entries on WinFS scenarios back in February, and I don't think's Jon full text search approach would really enable these things. 

Jeremy mostly misses Jon's point which is aptly reduced to a single question at the beginning of this post. Jon isn't comparing full text search over random XML files on your file system to WinFS. He is asking why couldn't WinFS be based on XML instead of being an object oriented database.


Categories: Technology | XML