I'm in the end stages of doing the spec work for the various components in the System.Xml namespace that I am responsible for in the Whidbey betas. After the 4th of July holidays we plan to start initial brainstorming about what feature work we should do in Orcas/Longhorn. I thought it would be valuable to have various users of XML in the .NET Framework suggest what they'd like us to do in the Orcas version of System.Xml. What changes would people like to see? For example, I'm putting Schematron and XPathReader on the 'nice to have' list. No idea is too unconventional since this is the early brainstorming and prototyping phase.

Caveat: The fact that a technology is mentioned as being on our 'nice to have' list or is suggested in a comment to this post is not an indication that it will be implemented in future versions of the .NET Framework.


 

Categories: Life in the B0rg Cube | XML

June 24, 2004
@ 06:21 AM

Few things are as nice as spending all day writing about XML technologies while sipping Belvedere and listening to G-Unit mix tapes interspersed with Bon Jovi's Crossroad album.

Definitely a good day.


 

Categories: Ramblings

June 23, 2004
@ 03:18 PM

From an E! Online article entitled Method Man Raps Fox

The show debuted last Wednesday to an audience of 8 million--a decent showing for a summer series. But critics assailed the sitcom as nothing more than a collection of racial stereotypes.

Method Man says that while the urban comedy retains a certain hip-hop flavor, it's doesn't jibe with the original subversive vision he and Method & Red's producers intended. Aside from the watered-down subject matter, Method Man lashed out at "lame jokes" that have managed to find their way into scripts and he bemoaned the use of a laugh track, which he said he never agreed to.
...
Method launched into his tirade against the Man after hearing an acting coach on local L.A. radio who said the duo's show smacked of "coonery," i.e., the racial stereotypes prevalent in era of Jim Crow, and criticism from such outlets as BET.com, which labeled Method & Red "unfunny" and attacked it for aiding a "downward spiral in black entertainment" by offering "a benign buffoonish broth ready for mainstream consumption."

"I'm no coon," the Soul Plane star vented. "I'm being criticized by people who have never set foot in the ghetto, who have never put up a brick inside the ghetto. I'm from the ghetto. We can't all be the Cosbys. There needs to be a yin and yang as far as what is shown of black people on television. But I don't want us to be used as a scapegoat for their crusades."

Every time I've seen ads for this shitty show I've cringed. Movies like Soul Plane and shows like Method & Red remind me a lot of Chris Rock's famous Niggas vs. Black People skit. I can't wait for this idiotic show to get cancelled.


 

Dave Winer recently wrote

We added a link to a page of encoding examples for descriptions, under Elements of <item>. The change is also noted on the Change Notes page.

I was one of the people that gave feedback on making this clarification to the RSS 2.0 specification and I'm glad it's made it in. Funnily enough, not even a week went by before I needed to forward the link to an RSS feed producer to explain how to properly escape the content in <description> elements. In this case it was the Microsoft Research RSS feeds. It's pretty clear that this clarification was needed if the folks at MSR didn't get it right the first time they took a shot at it.
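To make the escaping issue concrete, here is a minimal sketch of one common approach: entity-encoding the HTML markup so that it travels as plain text inside <description>. It is written in Python purely for illustration; the make_item helper and the sample strings are made up for this example and are not taken from the RSS specification or from the MSR feeds.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def make_item(title, html_body):
    """Build an RSS <item> string with entity-encoded HTML in <description>.

    Hypothetical helper for illustration only. escape() replaces &, < and >
    with entities, so the HTML is carried as text, not as child elements.
    """
    return (
        "<item><title>%s</title><description>%s</description></item>"
        % (escape(title), escape(html_body))
    )


item_xml = make_item("Hello", "<p>Tom & Jerry</p>")

# A consumer that parses the item gets the original HTML string back as the
# text content of <description>, ready to hand to an HTML renderer.
parsed = ET.fromstring(item_xml)
description = parsed.find("description").text
print(item_xml)
print(description)
```

The point of the round trip is that the feed stays well-formed XML no matter what markup the post contains, and the aggregator recovers the HTML unchanged.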


 

From the Showbox calendar

Friday June 25th - House of Blues presents D12 with SLUM VILLAGE and BONE CRUSHER and KING GORDY. $35.00 advance and day of show at TicketsWest and all outlets. Doors at 6PM. All ages. DRESSCODE ENFORCED!

I was planning to be out of town that day but this is a very, very tempting reason to hang around the S-town for an extra day or so. 


 

Categories: Ramblings

The folks at MSDN Chats have organized an online chat session on C# and XML for next month. The participants on the Microsoft side should include myself, Mark Fussell, Erik Meijer, Neetu Rajpal and a couple of folks from the C# team. If you'd like to talk to us about topics surrounding XML and C# then log on to the XML and C# chat session at the following time:

July 8, 2004
1:00 - 2:00 P.M. Pacific time
4:00 - 5:00 P.M. Eastern time
20:00 - 21:00 GMT


On a side note, am I the only one that thinks the MSDN Chats site is crying out for an RSS feed? I definitely would love to add it to the subscriptions list in my favorite news aggregator.


 

Categories: Life in the B0rg Cube | XML

I read Joel Spolsky's How Microsoft Lost the API War yesterday and found it pleasantly coincidental. Some of the issues Joel brings up are questions I've begun asking myself and others at work so I definitely agree with a lot of the sentiments in the article. My main problem with Joel's piece is that it doesn't have a central theme but instead meanders a little and lumps together some related but distinct issues. From where I sit, Joel's article made a few different major & minor points which bear being teased out and talked about separately. The points I found most interesting were

Major Points

  1. The primary value of a platform is how productive it makes developers, its ubiquity and how much of a stable environment it provides for them over time. Microsoft's actions in how it has released both the .NET Framework and its upcoming plans for Longhorn run counter to this conventional wisdom.
  2. Microsoft used to be religious about backwards compatibility, now it no longer is.

Minor Points 

  1. The trend in application development is moving to Web applications instead of desktop applications.
  2. A lot of developers using the .NET Framework use ASP.NET, client developers haven't yet embraced the .NET Framework.
  3. The primary goal of WinFS (making search better) can be achieved by less intrusive, simpler mechanisms.

So now to dive into his points in more detail.

.NET and Longhorn as Fire & Motion

The primary value of a platform is how productive it makes developers, its ubiquity and how much of a stable environment it provides for them over time. Microsoft's actions in how it has released both the .NET Framework and its upcoming plans for Longhorn run counter to this conventional wisdom.

Joel approaches this point from two angles. First of all he rhapsodizes about how the Windows team bends over backwards to achieve backwards compatibility in their APIs even when this means keeping bug compatibility with old versions or adding code to handle specific badly written applications. This means users can migrate applications from OS release to OS release thus widening the pool of applications that can be used per OS. This is in contrast to the actions of competitors like Apple.

Secondly, he argues that Microsoft is trying to force too many paradigm shifts on developers in too short a time. First of all, developers have to make the leap from native code (Win32/COM/ASP/ADO) to managed code (ASP.NET/ADO.NET) but now Microsoft has already telegraphed that another paradigm shift is coming in the shape of Longhorn and WinFX. Even if you've made the leap to using the .NET Framework, Microsoft has already stated that technologies in the next release of the .NET Framework (Winforms, ASP.NET Web Services) are already outclassed by technologies in the pipeline (Avalon, Indigo). However to get these later benefits one not only needs to upgrade the development platform but the operating system as well. This second point bothers me a lot and I actually shot a mail to some MSFT VPs about 2 weeks ago raising a similar point with regards to certain upcoming technologies. I expected to get ignored but actually got a reasonable response from Soma with pointers on folks to have followup discussions with. So the folks above are aware of the concerns in this space. Duh!

The only problem I have with Joel's argument in this regard is that I think he connects the dots incorrectly. He agrees that Windows programming was getting too complex and that years of cruft eventually become difficult to manage. He also thinks the .NET Framework makes developers more productive. So it seems introducing the .NET Framework was the smart thing for Microsoft to do. However he argues that not many people are using it (actually that not many desktop developers are using it). There are two reasons for this which I know first hand as a developer of a desktop application that runs on the .NET Framework (RSS Bandit)

  • The .NET Framework isn't ubiquitous on Windows platforms
  • The .NET Framework does not expose enough Windows functionality to build a full fledged Windows application with only managed code.

Both of these issues are why Microsoft is working on WinFX. Again, the elephant in the living room is that Microsoft's current plans seem to be to fix these issues for developing on Longhorn, not on all supported Windows platforms.

Losing the Backwards Compatibility Religion

Microsoft used to be religious about backwards compatibility, now it no longer is.

Based on my experience as a Program Manager for the System.Xml namespace in the .NET Framework I'd say the above statement isn't entirely accurate. Granted, the .NET Framework hasn't been around long enough to acquire a lot of cruft, but we do already have to be careful about breaking changes. In fact, I'm currently in the process of organizing backing out a change we made in Whidbey to make our W3C XML Schema validation more compliant in a particular case because it broke a number of major XML Web Services on the Web.

However I don't think I've seen anyone go above and beyond to keep bug compatibility in the way Raymond Chen describes in his blog. But then again I don't have insight into what every team working on the .NET

WinFS, Just About Search?

The primary goal of WinFS (making search better) can be achieved by less intrusive, simpler mechanisms.

I've actually had a couple of conversations this week with folks related to WinFS, including Mike Deem and Scoble. We talked about the fact that external perceptions of the whats and whys of WinFS don't really jibe with what's being built. A lot of people think WinFS is about making search better [even a number of Longhorn evangelists and marketing folks]. WinFS is really a data storage and data access platform that aims to enable a lot of scenarios, one of which just so happens to be better search. In addition, whether you improve full text search and the indexing service used by the operating system is really orthogonal to WinFS.

The main problem is that what the WinFS designers think WinFS should be, what customers and competitors expect it to be and what has actually been shown to developers in various public Longhorn builds are all different. It is hard to talk about what WinFS is or should be when everyone's mental image of it is slightly different.

Disclaimer: The above statements are my personal opinions and do not reflect the intentions, strategies, plans or opinions of my employer.


 

A few weeks ago I mentioned I was considering writing about current and future trends in social software, blogging and syndication as a Bill Gates "Think Week" paper. Well, it looks like someone beat me to the punch and already submitted one as part of the most recent "Think Week". The person who submitted the paper shared BillG's comments, which were pretty insightful about some of the issues facing syndication technologies today and some that will loom once their usage becomes more widespread. I have lunch scheduled with the author of the paper so I'll definitely try and exchange some ideas.

In the short term this means I'll probably put any plans of writing such a document on the back burner at least until the end of the summer. Given my current workload at my day job (7 hours of meetings tomorrow, didn't get in until about 11PM tonight) as well as the fact that there is significant work I want to do for the next release of RSS Bandit, this is probably for the best anyway.


 

Categories: Ramblings

June 18, 2004
@ 07:07 AM

My submission on Designing Extensible & Version Resilient XML Formats has been accepted to XML 2004. It looks like I'm going to be in Washington D.C. this fall. Currently I'm in the process of writing an article about the topic of my talk which should show up on MSDN and XML.com in the next month or so. Afterwards I plan to submit a revised version of that article as the paper for my talk.


 

Categories: XML

June 15, 2004
@ 04:17 PM

The ongoing conversation between Jeremy Mazner and Jon Udell about the capabilities of WinFS deepened this morning with Jeremy's post Did I misunderstand Udell's argument against WinFS? which was followed up by Jon's post When a journalist blogs. In his post Jon asks

We have standard query languages (XPath, XQuery), and standard ways of writing schemas (XSD, Relax), and applications (Office 2003) that with herculean effort have been adapted to work with these query and schema languages, and free-text search further enhancing all this goodness. Strategically, why not build directly on top of these foundations?

Tactically, why do I want to write code like this:

public class Person
  {
  [XmlAttribute()] public string Title;
  [XmlAttribute()] public string FirstName;
  [XmlAttribute()] public string MiddleName;
  [XmlAttribute()] public string LastName;
  ....

in order to consume data like this?

<People>
  <Person
    DisplayName="Woodgrove Bank"
    IMAddress="Support@woodgrovebank.com"
    UserTile=".\user_tiles\Adventure Works.jpg">
    <EmailAddresses>
        <EmailAddress
            Type="Work"
            Address="mortgage@woodgrovebank.com"/>
        <EmailAddress
            Type="Primary"
            Address="Support@woodgrovebank.com"/>
   </EmailAddresses>

I believe two things to be true. First, we have some great XML-oriented data management technologies. Second, the ambitious goals of WinFS cannot be met solely with those technologies. I'm trying to spell out where the line is being drawn between interop and functionality, and why, and what that will mean for users, developers, and enterprises.

Jon asks several questions and I'll try to answer all the ones I can. The first question about why WinFS doesn't build on XML, XQuery and XSD instead of items, OPath and the WinFS schema language is something that the WinFS folks will have to answer. Of course, Jon could also ask why it doesn't build on RDF, RDQL [or any of the other RDF query languages] and RDF Schema which is a related question that naturally follows from the answer to Jon's question.

The second question is why one would want to program against a Person object when they have a <Person> element. This question has an easy answer which unfortunately doesn't sit well with me. The fact is that developers prefer programming against objects to programming with XML APIs. No XML API in the .NET Framework (XmlReader, XPathNavigator, XmlDocument, etc) comes close to the ease of use of programming against strongly typed objects in the general case. Addressing this failing [and it is a failing] is directly my responsibility since I'm responsible for core XML APIs in the .NET Framework. Coincidentally, we just had a review with our new general manager yesterday and this same issue came up and he asked what we plan to do about this in future releases. I have some ideas. The main problem with using objects to program against XML is that although objects work well for programming against data-centric XML (rigidly structured tabular data such as the data in an Excel spreadsheet, a database dump or serialized objects) there is a significant impedance mismatch when trying to use strongly typed objects to program against document-centric XML (semi-structured data such as a Word document). However the primary scenarios the WinFS folks want to tackle are about rigidly structured data which works fine with using objects as the primary programming model.
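The difference in feel between the two styles can be sketched as follows. This is an illustration in Python rather than in the .NET APIs named above; the Person and EmailAddress classes and the load_people helper are hypothetical, and the sample document is a trimmed and closed-off version of the fragment Jon quotes.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Trimmed, well-formed version of the quoted sample data.
SAMPLE = """<People>
  <Person DisplayName="Woodgrove Bank" IMAddress="Support@woodgrovebank.com">
    <EmailAddresses>
      <EmailAddress Type="Work" Address="mortgage@woodgrovebank.com"/>
      <EmailAddress Type="Primary" Address="Support@woodgrovebank.com"/>
    </EmailAddresses>
  </Person>
</People>"""


@dataclass
class EmailAddress:
    type: str
    address: str


@dataclass
class Person:
    display_name: str
    im_address: str
    email_addresses: List[EmailAddress]


def load_people(xml_text):
    """Map each <Person> element onto a typed Person object (hypothetical helper)."""
    people = []
    for p in ET.fromstring(xml_text).findall("Person"):
        emails = [
            EmailAddress(e.get("Type"), e.get("Address"))
            for e in p.findall("EmailAddresses/EmailAddress")
        ]
        people.append(Person(p.get("DisplayName"), p.get("IMAddress"), emails))
    return people


# XML API style: string paths and attribute names, nothing checked until runtime.
root = ET.fromstring(SAMPLE)
work_via_xml = root.find(
    "Person/EmailAddresses/EmailAddress[@Type='Work']"
).get("Address")

# Object style: named fields on typed objects.
people = load_people(SAMPLE)
work_via_objects = next(
    e.address for e in people[0].email_addresses if e.type == "Work"
)
```

This works because the sample is data-centric: every Person has the same rigid shape. Mixed content such as a paragraph with inline markup has no equally natural mapping onto fixed fields, which is where the impedance mismatch shows up.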

Jon says that he is trying to draw the line between interop and functionality. I'm curious as to what he means by interop in this case. The fact that WinFS is based on items, OPath and WinFS schema doesn't mean that WinFS data cannot be exchanged in an interoperable manner (e.g. some form of XML export and import) nor does it mean that non-Microsoft applications cannot interact with WinFS. I should clarify that I have no idea what the WinFS folks consider their primary interop scenarios but I don't think the way WinFS is designed today means it cannot interoperate with other platforms or data models.

I suspect that Jon doesn't really mean interop when he says so. I believe he is using the word the same way Java people use it, where it really means 'One Language, One Programming Model, One Platform' everywhere instead of being able to communicate between disparate end points. In this case the language is XML and the platform is the XML family of technologies.


 

Categories: Life in the B0rg Cube | XML

In a post entitled My comments on the Infoworld article "Databases flex their XML" Michael Rys writes

Sean McCown wrote this analysis (PDF version) in Apr 2004. In the article, he compares the XML capabilities of the 4 major relational database systems (comparing publicly available versions) both in terms of functionality, ease, flexibility and speed, and adds a sidebar on Yukon. Before I start giving my comments on the article, let me disclose that I talked to Sean during his research for the article and answered his questions on SQL Server 2000 and Yukon. Thus, some of the comments below are just my attempts to make Sean's translation of my answers clearer, because I was not answering his questions clear enough :-).

Michael then goes on to clarify various points around the terminology used in the article, XQuery and SQL Server. Both Sean's article and Michael's followup are excellent reading for anyone interested in the growing trend of XML-enabled relational databases and how the big 3 relational database vendors stack up.


 

Categories: XML

I was pleasantly surprised today when I logged in to my Yahoo! Mail account and found out they've made good on their promise and now my mailbox size has gone up to 100MB from 6MB. I hope the folks at Hotmail are paying attention and upgrade the measly 2MB of space that they currently allocate to their free users.


 

Categories: Ramblings

June 14, 2004
@ 09:53 AM

For the past few weeks my friend Chris has had an open invitation for me to play Settlers of Catan with him and a couple of other guys in the U-district. Today I finally accepted and when I got there it turned out that one of the guys was Evan Martin, one of the devs for LiveJournal. It turns out Evan just graduated from college and this was his last weekend in Seattle before moving to the Bay Area to start work at Google. Once we were introduced he mentioned that he knew of me and in fact that I was the reason he unsubscribed from the atom-syntax mailing list. It seems in one of the early discussions about Atom I wrote something which he felt was a technically valid point but was delivered in a scathing manner (i.e. punctuated with a flame) so he decided to bow out of further discussions about "RSS with different tag names". This reminded me of a comment by Robert Sayre in Joshua's weblog

OTOH, your post was free of insults, hyperbole, and condescension. Dare is usually right when there is an actual technical issue, but we're talking politics

My level of exasperation with a lot of what was going on with the Atom effort made me more scathing than I tend to be in usual email discourse. This is one of the reasons I unsubscribed from the list but it seems I hurt a couple of people's feelings along the way. Sometimes it is easy to forget that the people on the other end of an email thread aren't former denizens of git.talk.flame who relish technical arguments spiced with flame. My apologies to any others that were as significantly affected by my comments.

Anyway, we all (Jag, Chris, Evan and I) played a three hour game of Catan while partaking of some of the nice Bourbon thoughtfully provided by Chris. Evan seems like he would have been a decent guy to talk to about blogging and syndication related technologies. I hope he enjoys his new job at Google.

TiVo calls...


 

Categories: Ramblings

June 13, 2004
@ 04:10 PM

I just found out that Lloyd Banks is about to drop an album, Hunger For More, all I can say is G-G-G-G-G-Unit. Cop that shit.

By the way if you haven't copped Twista's Kamikaze, you should. It's not as gangsta as Adrenaline Rush, instead it's more radio friendly, but still off the chain. Almost every track sounds good enough to be a single, definitely all killer no filler.


 

Categories: Ramblings

I recently read a post by Jeff Dillon (a Sun employee) entitled .NET and Mono: The libraries where he criticizes the fact that the .NET Framework has Windows specific APIs. Specifically he writes

Where this starts to fall apart is with the .NET and Mono libraries. The Java API writers have always been very careful not to introduce an API which does not make sense on all platforms. This makes Java extremely portable at the cost of not being able to do native system programming in pure Java. With .NET, Microsoft went ahead and wrote all kinds of APIs for accessing the registry, accessing COM objects, changing NTFS file permissions, and other very windows specific tasks. In my mind, this immediately eliminates .NET or Mono from ever being a purely system independent platform.

While I was still digesting his comments and considering a response I read an excellent followup by Miguel De Icaza in his post On .NET and portability where he writes

First lets state the obvious: you can write portable code with C# and .NET (duh). Our C# compiler uses plenty of .NET APIs and works just fine across Linux, Solaris, MacOS and Windows. Scott also pointed to nGallery 1.6.1 Mono-compliance post which has some nice portability rules.
...
It is also a matter of how much your application needs to integrate with the OS. Some applications needs this functionality, and some others do not.

If my choice is between a system that does not let me integrate with the OS easily or a system that does, I personally rather use the later and be responsible for any portability issues myself. That being said, I personally love to write software that takes advantage of the native platform am on, specially on the desktop.

At first I was confused by Jeff's post given that it assumes that the primary goal of the .NET Framework is to create a Write Once Run Anywhere platform. It's been fairly obvious from all the noise coming out of Redmond about WinFX that the primary goal of the .NET Framework is to be the next generation Windows programming API which replaces Win32. By the way check out the WinFX overview API as JPG or WinFX API Overview as PDF. Of course, this isn't to say that Microsoft isn't interested in creating an interoperable managed platform, which is why there has been ECMA standardization of C#, the Common Language Infrastructure (CLI) and the Base Class Library (BCL). The parts of the .NET Framework that are explicitly intended to be interoperable across platforms are all parts of the ECMA standardization process. That way developers can have their cake and eat it too: a managed API that takes full advantage of their target platform and a subset of this API which is intended to be interoperable and is standardized through the ECMA process.

Now that I think about it I realize that folks like Jeff probably have no idea what is going on in .NET developer circles and assume that the goals of Microsoft with the .NET Framework are the same as that of Sun with Java. That explains why he positions what many see as a flaw of the Java platform as a benefit that Microsoft has erred in not repeating. I guess one man's meat is another man's poison.  


 

Categories: Technology

In a recent post entitled 15 Science Street Tim Bray, one of the inventors of XML, writes

Microsoft’s main talking point (I’m guessing here from the public documents) was that their software and format had the advantage that in WordML you can edit documents from arbitrary schemas.

Our pushback on that was that editing arbitrary-schema documents is damn hard and damn expensive and has never been anything more than a niche business.

which seems not to jibe with my experiences. Many businesses have XML formats specific to their target industry (LegalXML, HR-XML, FpML, etc) and many businesses use office productivity suites to create and edit documents. It seems very logical to expect that people would like to use their existing spreadsheet and word processing applications to edit their business documents instead of using XML editors or specialized tools. More interestingly Tim Bray contradicts his position that editing user-defined schemas is a niche scenario when he writes

As we were winding up, a couple of really smart people (don’t know who they were) put up their hands and asked real good questions. The best was essentially “What would you like to see happen?” After some back and forth, I ended up with “You should have the right to own your own information. It’s your intellectual capital and you worked hard to produce it for your citizens. Sun doesn’t own it, Microsoft doesn’t own it, you own it, and that means it should be living in a nice, long-lived, non-proprietary data format that isn’t anyone’s competitive weapon.”

He took the words right out of my mouth. This is exactly what Microsoft has done with Office 2003 by allowing users to edit documents in XML formats of their choosing. In the letter Bringing the XML Vision to the Desktop with Office 2003 written by Jean Paoli of Microsoft (also a co-inventor of XML) he writes

an even greater and more innovative benefit is the fact that companies can now create their own XML schemas specific to their business, define the structure and type of data that each data element in a document contains and exchange information with customers and business partners more easily. This capability opens up a whole new realm of possibilities, not only for end users, but also for the business itself because now organizations can capture and reuse critical information that in the past has been lost or gone unused. 

Office 2003 is a great step forward in enabling businesses and end users to harness the power of XML in typical document interchange scenarios. Arguments about whether you should use Sun's XML format or Microsoft's XML format aren't the point. The point is which tools allow you to use your XML format with the most ease.

 

 


 

Categories: XML

I recently wrote that I want to make RSS Bandit compete more with commercial aggregators which elicited a comment about what exactly this means. Primarily it means that it is my intention that we should support what I consider are the three primary differentiating features of the commercial desktop aggregators I've seen (NetNewsWire, FeedDemon and NewzCrawler). The features are

  1. Newspaper Views: FeedDemon has the ability to display news items in a newspaper view which is a feature that Torsten batted around a few months ago but decided not to do because we didn't think it was that useful. However now that I read a number of feeds that tend to publish 30 - 50 items a day, being able to view the entries in a single page actually would be useful. My goal is for this feature to be 100% compatible with FeedDemon newspaper views meaning that you can use existing FeedDemon newspapers such as Radek's newspaper views for FeedDemon with RSS Bandit.

  2. WYSIWYG Weblog Editor: This feature was on my old RSS Bandit wishlist but I never got around to implementing it because of my displeasure with the MetaWeblog API. I've been waiting for the Atom project to produce a SOAP based API with built in authentication that would be widely supported by blogging tools before implementing this feature but it is now clear that such a specification won't be finalized anytime soon.  Since I don't do much GUI work I'll definitely need help from either Torsten or Phil with getting this done.

  3. NNTP Support: The promise of providing a uniform interface to various discussion forums whether they are Web based discussions exposed via RSS or in USENET is too attractive to pass up.

Of course, we will also fix the various bugs and respond to the various feature requests we've gotten from a number of our users. Torsten is currently on vacation and I'll most likely be gone for a week later on this month so development probably won't start in earnest until next month. Until then keep your feedback coming and thanks a lot for using RSS Bandit.


 

Categories: RSS Bandit

Chris Sells has announced the call for speakers for the Applied XML Developers Conference 5. From his post

Are you interested in presenting a 45-minute talk on some applied XML or Web Services topic? It doesn't matter which platform or OS you're targeting. It also doesn't matter whether you're an author or vendor or professional speaker or a developer in the trenches (in fact, I tend to be biased towards the latter). We're after interesting and unique applications of XML and Web Services technology and if you're doing good work in that area, then I need you to send me a session topic and 2-4 sentence abstract along with a little bit about yourself. I'll be taking submissions 'til the end of June, but don't delay...

...the conference itself is likely to be in Oregon during the 2nd or 3rd week of September, 2004, but we're still working the details out. One of the fun things that we're thinking about this year is to have the Dev.Conf. in Sunriver, Oregon, a resort and spa town in central Oregon where sun is plentiful and rain is scarce.

Previous XML DevCons have had a wide variety of interesting speakers. Unfortunately, the XML DevCon webpage doesn't provide any information on previous conferences. If you are interested in reports on last year's conference just type "XML DevCon" in your favorite Web search engine to locate blog postings from some of the attendees.

I probably won't be at this conference since the focus is usually XML Web Services while my professional interests are in core XML technologies with working with XML syndication formats being a hobby. However there should be lots of interesting presentations on XML Web Services and other leading edge applications of XML from industry experts if last year's conference is anything to go by.


 

Categories: XML

June 8, 2004
@ 09:22 AM

Jon Udell has started a series of blog posts about the pillars of Longhorn.  So far he has written Questions about Longhorn, part 1: WinFS and Questions about Longhorn, part 2: WinFS and semantics which ask the key question "If the software industry and significant parts of Microsoft such as Office and Indigo have decided on XML as the data interchange format, why is the next generation file system for Windows basically an object oriented database instead of an XML-centric database?" 

I'd be very interested in what the WinFS folks like Mike Deem would say in response to Jon if they read his blog. Personally, I worry less about how well WinFS supports XML and more about whether it will be fast, secure and failure resistant. After all, at worst WinFS will support XML as well as a regular file system does today which is good enough for me to locate and query documents with my favorite XML query language today. On the other hand, if WinFS doesn't perform well or shows the same good-idea-but-poorly-implemented nature of the Windows registry then it'll be a non-starter or much worse a widely used but often cursed aspect of Windows development (just like the Windows registry).

As Jon Udell points out, the core scenarios touted to encourage the creation of WinFS (i.e. search and adding metadata to files) don't really need a solution as complex or as intrusive to the operating system as WinFS. The only justification for something as radical and complex as WinFS is if Windows application developers end up utilizing it to meet their needs. However as an application developer on the Windows platform I primarily worry about three major aspects of WinFS. The first is performance. I definitely think having a query language over an optimized store in the file system is all good but I wouldn't use it if the performance wasn't up to snuff. Secondly I worry about security. Longhorn evangelists like talking up what a wonderful world it would be if all my apps could share their data but ignore the fact that in reality this can lead to disasters. Having multiple applications share the same data store where one badly written application can corrupt the entire store is worrisome. This is the fundamental problem with the Windows registry and to a lesser extent the cause of DLL hell in Windows. The third thing I worry about is that the programming model will suck. An easy to use programming model often trumps almost any problem. Developers prefer building distributed applications using XML Web Services in .NET to the alternatives even though in some cases this choice leads to lower performance. The same developers would rather store information in the registry than come up with a robust alternative on their own because the programming model for the registry is fairly straightforward.

All things said, I think WinFS is an interesting idea. I'm still not sure it is a good idea but it is definitely interesting. Then again given that WinFS assimilated and thus delayed a very good idea from shipping, I may just be a biased SOB.

PS: I just saw that Jeremy Mazner posted a followup to Jon Udell's post entitled Jon Udell questions the value and direction of WinFS where he wrote

XML formats with well-defined, licensed schemas, are certainly a great step towards a world of open data interchange.  But XML files alone don't make it easier for users to find, relate and act on their information. Jon's contention is that full text search over XML files is good enough, but is it really?  I did a series of blog entries on WinFS scenarios back in February, and I don't think's Jon full text search approach would really enable these things. 

Jeremy mostly misses Jon's point, which is aptly reduced to a single question at the beginning of this post. Jon isn't comparing full text search over random XML files on your file system to WinFS. He is asking why WinFS couldn't be based on XML instead of being an object oriented database.


 

Categories: Technology | XML

June 6, 2004
@ 04:41 AM

Tim Bray has a post entitled Whiskey-Bar Economics where he writes

As an added bonus, in the comments someone has posted a pointer to this, which (if even moderately accurate) is pretty astounding.

I'm not sure what is pretty astounding about CostOfWar.com. The Javascript on the site seems pretty basic, the core concept behind the site is opportunity cost, which is explained in the freshman economics class at the average college or university, and the numbers from the site actually seem to be lowballed considering all the headlines I seem to read every month about the Bush administration requesting another couple of billion for the Iraq effort. For example, according to a USA Today article entitled Bush to request $25 billion for Iraq war costs, the US Congress had already approved $163 billion for the War on Iraq when another request for $25 billion showed up. Yet at the current time CostOfWar.com claims that the war has cost $116 billion.

On the other hand, I think this is pretty astounding.


 

Categories: Ramblings

June 6, 2004
@ 04:18 AM

One of my friends, Joshua Allen, is a fan of RDF and Semantic Web technologies. Given that I respect his opinion a lot I keep trying to delve into RDF and its family of technologies every couple of months to see what it provides to the world of data access and information interchange above and beyond existing technologies. Recently I discovered that there are some in the RDF camp that position it as a "better XML". The first example of this I saw was an old article by Tim Berners-Lee entitled Why RDF model is different from the XML model. According to Tim the note is an attempt to answer the question, "Why should I use RDF - why not just XML?". However instead of answering the question his note just left me with more questions than answers. The pivotal point for me in Tim Berners-Lee's note is the following excerpt

Things you can do with RDF which you can't do with XML include

  • You can parse the semantic tree, which end up giving you a set of (possibly mutually referential) triples and then you can use the ones you want ignoring the ones you don't understand.

Problems with basing you understanding on the structure include

  • Without having gone to the trouble of getting the schema, or having an application hand-programmed to recognise a particular document type, you can't pick up any semantic information from a document;
  • When an XML schema changes, it could typically introduce new intermediate elements (like "details" in the tree above or "div" is HTML). These may or may not invalidate any query which has been based on the structure of the document.
  • If you haven't gone to the trouble of making a semantic model, then you may not have a well defined one.

It seems that the point being argued is that with RDF you can get more understanding of the information in the document than with just XML. Being that one could consider RDF as just a logical model layered on top of an XML document (e.g. RDF/XML) I find it hard to understand how viewing some XML document through RDF colored glasses buys one so much more understanding of the data.

Recently I discovered a presentation entitled REST, Self-description, and XML by Mark Baker. This presentation discusses the ideas in Tim Berners-Lee's note in more depth and in a way I finally understand. The first key idea in Mark's presentation is the notion of "self describing" data formats which were also covered in Tim Berners-Lee's presentation at WWW2002 entitled Specs Count. The core tenets of "self describing" data formats are covered in slide 10 and slide 11 of Mark's presentation. A "self describing" data format contains all the data needed to figure out how to process the format from publicly accessible specs. For example, an HTTP response tells you the MIME type of the document which can be used to locate the appropriate RFC which governs how the format should be processed. In the case of XML, Tim Berners-Lee states that an HTTP response which returns an XML document either as application/xml or text/xml should be processed according to the rules of the XML and XML namespaces recommendations which state that the identity of an element is determined based on its namespace name. So when processing an XML document, Tim asserts that it is self describing because one can locate the spec for the format from the namespace URI of the root element. Of course, Mark disagrees with this but his reasons for doing so are pedantic spec lawyering. I disagree with it as well but for different reasons. The main reason I disagree with it is because it puts a stake in the ground and says that any XML format on the Web that doesn't use a namespace name for its root element or whose namespace name is not a dereferenceable URI that leads to a spec is broken. This automatically states that XML formats used on the Web today such as RSS 1.0, RSS 2.0, OPML and the Atom 0.3 syndication format are broken.
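To make the "namespace URI of the root element" idea concrete, here is a minimal sketch of how a processor might find that namespace name. The class and method names (RootNamespace, Of) are made up for illustration, and the sample documents in the comments are stripped-down stand-ins rather than real feeds.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, not part of any framework API.
public static class RootNamespace
{
    // Returns the namespace name of the document's root element,
    // or the empty string when the root element is in no namespace.
    public static string Of(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Skips the XML declaration, comments and whitespace
            // and stops on the first content node (the root element).
            reader.MoveToContent();
            return reader.NamespaceURI;
        }
    }
}

// Assumed sample inputs:
//   <feed version="0.3" xmlns="http://purl.org/atom/ns#"/>  gives a namespace name
//   <rss version="2.0"><channel/></rss>                     gives an empty string
```

A format like RSS 2.0 gives such a processor nothing to dereference, which is exactly why the "self describing" rule would label it broken.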

Mark then goes on to state in slide 20 that a problem with XML formats is that one can't arbitrarily extend an XML document without its schema or without breaking some application somewhere. It's unclear as to what he means by the document's schema but I will grant that it is likely that arbitrary additions to the expected content of an XML document will break certain applications. Getting to slide 24, it is slightly clearer what Mark is getting at. He claims that although one can extend a format by adding extra elements from a known namespace using just XML technologies, this doesn't tell you how to deal with the extensions. On the other hand, with RDF the extensions are all concepts named with a URI whose meaning can then be looked up using HTTP GET. This is where he lost me. I don't see the difference between seeing a namespaced XML element in an XML format and using HTTP GET on the namespace URI of the element to locate the spec or schema for the namespaced extension and what he describes as the gains of using RDF.
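As a rough illustration of the plain XML side of that comparison, the sketch below collects the namespace URIs of any elements that don't belong to the core format. The names ExtensionNamespaces and Find are hypothetical. An application could then issue an HTTP GET against each URI it finds, although whether a spec or schema actually lives at that address is up to whoever minted the namespace.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration only.
public static class ExtensionNamespaces
{
    // Lists, in document order and without duplicates, the namespace names
    // of elements that are not in the format's own (core) namespace.
    public static List<string> Find(string xml, string coreNamespace)
    {
        var found = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element
                    && reader.NamespaceURI != coreNamespace
                    && !found.Contains(reader.NamespaceURI))
                {
                    found.Add(reader.NamespaceURI);
                }
            }
        }
        return found;
    }
}
```

For an RSS 2.0 feed the core namespace is the empty string, so a Dublin Core element such as dc:creator would surface its namespace URI as the thing to look up.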

The more I look at how RDF people bag on XML the more it seems that they don't really write applications in today's world. In almost every situation where I've seen someone claim that RDF technologies will in the future be able to solve a problem XML cannot, the problem is not only solvable with XML technologies but is actually being solved using XML technologies today.


 

Categories: XML

One of the more annoying aspects of writing Windows applications using the .NET Framework is that eventually you brush up against the limitations in the APIs provided by the managed classes and end up having to use interop to talk to Win32 or COM-based APIs. This process typically involves exposing native APIs in a manner that makes them look like managed APIs when in fact they are not. When there is an error in this mapping it results in hard-to-track memory corruption errors. All of the fun of annoying C and C++ memory corruption errors in the all new, singing and dancing .NET Framework.

Most recently we were bitten by this in RSS Bandit and probably would never have tracked this problem down if not for a coincidence. As part of forward compatibility testing at Microsoft, a number of test teams run existing .NET applications on current builds of future versions of the .NET Framework. One of these test teams decided to use RSS Bandit as a test application. However it seemed they could never get RSS Bandit to start without the application crashing almost instantly. Interestingly, it crashed at different points in the code depending on whether one compiled and ran the application on the current build of the .NET Framework or just ran an executable compiled against an older version of the .NET Framework on the current build. Bugs were filed against folks on the CLR team and the problem was tracked down.

It turns out that our declaration of the STARTUPINFO struct obtained from PInvoke.NET was incorrect. Specifically the following fields which were declared as

 [MarshalAs(UnmanagedType.LPWStr)] public string  lpReserved;
 [MarshalAs(UnmanagedType.LPWStr)] public string  lpDesktop;
 [MarshalAs(UnmanagedType.LPWStr)] public string  lpTitle;

were declared incorrectly. We should have declared them as

public IntPtr lpReserved; 
public IntPtr lpDesktop; 
public IntPtr lpTitle;

The reason for not declaring them as strings is that the Interop marshaler, after having converted the string to a managed string, will release the native data using CoTaskMemFree. This is clearly not the right thing to do in this case so we need to declare the fields as IntPtrs and then manually marshal them to strings via the Marshal.PtrToStringUni() API.
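For reference, here is a sketch of what the corrected declaration and the manual marshaling might look like. The field layout follows the Win32 STARTUPINFOW structure, but the wrapper class and its method names (StartupInfoReader, ReadString, GetDesktopName) are made up for this example and are not the actual RSS Bandit code.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct STARTUPINFO
{
    public int cb;
    public IntPtr lpReserved;   // IntPtr, so the marshaler never frees the native string
    public IntPtr lpDesktop;
    public IntPtr lpTitle;
    public int dwX;
    public int dwY;
    public int dwXSize;
    public int dwYSize;
    public int dwXCountChars;
    public int dwYCountChars;
    public int dwFillAttribute;
    public int dwFlags;
    public short wShowWindow;
    public short cbReserved2;
    public IntPtr lpReserved2;
    public IntPtr hStdInput;
    public IntPtr hStdOutput;
    public IntPtr hStdError;
}

// Hypothetical wrapper for illustration only.
public static class StartupInfoReader
{
    [DllImport("kernel32.dll", EntryPoint = "GetStartupInfoW")]
    private static extern void GetStartupInfo(out STARTUPINFO startupInfo);

    // Copies a native UTF-16 string into a managed string without
    // taking ownership of (or freeing) the native memory.
    public static string ReadString(IntPtr nativeString)
    {
        return nativeString == IntPtr.Zero ? null : Marshal.PtrToStringUni(nativeString);
    }

    // Windows only: reads the desktop name the current process was started on.
    public static string GetDesktopName()
    {
        STARTUPINFO info;
        GetStartupInfo(out info);
        return ReadString(info.lpDesktop);
    }
}
```

The important part is that the managed side only copies the characters out. The native buffers still belong to the operating system, so nothing on the managed side should try to release them.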

The problem with errors that occur due to such memory corruption issues is that their results are unpredictable. Some users may never witness a crash, while others witness the crash when their machines are under memory pressure or in some cases it crashes right away. Of course, the crash is never in the same place twice. Not only do these problems waste lots of developer time trying to track them down, they also lead to a negative user experience with the target application.

Hopefully, when Longhorn ships and introduces WinFX this class of problem will become a thing of the past. In the meantime, I need to spend some time going over our code that does all the Win32 interop to ensure that there are no other such issues waiting to rear their head.


 

Categories: Technology

I find it interesting how often developers tend to reinvent the wheel because they look at a problem from only one perspective. Today I read a blog post by Sean Gephardt called RSS and syndication Ideas? where he repeats two common misconceptions about RSS and syndication technologies. He wrote

What if I only want certain folks to has access to my RSS?

I could require the end user to signin to my site, then provide them access to my RSS feeds, but then they would be required to sign in everytime they tried to update thier view.

More specifically, how could a company track people that have subscribed to a particular RSS feed once they are viewing it in an aggregator? Obviously, if someone actually views the page referenced, then web site tracking applies, but some aggregators I've seen simply render the contents of the description, which if it contains a URL to somewhere, and the user clicks that link, the reader gets taken over to that URL, bypassing the orignal.

Since there is no security around RSS and aggregrators, and no way to prompt users for say, a Passport authentication, should RSS be used only for "public" information? Do you make people sign in once they try to access the “deeper” content? Do you keep the RSS content limited to help drive people to the “real“ content?

Am I missing something glaringly obvious?

Considering that fetching an RSS feed is simply fetching an XML document over the Web using HTTP and there are existing technologies for authenticating and encrypting HTTP requests, I'd have to say "Yes, you have missed something glaringly obvious Sean". In fact, not only can you authenticate and encrypt RSS feeds with the same authentication means used by the rest of the World Wide Web, aggregators like RSS Bandit already support this functionality. In fact, here is a list of aggregators that support private RSS feeds.
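To show how little is involved, here is a minimal sketch of building an authenticated request for a private feed using HTTP Basic authentication. The class name PrivateFeedRequest is hypothetical and the example uses the HttpClient family of types from current versions of .NET. Basic credentials are only base64 encoded, so this assumes the feed is served over HTTPS.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper for illustration only.
public static class PrivateFeedRequest
{
    // Builds a GET request for the feed that carries HTTP Basic credentials,
    // the same mechanism a browser uses for a password protected page.
    public static HttpRequestMessage Create(string feedUrl, string user, string password)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, feedUrl);
        string token = Convert.ToBase64String(Encoding.UTF8.GetBytes(user + ":" + password));
        request.Headers.Authorization = new AuthenticationHeaderValue("Basic", token);
        return request;
    }
}
```

An aggregator would pass the resulting request to HttpClient.SendAsync and parse the response body as it would any other feed.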

As for how to track readership of content in RSS feeds, a number of tools such as dasBlog and .TEXT already support tracking such statistics using web bugs. One could also utilize alternate approaches if the feeds are private feeds since one could assign a separate URL to each user.
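One way to hand each subscriber a separate URL is sketched below. This is just one possible scheme, not how dasBlog or .TEXT do it: the query parameter names, the PerUserFeedUrl class and the use of an HMAC as the per-user token are all assumptions made for the example.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only.
public static class PerUserFeedUrl
{
    // Derives a feed URL that is unique to one subscriber. The token is an
    // HMAC of the subscriber id, so the server can recompute and verify it
    // without storing a table of issued URLs.
    public static string Build(string baseUrl, string subscriberId, byte[] serverSecret)
    {
        using (var hmac = new HMACSHA256(serverSecret))
        {
            byte[] mac = hmac.ComputeHash(Encoding.UTF8.GetBytes(subscriberId));
            string token = BitConverter.ToString(mac).Replace("-", "").ToLowerInvariant();
            return baseUrl + "?subscriber=" + Uri.EscapeDataString(subscriberId) + "&token=" + token;
        }
    }
}
```

Every fetch of such a URL then shows up in the ordinary web server logs tagged with the subscriber it was issued to.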

All of this is stuff that already works on today's World Wide Web when interacting with HTML and HTTP. It is interesting that some people think that once you swap out HTML with XML, entire new approaches must be built from the ground up.

 


 

Josh Ledgard (who along with his wife Gretchen hosted an excellent barbeque this past Memorial Day weekend) has a post entitled Blogs, Alpha Builds, Customer Community, and Legal Issues where he discusses some of the questions around legal issues some of us have been asking about in the B0rg cube with regards to the growing push to be more open and interact more directly with customers. Josh writes

Blogging Disclaimers

Some Microsofties are now including disclaimer text at the end of each posting in addition to linking to a disclaimer on the sidebar.  A long internal thread went around where “best practice” guidance was given from a member of the legal team that included inserting the disclaimer into every entry as well as in any comment we leave on other blogs. 

The various discussions I've seen around blogging disclaimers often boil down to pointing out that they are unlikely to be useful in preventing real trouble (i.e. some customer who gets pissed at bad advice he gets from a Microsoft employee and decides to sue). Of course, I don't know if this has been tested in court so take my opinions with a grain of salt. I look at disclaimers as a way of separating personal opinion from Microsoft's official position. This leads me to another statement from Josh

From a purely non-legal perspective I would also have to call BS on the standard disclaimer text.

“These postings are provided "AS IS" with no warranties, and confers no rights. The content of this site contains my own personal opinions and does not represent my employer's view in anyway.”

I have no problem with the first sentence, but the second would bother me a bit.  What represents a company better than the collective values and opinions of its employees that are expressed through their blogs. 

I completely disagree with Josh here. I don't believe that my personal opinion and the Microsoft official position are the same thing even though some assume that we are b0rg. Also I want to be able to make it clear when what I am saying is my personal opinion and when what I am saying somewhat reflects an official Microsoft position. For example, I am the program manager responsible for XML schema technologies in the .NET Framework and statements I make around these technologies may be considered by some to be official statements independent of where these statements are made. If I write “RELAX NG is an excellent XML schema language and is superior to W3C XML Schema for validating XML documents in a variety of cases”, some could consider this an indication that Microsoft will start supporting RELAX NG in some of its products. However this would be an inaccurate assumption since that comment is personal opinion and not a reflection of Microsoft's official policy which is unified around using W3C XML Schema for describing the structure and validation of XML documents. I tend to use disclaimers to separate my personal opinions from my statements as a Microsoft employee although a lot of times the lines blur.


 

Categories: Life in the B0rg Cube

As I mentioned yesterday, Doug Purdy posted an insightful entry in response to Ted Neward's post about the inappropriateness of returning ADO.NET DataSets from XML Web Services. Today Ted Neward has a post entitled Why Purchase Orders are the root of all evil? which almost entirely misses the point of Doug's post.

Ted writes

Could you tell me what the schema should be? Doug, it's right there in front of you: the class definition itself is a schema definition for objects of its type. The question I think you mean to ask is, "What the XML schema should be for this Purchase Order?", but I can't do that, because you've already stepped way out into la-la land as far as XML/XSD goes by making use of generic types (like Dictionary) for which there is no XSD equivalent; sure, we can rpc-encode one up, but we're back to turning objects into XML and back again, and I thought we didn't like that....?

Could you tell me what each particle of the schema means? Well, the LineItemAddedEvent certainly isn't a schema construct, so I'm guessing that'll have to be the XML-based representation of a .NET delegate.... the IAddress has no implementation behind it that I can see so once again I'll have to punt....

Oh, I get it... Doug's using one of them anti-pattern thingies to show us what not to do when trying to define types in XML/XSD for use in Web services (or WebServices or web services or however we've decided to spell these silly things anyway).

You're absolutely right, Doug--the way that thing is written, Purchase Orders, while perhaps not the root of ALL evil, are certainly evil and therefore should be banned from the WS-* camp immediately.

Seriously, dude, DataSets as return values from Web services are evil. Get over it.

What I find interesting is that Ted Neward is looking at XML Web Services through the perspective of distributed objects. His entire argument hinges on the fact that his applications convert XML into Java or CLR objects so the XML returned must be something that is conducive to converting to objects easily. Doug accurately points out that there is no one-to-one mapping between an XML schema and a CLR object. Arguing that your favorite platform has one-to-one mappings for some XML schemas and not others, and thus banning various XML formats from participating in XML Web Services, is a very limiting viewpoint. I'd like to ask Ted whether he also would ban XBRL, wordProcessingML or UBL documents from being used in XML Web Services because there aren't easy ways to convert them to a handy, dandy Java object with strongly typed members and all that jazz.

I don't dispute the practical reasons for discouraging developers from returning ADO.NET DataSets from XML Web Services since most developers trying to access the XML Web Services just use a toolkit that pretends you are building distributed object applications. Usually such toolkits either barf horribly when faced with XML they don't grok or force developers to have to deal with scary angle brackets directly instead of the object facade they know & love (ASP.NET XML Web Services included). This is a practical reason to avoid exposing ADO.NET DataSets from XML Web Services that may be accessed from Java platforms especially since such platforms don't make it easy to deal with raw XML.

On the other hand, claiming that there is some philosophical reason not to expose data from an XML Web Service that may be semi-structured and full of unknown data (i.e. XML data) seems quite antithetical to the entire point of XML Web Services and the Service Oriented Architecture fad.
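As a small illustration of consuming such data without an object mapping, the sketch below pulls the values it understands out of a payload with XPath and simply never looks at the rest. The order/item/price vocabulary and the OrderTotals class are invented for the example and don't correspond to any real service.

```csharp
using System.Xml;

// Hypothetical consumer for illustration only.
public static class OrderTotals
{
    // Sums every item price in the payload. Elements the consumer has never
    // heard of (notes, audit trails, vendor extensions) are never selected,
    // so their presence does not break anything.
    public static decimal Sum(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        decimal total = 0;
        foreach (XmlNode price in doc.SelectNodes("//item/price"))
        {
            total += XmlConvert.ToDecimal(price.InnerText);
        }
        return total;
    }
}
```

The trade-off is the one described above: the consumer has to deal with angle brackets directly, but in exchange it doesn't care whether the payload maps cleanly onto a class on its platform.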

 


 

Categories: XML

We have a few open developer and program manager positions on the WebData XML team at Microsoft. Are you interested in working on implementing XML technologies that will impact not only significant aspects of Microsoft but the software industry at large? Would you like to get a chance to collaborate with teams as diverse as Office, Windows (Avalon, Indigo & WinFS), BizTalk, SQL Server and Visual Studio on building the next generation of XML technologies for the Microsoft platform? Do you get passionate about XML or related technologies? If so take a gander at the following open job descriptions on our team and if you believe you qualify send mail to xmljobATmicrosoft.com

See you soon.


 

Categories: Life in the B0rg Cube

It seems every few months there is a series of blog posts or articles about why returning ADO.NET DataSet objects from XML Web Services is a bad idea. I saw the most recent incarnation of this perma-debate in Scott Hanselman's post Returning DataSets from WebServices is the Spawn of Satan and Represents All That Is Truly Evil in the World and Ted Neward's More on why DataSets are the Root of all Evil.

I was going to type up a response to both posts until I saw Doug Purdy's amusing response, PurchaseOrders are the root of all evil, which succinctly points out the flaws in Scott's and Ted's arguments.

Now I'm off to bed.


 

Categories: Mindless Link Propagation | XML

June 3, 2004
@ 07:19 AM

I've been thinking a bit about false goals and software projects. Often decisions are made about the design of a technology or product early in the life of a software project that are based on certain assumptions about the software landscape. However, in many cases these design principles lose relevancy as the project goes on, yet the original design principles of the project are rarely questioned. This leads to members of the project chasing goals that actually aren't beneficial to the product or to its customers and which in fact may be detrimental. These are false goals.

Always remember to question everything.


 

Categories: Ramblings

I just read Tim Bray's entry entitled SOA Talk where he mentions listening to Steve Gillmor, Doc Searls, Jon Udell, Dana Gardner, and Dan Farber talk about SOA via “The Gillmor Gang” at ITConversations. I tried to listen to the radio show a few days ago but had the same problems Tim had. A transcript would definitely be appreciated.

What I found interesting is this excerpt from Tim Bray's blog post

Apparently a recent large-scale survey of professionals revealed that “SOA” has positive buzz and high perceived relevance, while “Web Services” scores very low. Huh?

This is very unsurprising to me. Regular readers of my blog may remember I wrote about the rise of the Service Oriented Architecture fad a few months ago. Based on various conversations with different people involved with XML Web Services and SOA I tend to think my initial observations in that post were accurate. Specifically I wrote

The way I see it the phrase "XML Web Services" already had the baggage of WSDL, SOAP, UDDI, et al so a new buzzphrase was needed that highlighted the useful aspects of "XML Web Services" but didn't tie people to one implementation of these ideas but also adopted the stance that approaches such as CORBA or REST make sense as well.

Of the three words in the phrase "XML Web Services" the first two are implementation specific and not in a good way. XML is a good thing primarily because it is supported by lots of platforms and lots of vendors not because of any inherent suitability of the technology for a number of the tasks people utilize it for. However in situations where this interop is not really necessary then XML is not really a good idea. In the past, various distributed computing aficionados have tried to get around this by talking up the InfoSet which was just a nice way of deprecating the notion of usage of the XML text format everywhere being a good thing. The second word in the phrase is similarly inapplicable in the general case. Most of the people interested in XML Web Services are interested in distributed computing which traditionally and currently is more about the intranet than it is about the internet. The need to justify the Web-like nature of XML Web Services when in truth these technologies probably aren't going to be embraced on the Web in a big way seems to have been a sore point of many discussions in distributed computing circles.

Another reason I see for XML Web Services having negative buzz versus SOA is that when many people think of XML Web Services, they think of overhyped technologies that never delivered such as Microsoft's Hailstorm.  On the other hand, SOA is about applying the experiences of 2 decades of building distributed applications to building such applications today and in the future. Of course, there are folks at Microsoft who are wary of being burned by the hype bandwagon and there've already been some moves by some of the thought leadership to distance what Microsoft is doing from the SOA hype. One example of this is the observation that lots of the Indigo folks now talk about 'Service Orientation' instead of 'Service Oriented Architecture'.

Disclaimer: The above comments do not represent the thoughts, intentions, plans or strategies of my employer. They are solely my opinion.


 

Categories: Technology | XML