I recently read a post by Jeff Dillon (a Sun employee) entitled .NET and Mono: The libraries where he criticizes the fact that the .NET Framework has Windows-specific APIs. Specifically he writes

Where this starts to fall apart is with the .NET and Mono libraries. The Java API writers have always been very careful not to introduce an API which does not make sense on all platforms. This makes Java extremely portable at the cost of not being able to do native system programming in pure Java. With .NET, Microsoft went ahead and wrote all kinds of APIs for accessing the registry, accessing COM objects, changing NTFS file permissions, and other very windows specific tasks. In my mind, this immediately eliminates .NET or Mono from ever being a purely system independent platform.

While I was still digesting his comments and considering a response I read an excellent followup by Miguel De Icaza in his post On .NET and portability where he writes

First lets state the obvious: you can write portable code with C# and .NET (duh). Our C# compiler uses plenty of .NET APIs and works just fine across Linux, Solaris, MacOS and Windows. Scott also pointed to nGallery 1.6.1 Mono-compliance post which has some nice portability rules.
...
It is also a matter of how much your application needs to integrate with the OS. Some applications needs this functionality, and some others do not.

If my choice is between a system that does not let me integrate with the OS easily or a system that does, I personally rather use the later and be responsible for any portability issues myself. That being said, I personally love to write software that takes advantage of the native platform am on, specially on the desktop.

At first I was confused by Jeff's post given that it assumes that the primary goal of the .NET Framework is to create a Write Once Run Anywhere platform. It's been fairly obvious from all the noise coming out of Redmond about WinFX that the primary goal of the .NET Framework is to be the next generation Windows programming API which replaces Win32. By the way, check out the WinFX API overview as JPG or as PDF. Of course, this isn't to say that Microsoft isn't interested in creating an interoperable managed platform, which is why there has been ECMA standardization of C#, the Common Language Infrastructure (CLI) and the Base Class Library (BCL). The parts of the .NET Framework that are explicitly intended to be interoperable across platforms are all part of the ECMA standardization process. That way developers can have their cake and eat it too: a managed API that takes full advantage of their target platform, and a subset of this API which is intended to be interoperable and is standardized through the ECMA process.

Now that I think about it, I realize that folks like Jeff probably have no idea what is going on in .NET developer circles and assume that Microsoft's goals for the .NET Framework are the same as Sun's for Java. That explains why he positions what many see as a flaw of the Java platform as a benefit that Microsoft has erred in not repeating. I guess one man's meat is another man's poison.

Categories: Technology

In a recent post entitled 15 Science Street, Tim Bray, one of the inventors of XML, writes

Microsoft’s main talking point (I’m guessing here from the public documents) was that their software and format had the advantage that in WordML you can edit documents from arbitrary schemas.

Our pushback on that was that editing arbitrary-schema documents is damn hard and damn expensive and has never been anything more than a niche business.

which seems not to jibe with my experience. Many businesses have XML formats specific to their target industry (LegalXML, HR-XML, FpML, etc.) and many businesses use office productivity suites to create and edit documents. It seems very logical to expect that people would like to use their existing spreadsheet and word processing applications to edit their business documents instead of using XML editors or specialized tools. More interestingly, Tim Bray contradicts his position that editing user-defined schemas is a niche scenario when he writes

As we were winding up, a couple of really smart people (don’t know who they were) put up their hands and asked real good questions. The best was essentially “What would you like to see happen?” After some back and forth, I ended up with “You should have the right to own your own information. It’s your intellectual capital and you worked hard to produce it for your citizens. Sun doesn’t own it, Microsoft doesn’t own it, you own it, and that means it should be living in a nice, long-lived, non-proprietary data format that isn’t anyone’s competitive weapon.”

He took the words right out of my mouth. This is exactly what Microsoft has done with Office 2003 by allowing users to edit documents in XML formats of their choosing. In the letter Bringing the XML Vision to the Desktop with Office 2003, Jean Paoli of Microsoft (also a co-inventor of XML) writes

an even greater and more innovative benefit is the fact that companies can now create their own XML schemas specific to their business, define the structure and type of data that each data element in a document contains and exchange information with customers and business partners more easily. This capability opens up a whole new realm of possibilities, not only for end users, but also for the business itself because now organizations can capture and reuse critical information that in the past has been lost or gone unused. 

Office 2003 is a great step forward in enabling businesses and end users to harness the power of XML in typical document interchange scenarios. Arguments about whether you should use Sun's XML format or Microsoft's XML format aren't the point. The point is which tools allow you to use your XML format with the most ease.
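To make that concrete, here is a minimal sketch, using the .NET Framework's XmlValidatingReader, of how an application might validate a business document against a company's own schema. The file names and the urn:example:purchase-order namespace are hypothetical placeholders.

using System;
using System.Xml;
using System.Xml.Schema;

public class ValidatePurchaseOrder
{
    public static void Main()
    {
        // Hypothetical document and schema names for illustration.
        XmlTextReader reader = new XmlTextReader("purchase-order.xml");
        XmlValidatingReader validator = new XmlValidatingReader(reader);
        validator.ValidationType = ValidationType.Schema;
        validator.Schemas.Add("urn:example:purchase-order", "purchase-order.xsd");

        // Report every validation error instead of stopping at the first one.
        validator.ValidationEventHandler += new ValidationEventHandler(OnError);

        while (validator.Read()) { } // the document is validated as it is read
        validator.Close();
    }

    static void OnError(object sender, ValidationEventArgs e)
    {
        Console.WriteLine("Validation error: {0}", e.Message);
    }
}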

Categories: XML

I recently wrote that I want to make RSS Bandit compete more with commercial aggregators, which elicited a comment about what exactly this means. Primarily it means that we should support what I consider the three main differentiating features of the commercial desktop aggregators I've seen (NetNewsWire, FeedDemon and NewzCrawler). The features are

  1. Newspaper Views: FeedDemon has the ability to display news items in a newspaper view, a feature that Torsten batted around a few months ago but decided not to do because we didn't think it was that useful. However, now that I read a number of feeds that tend to publish 30-50 items a day, being able to view the entries in a single page actually would be useful. My goal is for this feature to be 100% compatible with FeedDemon newspaper views, meaning that you can use existing FeedDemon newspapers such as Radek's newspaper views for FeedDemon with RSS Bandit.

  2. WYSIWYG Weblog Editor: This feature was on my old RSS Bandit wishlist but I never got around to implementing it because of my displeasure with the MetaWeblog API (I've sketched what posting with it involves after this list). I've been waiting for the Atom project to produce a SOAP-based API with built-in authentication that would be widely supported by blogging tools before implementing this feature, but it is now clear that such a specification won't be finalized anytime soon. Since I don't do much GUI work I'll definitely need help from either Torsten or Phil with getting this done.

  3. NNTP Support: The promise of providing a uniform interface to various discussion forums, whether they are Web-based discussions exposed via RSS or USENET newsgroups, is too attractive to pass up.
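As promised in the weblog editor item above, here is a minimal sketch of the XML-RPC call a weblog editor makes to create a post with the MetaWeblog API (metaWeblog.newPost). Note that the username and password travel in the clear on every request unless the endpoint uses HTTPS, which is part of my displeasure with the API; the endpoint URL and credentials below are hypothetical.

using System;
using System.IO;
using System.Net;
using System.Text;

public class MetaWeblogNewPost
{
    public static void Main()
    {
        // metaWeblog.newPost(blogid, username, password, struct, publish)
        string payload =
            "<?xml version=\"1.0\"?>" +
            "<methodCall><methodName>metaWeblog.newPost</methodName><params>" +
            "<param><value><string>myblog</string></value></param>" +
            "<param><value><string>user</string></value></param>" +
            "<param><value><string>password</string></value></param>" +
            "<param><value><struct>" +
            "<member><name>title</name><value><string>Hello</string></value></member>" +
            "<member><name>description</name><value><string>First post.</string></value></member>" +
            "</struct></value></param>" +
            "<param><value><boolean>1</boolean></value></param>" +
            "</params></methodCall>";

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://example.com/xmlrpc");
        request.Method = "POST";
        request.ContentType = "text/xml";
        byte[] body = Encoding.UTF8.GetBytes(payload);
        request.ContentLength = body.Length;
        using (Stream s = request.GetRequestStream())
            s.Write(body, 0, body.Length);

        // The response is an XML-RPC methodResponse containing the new post's ID.
        using (StreamReader r = new StreamReader(request.GetResponse().GetResponseStream()))
            Console.WriteLine(r.ReadToEnd());
    }
}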

Of course, we will also fix bugs and respond to the feature requests we've gotten from a number of our users. Torsten is currently on vacation and I'll most likely be gone for a week later this month, so development probably won't start in earnest until next month. Until then, keep your feedback coming and thanks a lot for using RSS Bandit.

Categories: RSS Bandit

Chris Sells has announced the call for speakers for the Applied XML Developers Conference 5. From his post

Are you interested in presenting a 45-minute talk on some applied XML or Web Services topic? It doesn't matter which platform or OS you're targeting. It also doesn't matter whether you're an author or vendor or professional speaker or a developer in the trenches (in fact, I tend to be biased towards the latter). We're after interesting and unique applications of XML and Web Services technology and if you're doing good work in that area, then I need you to send me a session topic and 2-4 sentence abstract along with a little bit about yourself. I'll be taking submissions 'til the end of June, but don't delay...

...the conference itself is likely to be in Oregon during the 2nd or 3rd week of September, 2004, but we're still working the details out. One of the fun things that we're thinking about this year is to have the Dev.Conf. in Sunriver, Oregon, a resort and spa town in central Oregon where sun is plentiful and rain is scarce.

Previous XML DevCons have had a wide variety of interesting speakers. Unfortunately, the XML DevCon webpage doesn't provide any information on previous conferences. If you are interested in reports on last year's conference just type "XML DevCon" in your favorite Web search engine to locate blog postings from some of the attendees.

I probably won't be at this conference since its focus is usually XML Web Services, while my professional interests are in core XML technologies; working with XML syndication formats is a hobby. However, there should be lots of interesting presentations on XML Web Services and other leading edge applications of XML from industry experts, if last year's conference is anything to go by.

Categories: XML

June 8, 2004
@ 09:22 AM

Jon Udell has started a series of blog posts about the pillars of Longhorn.  So far he has written Questions about Longhorn, part 1: WinFS and Questions about Longhorn, part 2: WinFS and semantics which ask the key question "If the software industry and significant parts of Microsoft such as Office and Indigo have decided on XML as the data interchange format, why is the next generation file system for Windows basically an object oriented database instead of an XML-centric database?" 

I'd be very interested in what the WinFS folks like Mike Deem would say in response to Jon if they read his blog. Personally, I worry less about how well WinFS supports XML and more about whether it will be fast, secure and failure resistant. After all, at worst WinFS will support XML as well as a regular file system does today, which is good enough for me to locate and query documents with my favorite XML query language. On the other hand, if WinFS doesn't perform well or shows the same good-idea-but-poorly-implemented nature of the Windows registry, then it'll be a non-starter or, much worse, a widely used but often cursed aspect of Windows development (just like the Windows registry).
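As a sketch of that baseline, locating and querying XML documents on a plain file system today takes little more than a directory scan and an XPath query. The folder and query below are hypothetical.

using System;
using System.IO;
using System.Xml;
using System.Xml.XPath;

public class FindDocuments
{
    public static void Main()
    {
        // Hypothetical query: find documents whose <author> element is "Dare".
        foreach (string file in Directory.GetFiles(@"C:\Documents", "*.xml"))
        {
            try
            {
                XPathDocument doc = new XPathDocument(file);
                XPathNavigator nav = doc.CreateNavigator();
                if (nav.Select("//author[. = 'Dare']").MoveNext())
                    Console.WriteLine("Match: {0}", file);
            }
            catch (XmlException)
            {
                // Skip files that aren't well-formed XML.
            }
        }
    }
}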

As Jon Udell points out, the core scenarios touted as justifying the creation of WinFS (i.e. search and adding metadata to files) don't really need a solution as complex or as intrusive to the operating system as WinFS. The only justification for something as radical and complex as WinFS is if Windows application developers end up utilizing it to meet their needs. However, as an application developer on the Windows platform I primarily worry about three major aspects of WinFS. The first is performance: I definitely think having a query language over an optimized store in the file system is all good, but I wouldn't use it if the performance wasn't up to snuff. Secondly, I worry about security: Longhorn evangelists like to talk up what a wonderful world it would be if all my apps could share their data, but ignore the fact that in reality this can lead to disasters. Having multiple applications share the same data store, where one badly written application can corrupt the entire store, is worrisome. This is the fundamental problem with the Windows registry and, to a lesser extent, the cause of DLL hell in Windows. The third thing I worry about is that the programming model will suck. An easy-to-use programming model often trumps almost any other shortcoming. Developers prefer building distributed applications using XML Web Services in .NET to the alternatives even though in some cases this choice leads to lower performance. The same developers would rather store information in the registry than come up with a robust alternative on their own because the programming model for the registry is fairly straightforward.

All things said, I think WinFS is an interesting idea. I'm still not sure it is a good idea but it is definitely interesting. Then again, given that WinFS assimilated a very good idea and thus delayed it from shipping, I may just be a biased SOB.

PS: I just saw that Jeremy Mazner posted a followup to Jon Udell's post entitled Jon Udell questions the value and direction of WinFS where he wrote

XML formats with well-defined, licensed schemas, are certainly a great step towards a world of open data interchange.  But XML files alone don't make it easier for users to find, relate and act on their information. Jon's contention is that full text search over XML files is good enough, but is it really?  I did a series of blog entries on WinFS scenarios back in February, and I don't think's Jon full text search approach would really enable these things. 

Jeremy mostly misses Jon's point, which is aptly reduced to a single question at the beginning of this post. Jon isn't comparing full text search over random XML files on your file system to WinFS. He is asking why WinFS couldn't be based on XML instead of being an object-oriented database.

Categories: Technology | XML

June 6, 2004
@ 04:41 AM

Tim Bray has a post entitled Whiskey-Bar Economics where he writes

As an added bonus, in the comments someone has posted a pointer to this, which (if even moderately accurate) is pretty astounding.

I'm not sure what is pretty astounding about CostOfWar.com. The Javascript on the site seems pretty basic, the core concept behind the site is opportunity cost, which is explained in the freshman economics class of the average college or university, and the numbers from the site actually seem to be lowballed considering all the headlines I read every month about the Bush administration requesting another couple of billion for the Iraq effort. For example, according to a USA Today article entitled Bush to request $25 billion for Iraq war costs, the US Congress had already approved $163 billion for the war in Iraq when another request for $25 billion showed up. Yet at the current time CostOfWar.com claims that the war has cost $116 billion, which is $47 billion less than what Congress had already approved even before that additional request.

On the other hand, I think this is pretty astounding.

Categories: Ramblings

June 6, 2004
@ 04:18 AM

One of my friends, Joshua Allen, is a fan of RDF and Semantic Web technologies. Given that I respect his opinion a lot I keep trying to delve into RDF and its family of technologies every couple of months to see what it provides to the world of data access and information interchange above and beyond existing technologies. Recently I discovered that there are some in the RDF camp that position it as a "better XML". The first example of this I saw was an old article by Tim Berners-Lee entitled Why RDF model is different from the XML model. According to Tim the note is an attempt to answer the question, "Why should I use RDF - why not just XML?". However instead of answering the question his note just left me with more questions than answers. The pivotal point for me in Tim Berners-Lee's note is the following excerpt

Things you can do with RDF which you can't do with XML include

  • You can parse the semantic tree, which end up giving you a set of (possibly mutually referential) triples and then you can use the ones you want ignoring the ones you don't understand.

Problems with basing you understanding on the structure include

  • Without having gone to the trouble of getting the schema, or having an application hand-programmed to recognise a particular document type, you can't pick up any semantic information from a document;
  • When an XML schema changes, it could typically introduce new intermediate elements (like "details" in the tree above or "div" is HTML). These may or may or may not invalidate any query which has been based on the structure of the document.
  • If you haven't gone to the trouble of making a semantic model, then you may not have a well defined one.

It seems that the point being argued is that with RDF you can get more understanding of the information in a document than with just XML. Given that one could consider RDF as just a logical model layered on top of an XML document (e.g. RDF/XML), I find it hard to understand how viewing some XML document through RDF-colored glasses buys one so much more understanding of the data.

Recently I discovered a presentation entitled REST, Self-description, and XML by Mark Baker. This presentation discusses the ideas in Tim Berners-Lee's note in more depth and in a way I finally understand. The first key idea in Mark's presentation is the notion of "self describing" data formats, which were also covered in Tim Berners-Lee's presentation at WWW2002 entitled Specs Count. The core tenets of "self describing" data formats are covered in slide 10 and slide 11 of Mark's presentation. A "self describing" data format contains all the information needed to figure out how to process the format from publicly accessible specs. For example, an HTTP response tells you the MIME type of the document, which can be used to locate the appropriate RFC which governs how the format should be processed. In the case of XML, Tim Berners-Lee states that an HTTP response which returns an XML document either as application/xml or text/xml should be processed according to the rules of the XML and XML namespaces recommendations, which state that the identity of an element is determined based on its namespace name. So when processing an XML document, Tim asserts that it is self describing because one can locate the spec for the format from the namespace URI of the root element. Of course, Mark disagrees with this, but his reasons for doing so are pedantic spec lawyering. I disagree with it as well, though for different reasons. The main reason I disagree is that it puts a stake in the ground and says that any XML format on the Web that doesn't use a namespace name for its root element, or whose namespace name is not a dereferenceable URI that leads to a spec, is broken. This automatically states that XML formats used on the Web today such as RSS 1.0, RSS 2.0, OPML and the Atom 0.3 syndication format are broken.
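To see what that chain of "self description" looks like in practice, here is a minimal sketch that follows it for an XML document retrieved over HTTP: check the media type from the response, then read the namespace of the root element. The URL is hypothetical.

using System;
using System.Net;
using System.Xml;

public class SelfDescriptionChain
{
    public static void Main()
    {
        // Hypothetical URL for illustration.
        WebResponse response = WebRequest.Create("http://example.com/document.xml").GetResponse();

        // Step 1: the HTTP media type says "process this as XML".
        if (response.ContentType.IndexOf("xml") < 0)
            throw new ApplicationException("Not served as an XML media type");

        // Step 2: per Tim's argument, the root element's namespace URI
        // identifies the format. RSS 2.0 and OPML have no namespace on the
        // root element, which is why the rule declares them broken.
        XmlTextReader reader = new XmlTextReader(response.GetResponseStream());
        reader.MoveToContent();
        Console.WriteLine("Root element: {0}, namespace: '{1}'",
            reader.LocalName, reader.NamespaceURI);
    }
}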

Mark then goes on to state in slide 20 that a problem with XML formats is that one can't arbitrarily extend an XML document without its schema or without breaking some application somewhere. It's unclear what he means by the document's schema, but I will grant that it is likely that arbitrary additions to the expected content of an XML document will break certain applications. Getting to slide 24, it is slightly clearer what Mark is getting at. He claims that although one can extend a format by adding extra elements from a known namespace using just XML technologies, this doesn't tell you how to deal with the extensions. On the other hand, with RDF the extensions are all concepts named with a URI whose meaning can then be looked up using HTTP GET. This is where he lost me. I don't see the difference between seeing a namespaced XML element in an XML format and using HTTP GET on the namespace URI of the element to locate the spec or schema for the namespaced extension, and what he describes as the gains of using RDF.

The more I look at how RDF people bag on XML, the more it seems that they don't really write applications in today's world. In almost every situation where I've seen someone claim that RDF technologies will in the future be able to solve a problem XML cannot, the problem is not only solvable with XML technologies but is actually being solved using XML technologies today.

Categories: XML

One of the more annoying aspects of writing Windows applications using the .NET Framework is that eventually you brush up against the limitations of the APIs provided by the managed classes and end up having to use interop to talk to Win32 or COM-based APIs. This process typically involves exposing native APIs in a manner that makes them look like managed APIs when in fact they are not. When there is an error in this mapping it results in hard-to-track memory corruption errors. All of the fun of annoying C and C++ memory corruption errors in the all new, singing and dancing .NET Framework.

Most recently we were bitten by this in RSS Bandit and probably would never have tracked this problem down if not for a coincidence. As part of forward compatibility testing at Microsoft, a number of test teams run existing .NET applications on current builds of future versions of the .NET Framework. One of these test teams decided to use RSS Bandit as a test application. However it seemed they could never get RSS Bandit to start without the application crashing almost instantly. Interestingly, it crashed at different points in the code depending on whether one compiled and ran the application on the current build of the .NET Framework or just ran an executable compiled against an older version of the .NET Framework on the current build. Bugs were filed against folks on the CLR team and the problem was tracked down.

It turns out that our declaration of the STARTUPINFO struct obtained from PInvoke.NET was incorrect. Specifically, the following fields were declared as

 [MarshalAs(UnmanagedType.LPWStr)] public string  lpReserved;
 [MarshalAs(UnmanagedType.LPWStr)] public string  lpDesktop;
 [MarshalAs(UnmanagedType.LPWStr)] public string  lpTitle;

when they should have been declared as

public IntPtr lpReserved; 
public IntPtr lpDesktop; 
public IntPtr lpTitle;

The reason for not declaring them as strings is that the Interop marshaler, after having converted the string to a managed string, will release the native data using CoTaskMemFree. This is clearly not the right thing to do in this case so we need to declare the fields as IntPtrs and then manually marshal them to strings via the Marshal.PtrToStringUni() API.
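Putting both halves together, here is a sketch of the corrected declaration along the lines described above, with the remaining fields filled in from the Win32 STARTUPINFO definition, and a helper showing the manual marshaling step:

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct STARTUPINFO
{
    public int cb;
    public IntPtr lpReserved;  // IntPtr, not string: the native memory is
    public IntPtr lpDesktop;   // owned by the OS and must not be freed by
    public IntPtr lpTitle;     // the Interop marshaler
    public int dwX;
    public int dwY;
    public int dwXSize;
    public int dwYSize;
    public int dwXCountChars;
    public int dwYCountChars;
    public int dwFillAttribute;
    public int dwFlags;
    public short wShowWindow;
    public short cbReserved2;
    public IntPtr lpReserved2;
    public IntPtr hStdInput;
    public IntPtr hStdOutput;
    public IntPtr hStdError;
}

public class StartupInfoExample
{
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode)]
    static extern void GetStartupInfo(out STARTUPINFO lpStartupInfo);

    public static void Main()
    {
        STARTUPINFO si;
        GetStartupInfo(out si);

        // Manually marshal the Unicode string; we read the native data
        // without taking ownership of it.
        string title = (si.lpTitle == IntPtr.Zero)
            ? null : Marshal.PtrToStringUni(si.lpTitle);
        Console.WriteLine("Startup title: {0}", title);
    }
}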

The problem with errors that occur due to such memory corruption issues is that their results are unpredictable. Some users may never witness a crash, others witness the crash when their machines are under memory pressure, and in some cases it crashes right away. Of course, the crash is never in the same place twice. Not only do these problems waste lots of developer time in tracking them down, they also lead to a negative user experience with the target application.

Hopefully, when Longhorn ships and introduces WinFX this class of problem will become a thing of the past. In the meantime, I need to spend some time going over our code that does all the Win32 interop to ensure that there are no other such issues waiting to rear their heads.

Categories: Technology

I find it interesting how often developers tend to reinvent the wheel because they look at a problem from only one perspective. Today I read a blog post by Sean Gephardt called RSS and syndication Ideas? where he repeats two common misconceptions about RSS and syndication technologies. He wrote

What if I only want certain folks to has access to my RSS?

I could require the end user to signin to my site, then provide them access to my RSS feeds, but then they would be required to sign in everytime they tried to update thier view.

More specifically, how could a company track people that have subscribed to a particular RSS feed once they are viewing it in an aggregator? Obviously, if someone actually views the page referenced, then web site tracking applies, but some aggregators I've seen simply render the contents of the description, which if it contains a URL to somewhere, and the user clicks that link, the reader gets taken over to that URL, bypassing the orignal.

Since there is no security around RSS and aggregrators, and no way to prompt users for say, a Passport authentication, should RSS be used only for "public" information? Do you make people sign in once they try to access the “deeper” content? Do you keep the RSS content limited to help drive people to the “real“ content?

Am I missing something glaringly obvious?

Considering that fetching an RSS feed is simply fetching an XML document over the Web using HTTP, and that there are existing technologies for authenticating and encrypting HTTP requests, I'd have to say "Yes, you have missed something glaringly obvious Sean". Not only can you authenticate and encrypt RSS feeds with the same authentication means used by the rest of the World Wide Web, aggregators like RSS Bandit already support this functionality. In fact, here is a list of aggregators that support private RSS feeds.
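For the record, here is a minimal sketch of what fetching such a private feed looks like from an aggregator's perspective: standard HTTP authentication plus SSL, exactly as a browser would do it for a protected HTML page. The feed URL and credentials are hypothetical.

using System;
using System.Net;
using System.Xml;

public class PrivateFeedFetch
{
    public static void Main()
    {
        // HTTPS encrypts the request; the credentials satisfy the
        // server's standard HTTP authentication challenge.
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create("https://example.com/private/rss.xml");
        request.Credentials = new NetworkCredential("subscriber", "secret");

        XmlDocument feed = new XmlDocument();
        feed.Load(request.GetResponse().GetResponseStream());
        Console.WriteLine("Fetched {0} items",
            feed.SelectNodes("/rss/channel/item").Count);
    }
}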

As for how to track readership of content in RSS feeds: a number of tools, such as dasBlog and .TEXT, already support tracking such statistics using web bugs. One could also use alternate approaches if the feeds are private, since one could assign a separate URL to each user.
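As a sketch of the web bug approach, a feed generator can append a one-pixel tracking image to each item's description; requests for the image then record who read which item. The track.aspx handler and its query parameters are hypothetical.

using System;

public class FeedTracking
{
    // Appends a per-subscriber "web bug" to an item's HTML description.
    public static string AddWebBug(string description, string userId, string itemId)
    {
        string bug = String.Format(
            "<img src=\"http://example.com/track.aspx?user={0}&amp;item={1}\"" +
            " width=\"1\" height=\"1\" />", userId, itemId);
        return description + bug;
    }
}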

All of this is stuff that already works on today's World Wide Web when interacting with HTML and HTTP. It is interesting that some people think that once you swap out HTML for XML, entirely new approaches must be built from the ground up.

Josh Ledgard (who along with his wife Gretchen hosted an excellent barbecue this past Memorial Day weekend) has a post entitled Blogs, Alpha Builds, Customer Community, and Legal Issues where he discusses some of the questions around legal issues that some of us in the B0rg cube have been asking with regard to the growing push to be more open and interact more directly with customers. Josh writes

Blogging Disclaimers

Some Microsofties are now including disclaimer text at the end of each posting in addition to linking to a disclaimer on the sidebar.  A long internal thread went around where “best practice” guidance was given from a member of the legal team that included inserting the disclaimer into every entry as well as in any comment we leave on other blogs. 

The various discussions I've seen around blogging disclaimers often boil down to pointing out that they are unlikely to be useful in preventing real trouble (i.e. some customer who gets pissed at bad advice he gets from a Microsoft employee and decides to sue). Of course, I don't know if this has been tested in court so take my opinions with a grain of salt. I look at disclaimers as a way of separating personal opinion from Microsoft's official position. This leads me to another statement from Josh

From a purely non-legal perspective I would also have to call BS on the standard disclaimer text.

“These postings are provided "AS IS" with no warranties, and confers no rights. The content of this site contains my own personal opinions and does not represent my employer's view in anyway.”

I have no problem with the first sentence, but the second would bother me a bit.  What represents a company better than the collective values and opinions of its employees that are expressed through their blogs. 

I completely disagree with Josh here. I don't believe that my personal opinion and Microsoft's official position are the same thing, even though some assume that we are b0rg. I also want to be able to make it clear when I am stating my personal opinion and when my statements somewhat reflect an official Microsoft position. For example, I am the program manager responsible for XML schema technologies in the .NET Framework, and statements I make about these technologies may be considered by some to be official statements, independent of where they are made. If I write “RELAX NG is an excellent XML schema language and is superior to W3C XML Schema for validating XML documents in a variety of cases”, some could consider this an indication that Microsoft will start supporting RELAX NG in some of its products. However, this would be an inaccurate assumption since that comment is personal opinion and not a reflection of Microsoft's official policy, which is unified around using W3C XML Schema for describing the structure and validation of XML documents. I tend to use disclaimers to separate my personal opinions from my statements as a Microsoft employee, although a lot of the time the lines blur.

Categories: Life in the B0rg Cube