Miguel pointed me to an interesting discussion between Havoc Pennington of RedHat and Paolo Molaro, a lead developer of the Mono project. Although I exchanged mail with Miguel about this thread about a week ago I've been watching the discussion as opposed to directly commenting on it because I've been trying to figure out if this is just a discussion between a couple of Open Source developers or a larger discussion between RedHat and Novell being carried out by proxy.

Anyway, the root of the discussion is Havoc's entry entitled Java, Mono, or C++? where he starts off by pointing out that a number of the large Linux desktop projects are interested in migrating from C/C++ to managed code. Specifically he writes

In the Linux desktop world, there's widespread sentiment that high-level language technologies such as garbage collection, sandboxed code, and so forth would be valuable to have and represent an improvement over C/C++.

Several desktop projects are actively interested in this kind of technology:

  • GNOME: many developers feel that this is the right direction
  • Mozilla: to take full advantage of XUL, it has to support more than just JavaScript
  • OpenOffice.org: has constantly flirted with Java, and is considering using Java throughout the codebase
  • Evolution: has considered writing new code and features in Mono, though they are waiting for a GNOME-wide decision

Just these four projects add up to probably 90% of the lines of code in a Linux desktop built around them

Havoc then makes the argument that the Open Source community will have to make a choice between Java/JVM or C#/CLI. He argues against choosing C#/CLI by saying

Microsoft has set a clever trap by standardizing the core of the CLI and C# language with ECMA, while keeping proprietary the class libraries such as ASP.NET and XAML. There's the appearance of an open managed runtime, but it's an incomplete platform, and no momentum or standards body exists to drive it to completion in an open manner...Even if we use some unencumbered ideas or designs from the .NET world, we should never define our open source managed runtime as a .NET clone.

and argues for Java/JVM by writing

Java has broad industry acceptance, historically driven by Sun and IBM; it's by far the most-used platform in embedded and on the UNIX/Linux enterprise server...One virtue of Java is that it's at least somewhat an open standard; the Java Community Process isn't ideal, but it does cover all the important APIs. The barest core of .NET is an ECMA standard, but the class libraries of note are Microsoft-specific...It's unclear that anyone but Microsoft could have significant influence over the ECMA spec in any case...

Also worth keeping in mind, OO.org is already using Java.

Combining Java and Linux is interesting from another standpoint: it merges the two major Microsoft-alternative platforms into a united front.

At this point it is clear that Havoc does agree with what Miguel and the rest of the Mono folks have been saying for years about needing a managed code environment to elevate the state of the art in desktop application development on UNIX-based Open Source platforms. I completely disagree with him that Sun's JCP process is somehow more of an open standard than ECMA. That just seems absurd. He concludes the article with

What Next?

For some time, the gcj and Classpath teams have been working on an open source Java runtime. Perhaps it's time to ramp up this effort and start using it more widely in free software projects. How long do we wait for a proprietary JDK to become GPL compatible before we take the plunge with what we have?

The first approach I'd explore for GNOME would be Java, but supporting a choice of gcj or IKVM or the Sun/IBM JDKs. The requirement would be that only the least common denominator of these three can be used: only the subset of the Java standard completed in GNU Classpath, and avoiding features specific to one of the VMs. Over time, the least common denominator becomes larger; Classpath's goal is to complete the entire Java standard.

There is also some stuff about needing to come up with an alternative to XAML so that GNOME and co. stay competitive but that just seems like the typical Open Source need to clone everything a proprietary vendor does without thinking it through. There was no real argument as to why he thought it would be a good idea, just a need to play catchup with Microsoft.

Now on to the responses. Paolo has two responses to Havoc's call to action. Both posts argue that technically Mono is as mature as the Open Source Java/JVM projects and has niceties such as P/Invoke that make communication between native and managed code straightforward. His other major point is that if one believes Microsoft will eventually sue the Mono project for violating patents on .NET Framework technologies, there is no reason to believe that Sun would not do the same over Java technologies. Not only has Sun sued before when it felt Java was being threatened (the lengthy lawsuit with Microsoft) but, unlike Microsoft, it has never given any Java technology to a standards body to administer in a royalty-free manner as Microsoft has done with C# and the CLI. Miguel also followed up with his post Java, Gtk and Mono, which shows that it is possible to write Java code against Mono and points out that the choice of language is separate from the choice of runtime (JVM vs. CLI). He also echoes Paolo's sentiments on Sun and Microsoft's behavior with regards to software patents and their technologies in his post On Software Patents.

Havoc has a number of follow-up posts where he points out various other options people have mailed him and explains that his primary worry is that the current state of affairs will lead to fragmentation in the Open Source desktop world. Miguel responds in his post On Fragmentation, reply with the following opening

Havoc, you are skipping over the fact that a viable compromise for the community is not a viable compromise for some products, and hence why you see some companies picking a particular technology as I described at length below.

which I agree with completely. Even if the Open Source community agreed to go with C#/CLI I doubt that Sun would choose anything besides Java for their “Java Desktop System”. If Havoc is saying that having companies like Sun on board with whatever decision he is trying to arrive at is a must, then he's already made the decision to go with Java and the JVM. Given that Longhorn will have managed APIs (aka WinFX), Miguel believes that the ability to migrate from Windows programming to Linux programming [based on Mono] would be huge. I agree; one of the reasons Java became so popular was the ease with which one could migrate from platform to platform and preserve one's knowledge, since Java was somewhat Write Once Run Anywhere (WORA). However, this never extended to building desktop applications, which is the opportunity Miguel is now trying to tap into by pushing Linux desktop development to be based on Mono.

I have no idea how Microsoft would react to the outcome that Miguel envisions but it should be an interesting ride.



Categories: Technology

Aaron Skonnard has a new MSDN magazine article entitled All About Blogs and RSS where he does a good job of summarizing the various XML technologies around weblogs and syndication. It is a very good FAQ and one I definitely will be pointing folks to in future when asked about blogging technologies. 


Categories: Mindless Link Propagation | XML

My recent Extreme XML column entitled Best Practices for Representing XML in the .NET Framework  is up on MSDN. The article was motivated by Krzysztof Cwalina who asked the XML team for design guidelines for working with XML in WinFX. There had been and currently is a bit of inconsistency in how APIs in the .NET Framework represent XML and this is the first step in trying to introduce a set of best practices and guidelines.

As stated in the article there are three primary situations when developers need to consider what APIs to use for representing XML. The situations and guidelines are briefly described below:

  • Classes with fields or properties that hold XML: If a class has a field or property that is an XML document or fragment, it should provide mechanisms for manipulating the property as both a string and as an XmlReader.

  • Methods that accept XML input or return XML as output: Methods that accept or return XML should favor returning XmlReader or XPathNavigator unless the user is expected to be able to edit the XML data, in which case XmlDocument should be used.

  • Converting an object to XML: If an object wants to provide an XML representation of itself for serialization purposes, then it should use the XmlWriter if it needs more control of the XML serialization process than what is provided by the XmlSerializer. If the object wants to provide an XML representation of itself that enables it to participate fully as a member of the XML world, such as allow XPath queries or XSLT transformations over the object, then it should implement the IXPathNavigable interface.

A piece of criticism I got from Joshua Allen was that the guidelines seemed to endorse a number of approaches instead of defining the one true approach. The reason for this is that there isn't one XML API that satisfies the different scenarios described above. In Whidbey we will be attempting to collapse the matrix of choices by expanding the capabilities of XML cursors so that there shouldn't be a distinction between situations where an API exposes an API like XmlDocument or one like XPathNavigator.  

One of the interesting design questions we've gone back and forth on is whether we have both a read-only XML cursor and read-write XML cursor (i.e. XPathNavigator2 and XPathEditor)  or a single XML cursor class which has a flag that indicates whether it is read-only or not (i.e. the approach taken by the System.IO.Stream class which has CanRead and CanWrite properties). In Whidbey beta 1 we've gone with the former approach but there is discussion on whether we should go with the latter approach in beta 2. I'm curious as to which approach developers using System.Xml would favor.
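To make the two options concrete, here is a minimal sketch of both designs. It is written in Python rather than C#, and every class and method name in it (ReadOnlyCursor, EditableCursor, FlaggedCursor, move_next, set_value) is hypothetical, chosen purely for illustration; none of it is the System.Xml API. The cursor walks a plain list instead of an XML tree to keep the example short.

```python
import io


class ReadOnlyCursor:
    """Design 1: the read-only cursor is its own type with no editing methods."""

    def __init__(self, items):
        self._items = items
        self._pos = 0

    def move_next(self):
        """Advance to the next item; return False when already at the end."""
        if self._pos + 1 < len(self._items):
            self._pos += 1
            return True
        return False

    @property
    def value(self):
        return self._items[self._pos]


class EditableCursor(ReadOnlyCursor):
    """Design 1, continued: editing lives on a separate, richer type."""

    def set_value(self, new_value):
        self._items[self._pos] = new_value


class FlaggedCursor(ReadOnlyCursor):
    """Design 2: a single type whose can_edit flag is checked at run time."""

    def __init__(self, items, can_edit=False):
        super().__init__(items)
        self.can_edit = can_edit

    def set_value(self, new_value):
        if not self.can_edit:
            # Same exception Python's own streams raise for unsupported writes.
            raise io.UnsupportedOperation("cursor is read-only")
        self._items[self._pos] = new_value
```

With the first design a caller holding a ReadOnlyCursor simply has no set_value method to call, so misuse shows up early. With the second there is one type to learn, but callers have to check can_edit (or handle the exception) before editing, much as they check CanWrite on a stream.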


Categories: XML

In less than a week we'll be launching the XML Developer Center on MSDN and replacing the site at http://msdn.microsoft.com/xml. The main differences between the XML Developer Center and what exists now will be

  1. The XML Developer Center will provide an entry point to working with XML in Microsoft products such as Office and SQL Server.

  2. The XML Developer Center will have an RSS feed.

  3. The XML Developer Center will pull in content from my work weblog.

  4. The XML Developer Center will provide links to recommended books, mailing lists and weblogs.

  5. The XML Developer Center will have content focused on explaining the fundamentals of the core XML technologies such as XML Schema, XPath, XSLT and XQuery.

  6. The XML Developer Center will provide sneak peeks at advances in XML technologies at Microsoft that will be shipping in future releases of the .NET Framework, SQL Server and Windows.

During the launch the feature article will be the first in a series by Mark Fussell detailing the changes we've made to the System.Xml namespaces in Whidbey. His first article will focus on the core System.Xml classes like XmlReader and XPathNavigator. A follow up article is scheduled that will talk about additions to System.Xml since the last version of the .NET Framework such as XQuery. Finally, either Mark or Matt Tavis will write an article about the changes coming to System.Xml.Serialization such as the various hooks for allowing custom code generation from XML schemas such as IXmlSerializable (which is no longer an unsupported interface) and SchemaImporterExtensions.

I'll also be publishing our guidelines for exposing XML in .NET applications during the launch. If there is anything else you'd like to see on the XML Developer Center let me know.


Categories: XML

Both Dave Walker and Tim Bray state their aggregators of choice barfed when trying to read a post entitled because their aggregators of choice didn't know how to deal with tags in content. Weird. RSS Bandit dealt with it fine. Click below for the screenshot.

Categories: RSS Bandit

I just noticed that Arve Bersvendsen has written a post entitled 11 ways to valid RSS where he states he has seen 11 different ways of providing content in an RSS feed namely

Content in the description element

I have so far identified five different variants of content in the <description> element:

  1. Plaintext as CDATA with HTML entities - Validate
  2. HTML within CDATA - Validate
  3. HTML escaped with entities - Validate
  4. Plain text in CDATA - Validate
  5. Plaintext with inline HTML using escaping - Validate


I have encountered and identified two different ways of using <content:encoded>:

  1. Using entities - Validate
  2. Using CDATA - Validate

XHTML content

Finally, I have encountered and identified four different ways in which people has specified XHTML content:

  1. Using <xhtml:body> - Validate
  2. Using <xhtml:div> - Validate
  3. Using <body> with default namespace - Validate
  4. Using <div> with default namespace - Validate

At first these seem like a lot until you actually try to program against this using an XML parser. In which case, the first thing you notice is that there is no difference programming against CDATA vs. escaped entities since they are both syntactic sugar.  For example, the XML infoset and data models compatible with it such as the XPath data model do not differentiate character content that is written as character references, CDATA sections or entered directly. So the following

    <test><![CDATA[ ]]>2</test>
    <test> 2</test>

are all equivalent. More directly if you loaded all three into an instance of System.Xml.XmlDocument and checked their InnerText property they'd all return the same result. So this reduces Arve's first two elements to

Content in the description element

I have so far identified two (rather than five) different variants of content in the <description> element:

  1. HTML
  2. Plain text


I have encountered and identified one way (rather than two different ways) of using <content:encoded>:

  1. Containing escaped HTML content

If your code makes any distinctions other than these then it is a sign that you have (a) misunderstood how to process RSS or (b) are using a crappy XML parser. When I first started working on RSS Bandit I also was confused by these distinctions but after a while things became clearer. The only problem here is the description element since you can't tell whether it is HTML or not without guessing. Since RSS Bandit always provides the content to an embedded web browser this isn't a problem but I can see how it could be one for aggregators that don't know how to process HTML (although I've never seen one before).
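The point about CDATA sections and escaped entities being indistinguishable after parsing is easy to check. The following is a small sketch in Python using the standard library's xml.etree.ElementTree instead of System.Xml.XmlDocument; the sample markup is made up for illustration and is not taken from Arve's feeds.

```python
import xml.etree.ElementTree as ET

# Three spellings of the same character content: a CDATA section,
# predefined entities, and numeric character references.
variants = [
    "<description><![CDATA[<b>Hello</b> world]]></description>",
    "<description>&lt;b&gt;Hello&lt;/b&gt; world</description>",
    "<description>&#60;b&#62;Hello&#60;/b&#62; world</description>",
]

# After parsing, each element just holds a string; how that string was
# written in the source document is no longer visible to the program.
texts = [ET.fromstring(doc).text for doc in variants]

all_equal = len(set(texts)) == 1
print(texts[0])
print(all_equal)
```

Any code that branches on whether the feed used CDATA or entities is therefore working below the level of the parser, which is exactly the distinction an aggregator should not need to make.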

Another misunderstanding by Arve seems to be how namespaces work in XML. A few years ago I wrote an article, XML Namespaces and How They Affect XPath and XSLT, in which I said

A qualified name, also known as a QName, is an XML name called the local name optionally preceded by another XML name called the prefix and a colon (':') character...The prefix of a qualified name must have been mapped to a namespace URI through an in-scope namespace declaration mapping the prefix to the namespace URI. A qualified name can be used as either an attribute or element name.

Although QNames are important mnemonic guides to determining what namespace the elements and attributes within a document are derived from, they are rarely important to XML aware processors. For example, the following three XML documents would be treated identically by a range of XML technologies including, of course, XML schema validators.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
        <xs:complexType id="123" name="fooType"/>
</xs:schema>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
        <xsd:complexType id="123" name="fooType"/>
</xsd:schema>

<schema xmlns="http://www.w3.org/2001/XMLSchema">
        <complexType id="123" name="fooType"/>
</schema>
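The prefix independence described in that excerpt can be checked with any namespace-aware parser. Here is a rough sketch in Python using xml.etree.ElementTree; the helper name expanded_names is my own invention for this example. ElementTree reports each element as a namespace URI plus a local name and discards the prefix.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# The same schema written with the xs prefix, the xsd prefix,
# and a default namespace declaration.
documents = [
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
    '<xs:complexType id="123" name="fooType"/></xs:schema>',
    '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    '<xsd:complexType id="123" name="fooType"/></xsd:schema>',
    '<schema xmlns="http://www.w3.org/2001/XMLSchema">'
    '<complexType id="123" name="fooType"/></schema>',
]


def expanded_names(xml_text):
    """Return the {namespace}localname of every element in document order."""
    root = ET.fromstring(xml_text)
    return [element.tag for element in root.iter()]


names = [expanded_names(doc) for doc in documents]
same_names = names[0] == names[1] == names[2]
print(names[0])
print(same_names)
```

The same reasoning applies to <xhtml:body> versus <body> with a default namespace: both resolve to the same namespace URI and local name, so a parser sees one element, not two.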

Bearing this information in mind this reduces Arve's example to

XHTML content

Finally, I have encountered and identified two (rather than four) different ways in which people has specified XHTML content:

  1. Using <xhtml:body>
  2. Using <xhtml:div>

Thus with judicious use of an XML parser (which makes sense since RSS is an XML format), Arve's list of eleven ways of providing content in RSS is actually whittled down to five. I assume Arve is unfamiliar with XML processing which led to his initial confusion.

NOTE: Before anyone bothers to start pointing out that Atom somehow frees aggregator authors from this myriad of options I'll point out that Atom has more ways of encoding content than these. Even ignoring the inconsequential differences in syntactic sugar in XML (escaped tags vs. unescaped tags in CDATA sections) the various combinations of the <summary> and <content> elements, the mode attribute (escaped vs. xml) and MIME types (text/plain, text/html, application/xhtml+xml) more than double the number of variations possible in RSS.


Categories: XML

March 16, 2004
@ 05:10 PM

While hanging around TheServerSide.com I discovered a series on Naked Objects. It's an interesting idea that eschews separating application layers in GUIs (via MVC) or server applications (presentation/business logic/data access layers) in favor of coding only domain model objects, which then have a standard GUI autogenerated for them. There are currently five articles in the series, which are listed below along with my initial impressions of each.

Part 1: The Case for Naked Objects: Getting Back to the Object-Oriented Ideal
Part 2: Challenging the Dominant Design of the 4-Layer Architecture
Part 3: Write an application in Java and deploy it on .Net
Part 4: Modeling simultaneously in UML, Java, and User Perspectives
Part 5: Fat is the new Thin: Building Rich Internet Applications with Naked Objects

Part 1 points out that in many N-tier server-side applications there are four layers: persistence, the domain model, the controller and presentation. The author points out that object-relational mapping frameworks are now popular as a mechanism for collapsing the domain model and persistence layer. Naked objects comes from the other angle and attempts to collapse the domain model, control and presentation layers. The article also argues that current application development practices, such as web services and component based architectures which separate data access from the domain model, reduce many of the benefits of object oriented programming.

In a typical naked objects application, the framework uses reflection to determine the methods of an object and render them using a generic graphical user interface. This encourages objects to be 'behaviourally complete': all significant actions that can be performed on the object must exist as methods on the object.
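As a rough illustration of the reflection step, here is a toy sketch in Python. It is not the Naked Objects framework (which is Java); the Customer class, its methods, and the discover_actions and render_menu helpers are all hypothetical names made up for this example, and the "user interface" is just a text menu.

```python
import inspect


class Customer:
    """A hypothetical domain object; every user-visible action is a method."""

    def __init__(self, name):
        self.name = name
        self.bookings = []

    def book_service(self, car, date):
        """Book a car in for a service."""
        self.bookings.append((car, date))

    def sales_history(self):
        """Return the bookings made so far."""
        return list(self.bookings)

    def _recalculate(self):
        """Private helper that should not be offered to the user."""


def discover_actions(obj):
    """Use reflection to find the public methods of a domain object."""
    actions = {}
    for name, member in inspect.getmembers(obj, predicate=inspect.ismethod):
        if not name.startswith("_"):
            actions[name] = member
    return actions


def render_menu(obj):
    """Render a generic text 'UI' listing each action and its parameters."""
    lines = [type(obj).__name__]
    for name, method in discover_actions(obj).items():
        params = ", ".join(inspect.signature(method).parameters)
        lines.append(f"  {name}({params})")
    return "\n".join(lines)


print(render_menu(Customer("Ann")))
```

Nothing in render_menu knows about Customer specifically, which is the whole idea: add a public method to the domain object and it shows up in the generated interface without any presentation code being written.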

The author states that there are six benefits of using naked objects

  • Higher development productivity through not having to write a user interface
  • More maintainable systems through the enforced use of behaviourally-complete objects.
  • Improved usability deriving from a pure object-oriented user interface
  • Easier capture of business requirements because the naked objects would constitute a common language between developers and users.
  • Improved support for test-driven development
  • It facilitates cross-platform development

What I found interesting about the first article in the series is that the author rails against separating the domain model from the data access layer, but it seems naked objects are more about blending the GUI layer with the domain model. There seem to be some missing pieces to the article. Perhaps the implication is that one should use object-relational mapping technologies in combination with naked objects to collapse an application from 4 layers to a single 'behaviorally complete' domain model?

Part 2 focuses on implementing the functionality of a 4 layer application using naked objects. One of the authors had written a tutorial application for a book, which was software for running an auto servicing shop that performed tasks like booking in cars for service and billing the customer. The conclusion after the rewrite was that the naked objects implementation took fewer lines of code and had fewer classes than the previous implementation which had 4 layers. It also took less time to add new functionality, such as obtaining the customer's sales history, to the naked objects implementation than to the 4 layer implementation.

There are caveats. One was that the user interface was not as rich as the one where the developer had an explicit presentation layer as opposed to relying on a generic autogenerated user interface. Also, complex operations such as 'undoing' a business action were not supported in the naked objects implementation.

Part 3 points out that if you write a naked objects implementation targeting Java 1.1 then you can compile it using J# without modification. Thus porting from Java to .NET should be a cinch as long as you use only Java 1.1. Nothing new here.

Part 4 points out that naked objects encourages “code first design” which the authors claim is a good thing. They also point out that if one really wants to get UML diagrams out of a naked objects application, one can use tools like Together which can generate UML from source code.

I'm not sure I agree that banging out code first and writing use cases or design documents afterwards is a software development methodology worth encouraging.

Part 5 trots out the old saw about rich internet applications and how much better they are than the limiting HTML-based browser applications. The author points out that writing a Java applet which uses the naked objects framework gives a richer user experience than an HTML-based application. However, as mentioned in previous articles, you could build an even richer client interface with an explicit presentation layer instead of relying on the generic user interface provided by the naked objects framework.

Interesting ideas. I'm not sure how well they'd scale up to building real-world applications but it is always good to challenge assumptions so developers don't get complacent. 


Categories: Technology

March 15, 2004
@ 04:35 PM

In what seems to be the strangest news story I've read this year, I find out that Sun snatches up XML guru. What I found particularly interesting in the story was the following excerpt

One of the areas Bray expects to work on is developing new applications for Web logs, or "blogs," and the RSS (Resource Description Framework Site Summary) technology that grew out of them. "I think that this is potentially a game-changer in some respects, and there are quite a few folks at Sun who share that opinion," he said.

Though RSS is traditionally thought of as a Web publishing tool, it could be used for much more than keeping track of the latest posts to blogs and Web sites, Bray said. "I would like to have an RSS feed to my bank account, my credit card, and my stock portfolio," he said.

Personally I think it's a waste of Tim Bray's talents having him work on RSS or its competitor du jour, Atom, but it should be fun seeing whether he can get Sun out of its XML funk as well as stop them from spreading poisonous ideas like replacing XML with ASN.1.

Update: Tim Bray has a post about his new job entitled Sunny Boy where he writes

That aside, I’m comfy being officially a direct competitor of Microsoft. On the technical side, I find the APIs inelegant, the UI aesthetics juvenile, and the neglect of the browser maddening.

Sounds like fighting words. This should be fun. :)


Categories: XML

My homegirl, Gretchen Ledgard (y'know Josh's wife), has helped start the Technical Careers @ Microsoft weblog. According to her introductory post you'll find stuff like

  • Explanation of technical careers Microsoft.  What do people really do at Microsoft?  What does a “typical” career path look like?  What can you do to prepare yourself for a career at Microsoft?
  • Sharing of our recruiting expertise.  Learn “trade secrets” from Microsoft recruiters!  What does a good resume look like?  How can you get noticed on the internet?  How should you best prepare for an interview?
  • Information on upcoming Microsoft Technical Recruiting events and programs.

I hope Gretchen dishes up the dirt on how the Microsoft recruiters deal with competition for a candidate such as when a prospective hire also has an offer from another attractive company such as Google. Back in my college days, the company that was most competitive with Microsoft was Trilogy (what a difference a few years make).

I remember when I first got my internship offer and I told my recruiter I also had an offer from i2 Technologies, she quickly whipped out a pen and did the math comparing the compensation I'd get at Microsoft to that I'd get from i2. I eventually picked Microsoft instead of i2 for that summer internship which definitely turned out to be a life altering decision. Ahhh, memories.


After lots of procrastination we now have online documentation for RSS Bandit. As usual, the current table of contents is just a placeholder and the real content is Phil Haack's Getting Started with RSS Bandit. The table of contents for the documentation I plan to write [once the MSDN XML Developer Center launches in about a week or so] is laid out below.

    • Bandit Help
      • Getting Started
        • What is an RSS feed?
        • What is an Atom feed?
        • The RSS Bandit user interface
        • Subscribing to a feed
        • Locating new feeds
        • Displaying feeds
        • Changing the web browser security settings
        • Configuring proxy server settings
      • Using RSS Bandit from Multiple Computers
        • Synchronization using FTP
        • Synchronization using a dasBlog weblog
        • Synchronization using a local or network file
      • Advanced Topics
        • Customizing the Way Feeds Look using XSLT
        • Creating Search Folders
        • Adding Integration with your Favorite Search Engine
        • Building and Using RSS Bandit plugins
      • Frequently Asked Questions
      • How to Provide Feedback
      • Contributing to RSS Bandit

If you are an RSS Bandit user I'd love to get your feedback.


Categories: RSS Bandit