Daniel Cazzulino writes in response to Don Demsak's post on Waking Up From A DOM Induced Coma

So, in this regard, I believe Sun is doing a good job at concentrating on pluggable and standard interfaces and specifications, and letting whoever wants to take the time implement custom stuff.
I don't want to "new XmlTextReader". I want some app/system-wide factory to take care of creating the appropriate parser implementation for me based on declarative configuration, and I want my code to always work against a single unified interface/base class.
Changing the parser shouldn't mean I have to change my working app code. If MS provides the appropriate abstractions, it wouldn't even be necessary to rely on some implementation-specific feature such as XmlTextReader.GetRemainder that is not part of the abstract contract defined by XmlReader.

I both agree and disagree with Daniel. We do have a single unified interface for processing XML which developers can program against; it is called the XmlReader. Unfortunately, we subclassed this class into the XmlTextReader and XmlValidatingReader, which are what most developers actually program against, including our devs internally. In the next version of the .NET Framework we are moving away from the XmlTextReader and XmlValidatingReader. Instead we will emphasize programming directly to the XmlReader and will provide an implementation of the factory design pattern which returns different XmlReader instances based on which features the user is interested in. More importantly, users will be able to layer different XmlReader implementations on those created by our factory, which has been our intention since v1.0 of the .NET Framework. For example, one could layer XSD validation on top of the XIncludingReader from XInclude.NET to combine third party XInclude support with Microsoft's W3C XML Schema validation technologies.
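
The layering idea is not specific to .NET. As a rough analogue only, here is a minimal Java sketch using SAX's XMLFilterImpl, in which one reader wraps another and adjusts the events passing through it. The UpperCaseFilter class and the elementNames helper are made up for illustration; this is not the Whidbey API, just the same shape expressed with SAX.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class LayeredReaders {
    // Hypothetical layer: upper-cases qualified element names, then forwards
    // the event to whatever handler is registered on this filter.
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(uri, localName, qName.toUpperCase(), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(uri, localName, qName.toUpperCase());
        }
    }

    // Parses the document through the layered reader and collects the
    // element names as seen by the application code.
    static List<String> elementNames(String xml) {
        try {
            // Base reader: whatever parser the platform factory hands back.
            XMLReader base = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Layer the filter on top; callers only ever see an XMLReader.
            XMLReader layered = new UpperCaseFilter(base);
            List<String> names = new ArrayList<>();
            layered.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    names.add(qName);
                }
            });
            layered.parse(new InputSource(new StringReader(xml)));
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

The application code only holds an XMLReader reference, so more layers could be stacked in the same way without it noticing.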

As for whether Sun's approach of just providing interfaces instead of concrete implementations for XML parsing was such a great thing in Java, I'd claim that it's been hit and miss. Most XML developers from the Java world despise the DOM for the reasons described in Chapter 33 of Elliotte Rusty Harold's Effective XML. This is the reason for the existence of extensions and alternatives to the DOM API such as Oracle's XDK, dom4J, JDOM, Xerces and XOM. Heck, you can't even get the XML as a string out of a node or save an XML document object to a file without using extensions, since these aren't in the base DOM API. As for SAX, the API just gives you access to regular parsing events, nothing fancy. There isn't much difference functionally between programming against the base SAX APIs and programming against XmlReader.

The one point of interest is that Daniel claims that the Java way of not shipping with any XML APIs but just interfaces is somehow better than the .NET way. In Java one can program against interfaces and load the XML parser by passing the class name to a factory method. One could put this name in a config file and change it at runtime. The question is whether anyone in the .NET world actually thinks being able to change your XML parser implementation at runtime is anything more than a geek feature. I consider it as geeky as asking why you can't change the implementation of the System.String class to a user defined class that uses less memory at runtime without having to recompile. An interesting idea, but one primarily of interest to the ultimate of power users.
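
For readers who haven't seen the Java side, here is a minimal sketch of what programming against the JAXP interfaces looks like. The ElementCounter class and countElements method are names invented for this example. The code never names a concrete parser class; SAXParserFactory.newInstance() resolves one, and the javax.xml.parsers.SAXParserFactory system property is one of the places it looks.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ElementCounter {
    // Counts start-element events using whichever parser JAXP resolves.
    static int countElements(String xml) {
        try {
            // No concrete parser class appears here. The factory lookup
            // consults the javax.xml.parsers.SAXParserFactory system
            // property before falling back to the platform default.
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser parser = factory.newSAXParser();

            final int[] count = {0};
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                                         Attributes atts) {
                    count[0]++;
                }
            });
            return count[0];
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

Swapping the parser then means setting that system property to another factory class name, for example with -D on the command line, rather than touching the code above.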

The funny thing is that even if we shipped functionality where we looked in the registry or in some config file before figuring out what XML parser to load, it's not as if there is an abundance of third party XML parsers targeting the .NET Framework in the first place. There is definitely no intention to ship any functionality like this in future versions of the .NET Framework.


Sunday, February 29, 2004 8:17:24 PM (GMT Standard Time, UTC+00:00)
Actually, the situation in JDK 1.4 is a bit different than Daniel describes. The JDK actually ships with a default parser, and it's somewhat challenging to override the default. This is problematic because people have parsers that they like for one reason or another: because the default has a bug that they can't live with, or they have a parser that performs better for their usage scenario, or they want to plug in a parser that has some feature that the default doesn't have. I saw Stuart Halloway give a presentation on the ClassLoader, and the XML APIs were his prime example of why the 1.4 ClassLoader is a bad design.

As for why you'd actually want to change the implementation, how about for hardware acceleration? I know of at least one company creating hardware cards for XML parsing and XSLT transforms, but I've been told that (seamlessly) integrating this kind of hardware into .NET is difficult with the current architecture. Also, as you noted, the default Java implementation of DOM is just awful to program against, so people create DOM-pluggable APIs that work with DOM-reliant code, but are easier to work with. (BTW, WebMethods' ElectricXML is a pretty nifty XML parsing API for Java as well.)
Monday, March 1, 2004 12:18:42 AM (GMT Standard Time, UTC+00:00)
The question isn't whether you would want to change the implementation of your XML parser but how often you want to change your XML parser at runtime without recompiling. I really see no difference between this and changing the implementation of the String class at runtime via a config file or registry settings instead of recompiling.
Monday, March 1, 2004 7:33:45 PM (GMT Standard Time, UTC+00:00)
I agree and disagree too. Changing a parser is not like changing such a low level feature as a string. And hardware acceleration is certainly a good case. As for Java, I actually wanted to showcase that they focus on interfaces and specs, and they DO offer a default implementation that will suffice for most users. Whether changing this default to a different one is hard or not, I couldn't say because I'm a .NET programmer ;). But from what friends tell me, the direction is towards full developer transparency in the face of implementation swapping. That's what JAXP is about, isn't it?

A more .NET-friendly use case is the new features of ASP.NET: personalization and membership, as well as the new customizable Session state. You have a built-in implementation that I'm willing to bet 95% of developers will use. You don't even need a config to set up the defaults. They come built into machine.config. But if I want to go and change them, I just have to modify a config. It's equally valid to argue that Session is such a low-level and intrinsic feature of ASP.NET that it doesn't make sense to provide a configurable way to change its implementation.

XML parsing, IMO, should be just like that. It's not only that I have to recompile/deploy everything again, but I have to actually find all the places where I'm "newing" the parser and change them. How does this make it at all easy to test several parsers and see which is the best around?

And you're right, there aren't any third party parsers now: the .NET way of using them doesn't encourage this, not to mention .NET's lack of "age" ;). Time will bring these, however, almost for sure. For one, I'm personally evaluating doing so :o)
Monday, March 1, 2004 10:38:39 PM (GMT Standard Time, UTC+00:00)
The GoF factory design pattern is explicitly about replacing new() with calls to special methods that create the right instance of a class for you. In Whidbey we will be moving to encouraging people to program against XmlReader instead of directly against subtypes like XmlValidatingReader or XmlTextReader. People who feel they want to control which XML parser their application uses can create a factory method that returns XmlReader instances based on config file settings.
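
To give a feel for how small such a hand-rolled factory can be, here is a minimal sketch. It is written in Java against the SAX XMLReader interface purely as a stand-in for the .NET XmlReader, and everything named here (ConfiguredReaderFactory, the xml.reader.class key) is hypothetical rather than part of any shipping API.

```java
import java.util.Properties;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class ConfiguredReaderFactory {
    // Hypothetical config key; not part of any real framework.
    static final String READER_CLASS_KEY = "xml.reader.class";

    // Factory method: callers ask for "an XMLReader" and never use new
    // on a concrete parser class themselves.
    static XMLReader create(Properties config) {
        String className = config.getProperty(READER_CLASS_KEY);
        try {
            if (className == null) {
                // Nothing configured: fall back to the platform default.
                return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            }
            // Configured: load the named class, which must implement
            // XMLReader and have a public no-argument constructor.
            return Class.forName(className)
                    .asSubclass(XMLReader.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot create XML reader: " + className, e);
        }
    }
}
```

An application would load the Properties from its own config file once and route every reader creation through create(), so switching implementations becomes a one-line config change.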

You are basically asking Microsoft to write this factory method for you, and I'm replying that it is a niche use case whose opportunity cost is too high from where I sit. Also, based on the experiences of the Java folks (as confirmed by Gordon's post and Bill Dehora's blog entry at http://www.dehora.net/journal/archives/000383.html), it isn't like it's been a resounding success in the Java world which you are asking us to copy.

Thursday, March 4, 2004 8:42:56 PM (GMT Standard Time, UTC+00:00)
Well, you didn't specify which "Factory" GoF pattern you were referring to, but I could guess from the link you used. It's the Factory *Method*. This pattern is usually about creating a single entity (that's why it's a *method*) and generally to avoid otherwise complex creation or tightly coupled (WRT user class) initialization code. The "factory" you're referring to for v2 (unless this has changed since the PDC bits) is the XmlFactory, which is actually a regular class implementing a host of Factory Method creational methods.
Now, do you really think it is useful for someone looking for extensibility? I mean, if I inherit from it, there's nothing virtual to override (i.e., NO way to change ANY actual instance creation)! And it lives in MS.Internal.Xml ?! I assume anything from that namespace will come with the usual documentation "This supports the .NET infrastructure, good luck!", right?
I'd like to point you to a more interesting GoF pattern for a real pluggable approach: http://www.exciton.cs.rice.edu/JavaResources/DesignPatterns/FactoryPattern.htm. The *Abstract* Factory pattern is what an extender would need. Not a base factory with nothing to override, which is all but useless to this end.
The ADO.NET team did a great job at implementing this in Whidbey. A real example of how to achieve practical, flexible, configurable extensibility with this pattern. And I don't think there's ANY significant perf. impact in doing so.
BTW, arguing there aren't third-party parsers is such a weak argument that I almost feel compelled to ignore it. If you have everything closed and hard-wired in the framework (for the reasons I explained above, your factory is really nothing but a helper), how do you expect there to be third-party implementations? I can't find the link now, but there's a project porting Xalan to .NET, and in the mvp-xml SF project I'm going to push for a new implementation too. As a framework designer, you can do nothing but be prepared for changes. Arguing there's no viable choice today is not a valid reason. There may be tomorrow, and it will already be too late for API changes (once more...).
Thursday, March 4, 2004 9:10:04 PM (GMT Standard Time, UTC+00:00)
BTW, the Factory Method intent is:
"Define an interface for creating an object, but let subclasses decide which class to instantiate. Factory Method lets a class defer instantiation to subclasses" (GoF).
If subclasses can't override the base implementation I guess it's not quite an accurate implementation of the pattern... that's why I called it more of a "helper".
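To make the distinction concrete, here is a minimal sketch of Factory Method as that intent describes it. It is in Java only for brevity, and every name in it (Parser, ParserClient, AcceleratedClient) is made up; it says nothing about how XmlFactory is actually written. The point is simply that the creation step is overridable.

```java
class FactoryMethodSketch {
    // Product interface that client code programs against.
    interface Parser {
        String name();
    }

    // Creator: declares the factory method and uses whatever it returns.
    static class ParserClient {
        // The factory method. It is overridable, so subclasses decide
        // which Parser class gets instantiated.
        protected Parser createParser() {
            return () -> "default";
        }

        String describe() {
            return "using " + createParser().name();
        }
    }

    // A subclass changes the instance creation without touching describe().
    static class AcceleratedClient extends ParserClient {
        @Override
        protected Parser createParser() {
            return () -> "accelerated";
        }
    }
}
```

If createParser() were final or static, AcceleratedClient could not exist, and the class would be a creation helper rather than an extension point.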