I Missed My Flight - Dare Obasanjo's weblog

May 26, 2003

@ 12:58 AM

XML Data Access Patterns and Programming Models

Doug Purdy: The Slashdot Problem ( part 1, part 2, part 3)
Doug's series of articles on the appropriately named "Slashdot" problem provides a very interesting look at how people want to interact with XML and XML-based technologies. The Slashdot Problem is the fact that people want dots over XML (i.e. once it is parsed, access XML elements and attributes as if they were the fields or properties of an object) and also want to treat objects like XML (i.e. use XML technologies like XPath for querying them). One can get the former in the .NET Framework via XML Serialization and the latter via custom XPathNavigators like Steve Saxon's ObjectXPathNavigator.
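To make "dots over XML" concrete, here is a minimal sketch in Python rather than .NET. The `DotNode` wrapper and the sample order document are hypothetical, made up for illustration; this is not how XML Serialization works in the .NET Framework, and it ignores XML namespaces and repeated child elements entirely.

```python
import xml.etree.ElementTree as ET


class DotNode:
    """Hypothetical wrapper exposing child elements and attributes via dots."""

    def __init__(self, element):
        self._element = element

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        child = self._element.find(name)
        if child is not None:
            # Elements with children are wrapped again; leaves yield their text.
            return DotNode(child) if len(child) else child.text
        if name in self._element.attrib:
            return self._element.attrib[name]
        raise AttributeError(name)


# Made-up sample document.
order = DotNode(ET.fromstring(
    '<order id="42"><customer><name>Ada</name></customer>'
    '<total>19.99</total></order>'
))

customer_name = order.customer.name   # child element of a child element
order_id = order.id                   # attribute on the root element
```

Everything comes back as a string here, which hints at why a statically typed language needs a schema or a class definition to do the same thing.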
The only nit I have to pick with Doug's series is that in part 3 he doesn't show how using XPath to access objects is any better than directly accessing properties. Steve Saxon nailed it in his guest appearance in my column: it is all about the ability to perform complex queries over object graphs.
The interesting perspective from Doug's series is that providing some of this functionality in a dynamically typed language is actually far easier than doing it in a statically typed one. Of course, there are a number of issues in part 2 that Doug doesn't get into, such as how to deal with XML namespaces or how to get around the fact that Smalltalk is strongly typed. Given that this was a blog post and not an actual technical article or a paper, I didn't expect Doug to exhaustively cover the topic, but I would love to see his ideas on how to deal with either of those issues.
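A rough sketch of what "complex queries over object graphs" buys you, again in Python. Note the swap: instead of a navigator that walks live objects the way ObjectXPathNavigator does, this copies the objects into an element tree and uses the limited XPath subset in Python's `xml.etree.ElementTree`. The `to_element` helper, the `Book` and `Library` classes and the data are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields, is_dataclass


def to_element(tag, value):
    """Hypothetical helper: copy a dataclass/list/scalar into an element tree."""
    elem = ET.Element(tag)
    if is_dataclass(value):
        for f in fields(value):
            elem.append(to_element(f.name, getattr(value, f.name)))
    elif isinstance(value, list):
        for item in value:
            elem.append(to_element(type(item).__name__.lower(), item))
    else:
        elem.text = str(value)
    return elem


@dataclass
class Book:
    title: str
    author: str
    year: int


@dataclass
class Library:
    books: list


library = Library(books=[
    Book("A", "Ada", 2001),
    Book("B", "Bob", 2002),
    Book("C", "Ada", 2003),
])

root = to_element("library", library)

# One declarative path instead of a hand-written loop with an if inside it.
titles = [t.text for t in root.findall(".//book[author='Ada']/title")]
```

With plain property access the same question means writing the traversal and the filter yourself; the query string says what you want and leaves the walking to the engine.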
Harry Pierson: XML Is Not Just a Deserialized Object Graph
Harry Pierson is on the road to XML Zen. However, he has some misconceptions. The first is that he states
Given an object graph, I can serialize it into XML and then deserialize it into an identical object graph. However, I don't think I can take an arbitrary XML document and always be able to deserialize it into an object graph.
which is actually the opposite of the general case. First of all, an XML document is basically a tree of nodes while an object is basically a graph of nodes. A tree is a connected acyclic graph, so it stands to reason that an XML document should be fully representable as objects but not that objects can be fully represented as XML without hacks. Secondly, an object represents the state and behavior of an entity while an XML document typically represents only state. There is no generally accepted way to represent the methods of an object as XML that I have come across. I've seen a couple of proposals but nothing that has gotten any significant mind share even among the XML geekdom.
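The tree-versus-graph point can be sketched in a few lines of Python. Two objects that refer to each other form a cycle, so a naive recursive serializer would never terminate; the usual hack is to hand out ids and emit a reference the second time an object is seen. The `Person` class, the `to_xml` function and the `id`/`ref` attribute names are assumptions made for this illustration, not any standard.

```python
import xml.etree.ElementTree as ET


class Person:
    def __init__(self, name):
        self.name = name
        self.friends = []


def to_xml(person, seen=None):
    """Hypothetical serializer that breaks cycles with id/ref attributes."""
    seen = {} if seen is None else seen
    if id(person) in seen:
        # Already emitted once: point back at it instead of recursing forever.
        return ET.Element("person", ref=seen[id(person)])
    seen[id(person)] = f"p{len(seen) + 1}"
    elem = ET.Element("person", id=seen[id(person)], name=person.name)
    for friend in person.friends:
        elem.append(to_xml(friend, seen))
    return elem


alice, bob = Person("Alice"), Person("Bob")
alice.friends.append(bob)
bob.friends.append(alice)   # the cycle a plain tree cannot express

tree = to_xml(alice)
xml_text = ET.tostring(tree, encoding="unicode")
```

The resulting document is still a tree; the graph only exists for a reader who knows to chase the `ref` attributes, which is exactly the kind of hack meant above. Nothing in it captures the behavior of `Person`, only its state.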

The bigger misconception I'd like to clear up is
XML concepts like derivation by restriction are difficult to model in a strongly typed object model like CLR's
Concepts like derivation by restriction are XSD concepts, not XML concepts. XSD is just one lens you can use to view XML, not the lens through which you must view XML.
Justin Rudd: How Do You Pass XML between layers/tiers? ( part 1, part 2)
Since responsibility for the current and future XML programming model for the .NET Framework lies between Joshua and me, this is something I'd like to officially address with some sort of best practices document or white paper. All the feedback in the discussion threads is stuff that I've noted and will try to ensure we address in the next release of the .NET Framework.
Until then, this post by Justin Rudd is a must-read for anyone who is considering working with the .NET Framework and wondering how to pass XML between application layers.
Ted Neward: Effective Enterprise Java (Persistence): Use a "hierarchical-first" approach to model in documents
Overall, I liked the theme of this post but there were a number of specifics I took issue with. For instance he says
The problem with the hierarchical model at the time was that attempting to find data within it was difficult, since users had to navigate the elements of the tree manually, leaving users to figure out "how", instead of focusing on "what"--that is, how to get to the data, rather than what data they were interested in.
Although this is true, it merely points at a limitation of the primary hierarchical database query languages of the time, like DL/I. There are more pertinent reasons to prefer a relational model to a hierarchical model for data storage, chief of which is that hierarchical models lead to data redundancy and tight coupling between data points. Consider a database of college students and their class schedules. If each student's class information is stored as part of the student, this leads to massive duplication of information, whereas in a relational database one could just normalize the data and use joins as needed when querying. Similarly, if no student is currently enrolled in a particular class, does that mean no information for that class is available in the database, since the class segment always exists as a child of the student segment?
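The student and class example can be sketched with Python's built-in `sqlite3` module. The table names, column names and rows are invented for illustration. The nested dictionary stands in for the hierarchical layout, and the three tables are the normalized version.

```python
import sqlite3

# Hierarchical layout: class details are repeated under every student, and a
# class with no enrolled students has nowhere to live.
hierarchical = {
    "Ann": [{"class_id": 10, "title": "Databases"}],
    "Ben": [{"class_id": 10, "title": "Databases"}],
}

# Normalized relational layout: each class is stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(id),
        class_id INTEGER REFERENCES class(id)
    );
    INSERT INTO student VALUES (1, 'Ann'), (2, 'Ben');
    INSERT INTO class VALUES (10, 'Databases'), (20, 'Compilers');
    INSERT INTO enrollment VALUES (1, 10), (2, 10);
""")

# Rebuild each student's schedule with joins when it is needed.
schedule = conn.execute("""
    SELECT s.name, c.title
    FROM student s
    JOIN enrollment e ON e.student_id = s.id
    JOIN class c ON c.id = e.class_id
    ORDER BY s.name
""").fetchall()

# A class nobody is enrolled in still exists and can be queried.
unenrolled = conn.execute("""
    SELECT c.title
    FROM class c
    LEFT JOIN enrollment e ON e.class_id = c.id
    WHERE e.student_id IS NULL
""").fetchall()
```

In the dictionary, renaming "Databases" means touching every student who takes it, and "Compilers" cannot be recorded at all until someone enrolls.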

I also like
While the industry currently doesn't recognize it, the mapping of objects to XML (the most common hierarchical storage model today) is not a simple thing, leading one to wonder if an object-hierarchical impedance mismatch is just around the corner*.
The impedance mismatch already exists today for anyone who is using XML Web Services and depends on automatic de-serialization of XML into objects and vice versa. Examples of some of the kind of impedance mismatches that exist today between object oriented programming languages and the primary schema language used to define XML structures (XSD) are discussed in my article on XML Serialization in the .NET Framework and Aaron Skonnard's Advanced Type Mappings piece on MSDN.

Besides those two very minor points, Ted's post is very insightful reading.

#

Impressions of Vancouver

I was in Vancouver for about four hours and left with the distinct feeling that Vancouver seems to be a cleaner version of Seattle with a lot fewer black people and a lot more Asians. When I say a lot fewer black people, I mean I was walking around the middle of a busy business district for 30 minutes and didn't see any black people and eventually spotted one when I stopped at the food court of some mall to grab lunch. Actually I saw two black people in the mall. So, in the total four hours I was in Vancouver I never saw one other black person on the street. Wild.

I saw a number of kids in souped up Honda Civics and other cars you'd expect to see lampooned on Beaterz. I was half hoping one of them would try and outrun my lil' Decepticon.

#

CrossRoads MeetUp v2.0

Ted Leung's description of the evening is quite accurate and picks up stuff I would have forgotten if I'd posted my summary of the evening. Ted is the first non-B0rg I've spoken to about XML without either having to explain what it is (it's like HTML but different) or defend my technical chops for working with such a "simple technology". I left the discussion with the idea that I really need to publish more of my opinion pieces on XML technologies through more formal mechanisms than just my blog. Food for thought.

I got to meet another B0rg blogger, John Porcaro. Speaking of B0rg bloggers, this seems the best place to announce that some more folks on my team have started their semi-official blogs on Microsoft XML technologies. Say hello to Arpan Desai and Andrew Conrad. This is actually Andy's second blog; he also has a personal blog which I've linked to from time to time.

#

RSS Bandit

It looks like Torsten and I just picked up another active developer. Viewers at home say hello to Michael Earls. He's made a couple of nice changes including a refactoring of the code that allows one to import feed list formats which got me off my ass and implementing support for the OCS format used by NewzCrawler.

Speaking of the RSS Bandit, it seems that I'm beginning to get an online user and developer community. We're already up to 49 posts on the public RSS Bandit forums and we just crossed a hundred for both number of posts and Workspace members.

#

Jayson Blair

There are a number of things running through my mind about the Jayson Blair affair. However, I'm hungry and need to grab breakfast so I'll just note one thing. This incident proves without a shadow of a doubt that the only difference between an article in the New York Times and the self important posts of the average blogger is that the blogger probably at least Googled for some fucking facts before posting his brain farts.

Fuck the New York Times for ripping away the last shreds of faith and respect I had for professional journalism.

#

--
Get yourself a News Aggregator and subscribe to my RSS feed

Disclaimer: The above comments do not represent the thoughts, intentions, plans or strategies of my employer. They are solely my opinion.
