I just finished writing last month's Extreme XML column* entitled The XML Litmus Test: Understanding When and Why to Use XML. The article is a more formal write-up of my weblog post The XML Litmus Test, expanded to contain examples of appropriate and inappropriate uses of XML and to flesh out some of the criteria for choosing it. Below is an excerpt from the article that contains the core bits I hope everyone who reads it remembers.

XML is the appropriate tool for the job if choosing it as the data representation format for a given application satisfies the following criteria:

1. There is a need to interoperate across multiple software platforms.

2. One or more of the off-the-shelf tools for dealing with XML can be leveraged when producing or consuming the data.

3. Parsing performance is not critical.

4. The content is not primarily binary content, such as a music or image file.

5. The content does not contain control characters or any other characters that are illegal in XML.

If the expected usage scenario does not satisfy most or all of the above criteria, then it doesn't make much sense to use XML as the data representation format for the situation in question.
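The fifth criterion is easy to verify for yourself: most control characters are simply outside the set of characters XML 1.0 allows, so a document containing one is not well-formed and any conforming parser must reject it. A minimal Python sketch using the standard library:

```python
import xml.etree.ElementTree as ET

# A backspace (U+0008) is outside the Char production of XML 1.0,
# so a document containing it literally is not well-formed. (It is
# equally illegal as the character reference &#8;.)
doc = "<note>\x08</note>"

try:
    ET.fromstring(doc)
    well_formed = True
except ET.ParseError:
    well_formed = False

print(well_formed)
```

The parser rejects the document, so there is no way to "escape" such characters within XML itself; the content has to be transformed (e.g. encoded) before it can be carried in an XML document.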

As the program manager responsible for XML programming models and schema validation in the .NET Framework, I've seen lots and lots of inappropriate usage of XML, both from internal teams and from our customers. Hopefully once this article is published I can stop repeating myself and just send people links to it the next time I see someone asking how to escape control characters in XML or see another online discussion of "binary" XML.
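For anyone who lands here with the control-character question anyway: since those characters cannot be represented in XML at all, the usual answer is to encode the offending bytes (base64 is the common choice) before embedding them. A hedged sketch in Python with the standard library; the `payload` element and its `encoding` attribute are invented names for illustration, not part of any standard:

```python
import base64
import xml.etree.ElementTree as ET

raw = bytes([0, 1, 2, 8, 255])  # bytes that cannot appear literally in XML

# Base64-encode the payload so the element text contains only
# legal XML characters.
elem = ET.Element("payload", {"encoding": "base64"})
elem.text = base64.b64encode(raw).decode("ascii")
doc = ET.tostring(elem, encoding="unicode")

# Round-trip: parse the document and recover the original bytes.
recovered = base64.b64decode(ET.fromstring(doc).text)
assert recovered == raw
```

The cost, of course, is a roughly 33% size increase plus an encode/decode step on each end, which is exactly why criterion 4 warns against using XML for primarily binary content in the first place.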

* Yes, it's late



Monday, October 11, 2004 4:09:32 AM (GMT Daylight Time, UTC+01:00)
You can hope, but I recall saying pretty much those things in every talk I gave about XML for IBM back in 1998 and, as I recall, a few articles and interviews too (not that I can find anything online to prove it). Not that I want to depress you or anything ;-)
Tuesday, October 12, 2004 1:48:14 PM (GMT Daylight Time, UTC+01:00)
Well, since I foolishly signed up to talk about "binary" XML at XML 2004, and will probably get flamed from all sides, I have one question: is there some fuzzy middle ground where you get some of the advantages of XML even if you don't meet all the criteria? That is, what if one's XML litmus test says that a scenario is weakly XML-ish: someone has a need to interoperate across platforms, wants to leverage the tools based on XML data models such as XPath/XQuery/XSLT, and the content itself is not binary, BUT parsing performance or a bandwidth limitation is a critical constraint. What is an architect in that situation supposed to do?

As near as I can tell, they're likely to invent their own serialization of the XML infoset that is more compact or more easily parseable (or finds a better size/speed tradeoff than XML itself offers). That's what the XML accelerator people have done (Sarvega for example, but I don't know the details) and that's what the XML database people have done (I recall Michael Rys saying that the SQL Server folks have done this).

Do you disagree? I think most people faced with the dilemma (e.g. the wireless industry) don't want to throw out the entire XML toolset and its network effects with the parsing performance bathwater. I think reasonable people can disagree on whether there ought to be one or more *standard* serializations for the infoset (or XQuery data model, whatever ...), but that's not my question, and I don't think the industry is in any position to answer the question of standardization yet.
Mike Champion
Monday, December 27, 2004 9:32:51 PM (GMT Standard Time, UTC+00:00)
Excellent article and conversation.

I’m writing this since we are looking for technical enthusiasts interested in an elegant alternative to XML for dealing with data arrays and computational models.

Our technology is the result of 20 years of R&D. While it can consume and generate XML when necessary, its unique benefit is in the way it uses spreadsheet templates to create, parse and render CSV files.

We claim with confidence that nothing can be more efficient: the CSV files used to transport the data are the smallest possible due to their minimal overhead, and are the quickest and least resource-hungry way to parse and visualize a dataset. Our proprietary solution (based on a patented process) is up to 1,000% more efficient than XML.

Our goal is to establish a community of developers and users who would help evolve the technology. I welcome any feedback and advice.
Comments are closed.