February 6, 2004
@ 05:00 PM

A few days ago XML 1.1 became an official W3C Recommendation. Mark Pilgrim, contrary to W3C guidelines, has celebrated by converting his RSS feed to XML 1.1. As a result, his feed currently cannot be processed by any Microsoft XML technology, from the XML parsers in the .NET Framework to MSXML, which is used in a host of products from Internet Explorer to Office 2003.

This is the first step in fragmenting the interoperability on the Web gained by XML. It seems the next step will be W3C-sanctioned binary XML. Anyway, let's get back to XML 1.1. What exactly is wrong with it, one might ask? The biggest problem is that it is backwards incompatible with XML 1.0. A good summary of everything you need to know about XML 1.1 can be found in Chapter 3 of Elliotte Rusty Harold's Effective XML:

Everything you need to know about XML 1.1 can be summed up in two rules:

  1. Don't use it.

  2. (For experts only) If you speak Mongolian, Yi, Cambodian, Amharic, Dhivehi, Burmese or a very few other languages and you want to write your markup (not your text but your markup) in these languages, then you can set the version attribute of the XML declaration to 1.1. Otherwise, refer to rule 1.
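Since the version attribute of the XML declaration is the only switch involved, a feed consumer stuck with an XML 1.0 parser can at least detect the problem up front. Here is a minimal sketch in Python, assuming the document has already been decoded to a string; the function names declared_xml_version and is_safe_for_xml10_parser are made up for illustration and do not come from any library.

```python
import re

# Matches an XML declaration at the very start of the document and captures
# the value of its version pseudo-attribute (single or double quoted).
_XML_DECL = re.compile(r"""^<\?xml\s+version\s*=\s*(["'])([^"']+)\1""")


def declared_xml_version(document: str) -> str:
    """Return the declared XML version (hypothetical helper).

    A document without an XML declaration is treated as XML 1.0.
    """
    text = document.lstrip("\ufeff")  # tolerate a leading byte order mark
    match = _XML_DECL.match(text)
    return match.group(2) if match else "1.0"


def is_safe_for_xml10_parser(document: str) -> bool:
    """True if the document claims to be XML 1.0 (hypothetical helper)."""
    return declared_xml_version(document) == "1.0"


if __name__ == "__main__":
    feed = '<?xml version="1.1" encoding="UTF-8"?><rss version="2.0"/>'
    print(declared_xml_version(feed))
    print(is_safe_for_xml10_parser(feed))
```

This only reads the declaration; it says nothing about whether the rest of the document is well-formed under either version.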

XML 1.1 does several things, one of them marginally useful to a few developers, the rest actively harmful.

  • It expands the set of characters allowed as name characters

  • The C0 control characters (except for NUL) such as form feed, vertical tab, BEL, and DC1 through DC4 are now allowed in XML text provided they are escaped as character references.

  • C1 control characters (except for NEL) must now be escaped as character references

  • NEL can be used in XML documents, but is resolved to a line feed on parsing.

  • Parsers may (but do not have to) tell client applications that Unicode data was not normalized

  • Namespace prefixes can be undeclared
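The control character rules in the list above are easier to see as code. The following is a rough Python sketch, based on my reading of the Char and RestrictedChar productions in the two specifications; the function classify_code_point and its three return labels are hypothetical, and it covers character data only, not name characters.

```python
def classify_code_point(cp: int, version: str) -> str:
    """Classify a code point for XML character data (hypothetical helper).

    Returns 'literal' (may appear directly), 'reference-only' (must be
    written as a character reference), or 'forbidden'.
    """
    # NUL, surrogates, U+FFFE/U+FFFF and out-of-range values are never allowed.
    if cp < 1 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF or cp in (0xFFFE, 0xFFFF):
        return "forbidden"

    if version == "1.0":
        # XML 1.0 allows tab, LF, CR and everything from U+0020 up,
        # so C1 controls are legal literally but C0 controls are banned outright.
        if cp in (0x9, 0xA, 0xD) or cp >= 0x20:
            return "literal"
        return "forbidden"

    if version == "1.1":
        # XML 1.1 admits the C0 and C1 controls, but only as character
        # references; NEL (U+0085) is the one C1 exception.
        restricted = (
            0x1 <= cp <= 0x8
            or cp in (0xB, 0xC)
            or 0xE <= cp <= 0x1F
            or 0x7F <= cp <= 0x84
            or 0x86 <= cp <= 0x9F
        )
        return "reference-only" if restricted else "literal"

    raise ValueError(f"unknown XML version: {version!r}")


if __name__ == "__main__":
    for cp in (0x0C, 0x80, 0x85):
        print(f"U+{cp:04X}", classify_code_point(cp, "1.0"), classify_code_point(cp, "1.1"))
```

Note the U+0080 case: a raw C1 control is acceptable in an XML 1.0 document but has to be escaped in XML 1.1, which is one concrete way the two versions are incompatible.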

XML is a lousy format for most of the things it is used for. The one benefit it has is that it is widely supported and a guaranteed way to interoperate in a cross-platform manner. By tampering with this, the W3C is effectively diluting one of the few benefits of using XML. This is a regrettable occurrence. Unfortunately, it looks like things will get worse now that the W3C also wants to dabble in “binary XML”.