Tim Bray has a post entitled Thought Experiments where he writes:

To keep things short, let’s call OpenDocument Format 1.0 "ODF" and the Office 12 XML File Formats "O12X".

Alternatives · In ODF we have a format that’s already a stable OASIS standard and has multiple shipping implementations. In O12X we have a format that will become a stable ECMA standard with one shipping implementation sometime a year or two from now, depending on software-development and standards-process timetables. ODF is in the process of working its way through ISO, and O12X will apparently be sent down that road too, which should put ISO in an interesting situation.

On the technology side, the two formats are really more alike than they are different. But, there are differences: O12X's design center, Microsoft has said repeatedly, is capturing the exact semantics of the billions of existing Microsoft Office documents. ODF’s design center is general-purpose reusability, and leveraging existing standards like SVG and MathML and so on.

Which do you like better? I know which one I’d pick. But I think we’re missing the point.

Why Are There Two? · Almost all office documents are just paragraphs of text, with some bold and some italics and some lists and some tables and some pictures. Almost all spreadsheets are numbers and labels, with some sums and averages and pivots and simple algebra. Almost all presentations are lists of bullet points with occasional pictures.

The capabilities of ODF and O12X are essentially identical for all this basic stuff. So why in the flaming hell does the world need two incompatible formats to express it? The answer, obviously, is, "it doesn’t".

I find it extremely ironic that one of the driving forces behind creating a redundant and duplicative XML format for website syndication would be one of the first to claim that we only need one XML format to solve any problem. For those who aren't in the know, Tim Bray is one of the chairs of the Atom Working Group in the IETF whose primary goal is to create a competing format to RSS 2.0 which does basically the same thing. In fact Tim Bray has written a decent number of posts attempting to explain why we need multiple XML formats for syndicating blog posts, news and enclosures on the Web.

But let's ignore the messenger and focus on the message. Tim Bray's question is quite fair and in fact he answers it later on in his blog entry. As Tim Bray writes, "Microsoft wants there to be an office-document XML format that covers their billions of legacy documents". That's it in a nutshell. Microsoft created XML versions of its binary document formats like .doc and .xls that had full fidelity with the features of these formats. That way a user can convert a 'legacy' binary Office document to a more interoperable Office XML document without worrying about losing data, formatting or embedded rich media. This is a very important goal for the Microsoft Office team and very different from the goal of the designers of the OpenDocument format. 

Is it technically possible to create a 'common shared office-XML dialect for the basics' as Tim Bray suggests? It is. It'll probably take several years (e.g. the Atom syndication format, which is simply a derivative of RSS, has taken over two years to come to fruition), and once it is done, Microsoft will have to 'embrace and extend' it to meet its primary goal of 100% backwards compatibility with its legacy formats. And that doesn't answer the question of what Microsoft should ship in the meantime with regard to file formats in its Office products. After all, Office 12 is scheduled to ship in the second half of 2006.

There is no simple technical solution on the horizon that will change the fact that there will be multiple XML formats for Office documents. What we need to agree on is the best way forward, not attempt to demonize each other for trying to do what's best for our customers.

Disclaimer: I work at Microsoft. However, I do not work in any area related to the Office XML formats. The above is my personal opinion and should not be construed as an expression of the opinions, intents or strategies of my employer.


Categories: XML
Tracked by:
http://randyh.wordpress.com/2005/11/28/more-on-competing-xml-formats-for-office-... [Pingback]
"4.5 Tremor in Redmond, WA" (franklinmint.fm) [Trackback]
"The Office Productivity Standard Wars Heat Up" (tecosystems) [Trackback]
http://www.douglasp.com/blog/PermaLink.aspx?guid=2b4f6366-be14-42ca-b24a-0235e99... [Pingback]

Monday, 28 November 2005 20:49:30 (GMT Standard Time, UTC+00:00)
Violent agreement.

Two is almost always better than one because it leads to competition.

And the hypocrisy would be funny if it weren't so alarming.
Monday, 28 November 2005 23:24:10 (GMT Standard Time, UTC+00:00)
I don't think Tim's guilty of hypocrisy. Atom is a technical improvement on RSS, the failings of which have been holding innovation back (check back in a year or so, if I'm wrong, I'll eat my hat).

But the two office formats are in a well-known space; the core of the applications they're designed to support hasn't changed significantly in the last decade or so. The problem isn't innovation, more a matter of consolidation. Having the XML base will no doubt enable further innovation, but that's further down the line. The office projects have different goals, and different starting points. ODF doesn't have the legacy to deal with; O12 has the advantage of experience.

The political aspect here does make things messy. Competition isn't a useful end in itself; in a case like this cooperation would probably be a lot more productive. Violent agreement is probably a good way to describe your respective arguments. Both your conclusions make sense.

There are two distinct products, and for now there are two formats, but that doesn't mean that both parties can't benefit from where they overlap. Neither side can dominate outright in the foreseeable future. If I was a strategy person at MS I'd be looking to places like MSN and services, not the old shrink-wrap profit centres that are rapidly eroding. If I was a strategy person at one of the companies that back OpenOffice, hopefully I'd realise that MS isn't quite the monolith it used to be, and that yapping at a weaker point will have little or no impact on their stronger points.

Whatever, long term it'll be surprising if there isn't convergence of the office formats. Short term, some hard work with XSLT might be needed ;-)

Meantime Google are having a good crack at powning the net. Hopefully Yahoo! will scuttle them amidships...
Tuesday, 29 November 2005 04:19:49 (GMT Standard Time, UTC+00:00)
RSS 2.0 and Atom do not do the same thing. RSS 2.0 was written for bloggers by a blogger to support blogging. Atom was developed by developers for developers to do much more in the syndication of content and beyond. The RSS 2.0 spec is frozen and specifically states that a new format with a new name is required to address new functionality. Just because Atom supports blogging as one of its applications does not make them comparable. The problem was that Atom could not be made to be a superset of RSS 2.0. Can the same be said for MS Office XML and ODF?
Tuesday, 29 November 2005 06:21:01 (GMT Standard Time, UTC+00:00)
Did I miss something? Tim Bray complains of childish name-calling in this post. I don't see any such thing. Was something deleted? If not, I have to say Tim seems to be confusing disagreement with insult.
Tuesday, 29 November 2005 07:23:17 (GMT Standard Time, UTC+00:00)
Uche: Dare calls Tim a hypocrite. I'd be insulted. I wouldn't care though.
Tuesday, 29 November 2005 11:27:38 (GMT Standard Time, UTC+00:00)

If it takes years to convert an XML piece into another, then maybe XML is not any better than the badly criticized "legacy" binary file formats. Who is lying here?

Microsoft wants you to believe that the file format war is over and it does not matter anymore. Of course, it does not matter anymore as long as you are using theirs. As such, it is part of the integrated initiative, which leads to a Windows-only world. Soon a Windows Vista-only world.

Let's take a look at the XML tags in the Office 2006 reference preview doc. None of the semantics between tags is documented. Take a look at the old binary Office file format doc. None of the semantics between records is documented. Yeah, XML is going to make things so much better than in the old world....
Stephane Rodriguez
Tuesday, 29 November 2005 14:24:00 (GMT Standard Time, UTC+00:00)
Ok, Spy Vs Spy bit notwithstanding, following the experiment:

If there is a common subset of word processing formats, how much of that subset is represented in HTML or XHTML?

Maybe the answer is staring you in the face... right now.
Tuesday, 29 November 2005 17:25:02 (GMT Standard Time, UTC+00:00)
I understand why Tim Bray is upset. You call him a hypocrite and accuse him of demonizing (though you don't specify who he is supposed to have demonized - Microsoft in general? Office developers? Brian Jones?). I think he was pretty careful to look at the other side of the issue, even though his opinion is a pretty strong one.

Bottom line is that your response to his criticism was personal, not merely a critique of his arguments.
Andrew Shebanow
Tuesday, 29 November 2005 20:15:37 (GMT Standard Time, UTC+00:00)
I actually followed Tim's link, and thus didn't originally see the title, and didn't pay enough attention when I got over here. I was looking for the name-calling in the body of the post. Yes: "hypocrisy" is name-calling. The contents may be less inflammatory, but headlines are always shriller than the story, aren't they?
Tuesday, 29 November 2005 21:51:07 (GMT Standard Time, UTC+00:00)

I think you owe Tim Bray an apology. This truly is childish.
Wednesday, 30 November 2005 04:28:01 (GMT Standard Time, UTC+00:00)
You call Tim Bray a hypocrite and kiss Ray Ozzie's butt in other posts, wow I'm losing respect for you.
Wednesday, 30 November 2005 18:58:48 (GMT Standard Time, UTC+00:00)
Tim's post defines the word hypocrisy.
Sunday, 04 December 2005 00:23:27 (GMT Standard Time, UTC+00:00)
Well, I don't think so. Going to a neutral source (the Encarta dictionary), I find that "hypocrite" is described as someone who gives a false appearance of having admirable principles.

It is certainly possible to disagree with Tim and the tendency, shared with many others, to underplay that the O12X work began with the Office 2003 XML Office Schemas, along with the magical presumption that ODF works for Microsoft Office and that Microsoft just needs to get over it.

Nevertheless, ascribing a quality to a person, rather than their ideas, is basically what name-calling is all about.

I didn't want to get into this. I wanted to link to this article in one I am creating. I won't do that because I am unwilling to cite the title of this article and assume what the post is about without following my link.
Monday, 05 December 2005 19:39:08 (GMT Standard Time, UTC+00:00)
Fair enough. If you say that the MSXML format is the way it is because that makes it better able to support previous Microsoft formats that seems logical to me.

And the fact that you want people to use _your_ document format rather than another one is only logical for a business.

But please don't _force_ us to use MSXML if we use your suite - why not give us a choice? If MSXML is _really_ the best thing for us mortals then time will decide and it will become accepted on its own merits. Microsoft is supposed to listen to people and give customers what they want - no?

So PLEASE, even if you don't like ODF or even think it's bad for us, why not let _us_ decide the format that we want to use. No one is asking you to toss away your own format or reveal its innermost secrets, but _please_ allow us peasants to open and save documents in ODF if we should want to and do it properly with a Save As. It isn't as though ODF belongs to a rival or competitor after all.

If as Scoble says, it's too hard for you chaps to add the necessary filter then at least allow someone outside to do it for you. It isn't as though they'd need to see your source or anything.

Sorry if I'm being naive, but I don't really see why you shouldn't and there'd be a lot of grateful people if you did!

Andrew BC