About two and half years ago, I was hanging out with several members of the Office team as they gave the details about how Office 2003 would support XML file formats at XML 2002. Now that it's 2005, juicy information like that is now transmitted using blogs.

Brian Jones has a blog post entitled New default XML formats in the next version of Office were he reveals some of the details of XML support in the next version of Office. He writes

  Open XML Formats Overview

To summarize really quickly what’s going on, there will be new XML formats for Word, Excel, and PowerPoint in the next version of Office, and they will be the default for each. Without getting too technical, here are some basic points I think are important:

  1. Open Format: These formats use XML and ZIP, and they will be fully documented. Anyone will be able to get the full specs on the formats and there will be a royalty free license for anyone that wants to work with the files.
  2. Compressed: Files saved in these new XML formats are less than 50% the size of the equivalent file saved in the binary formats. This is because we take all of the XML parts that make up any given file, and then we ZIP them. We chose ZIP because it’s already widely in use today and we wanted these files to be easy to work with. (ZIP is a great container format. Of course I’m not the only one who thinks so… a number of other applications also use ZIP for their files too.)
  3. Robust: Between the usage of XML, ZIP, and good documentation the files get a lot more robust. By compartmentalizing our files into multiple parts within the ZIP, it becomes a lot less likely that an entire file will be corrupted (instead of just individual parts). The files are also a lot easier to work with, so it’s less likely that people working on the files outside of Office will cause corruptions.
  4. Backward compatible: There will be updates to Office 2000, XP, and 2003 that will allow those versions to read and write this new format. You don’t have to use the new version of Office to take advantage of these formats. (I think this is really cool. I was a big proponent of doing this work)
  5. Binary Format support: You can still use the current binary formats with the new version of Office. In fact, people can easily change to use the binary formats as the default if that’s what they’d rather do.
  6. New Extensions: The new formats will use new extensions (.docx, .pptx, .xlsx) so you can tell what format the files you are dealing with are, but to the average end user they’ll still just behave like any other Office file. Double click & it opens in the right application.

...

Whitepapers

The Microsoft Office Open XML Formats: New File Formats for "Office 12"

http://download.microsoft.com/download/c/2/9/c2935f83-1a10-4e4a-a137-c1db829637f5/Office12NewFileFormatsWP.doc

The Microsoft Office Open XML Formats: Preview for Developers

http://download.microsoft.com/download/c/2/9/c2935f83-1a10-4e4a-a137-c1db829637f5/Office12FileFormatDevPreviewWP.doc

This is totally awesome news. I remember asking, back in 2002, why Powerpoint didn't have an XML file format and the answer was that it was due to schedule constraints but it would be fixed in the next version. Not only did the Office guys keep their word but they went above and beyond.

This should make Sam Ruby happy.