Microsoft Office file format specifications

You've probably heard about the Office Open XML file formats. Each file is a zipped folder that contains the content in XML format, along with any code and embedded files. This means you can look inside one simply by changing its extension to .zip.
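If you would rather not rename anything, the same peek inside can be sketched in a few lines of Python using the standard zipfile module. This is only a minimal illustration: the helper names list_package_parts and read_part are made up for this example, and it assumes the file is a valid, unencrypted Office Open XML package.

```python
import zipfile


def list_package_parts(path):
    """Return the names of the files stored inside an Office Open XML package."""
    # A .docx/.xlsx/.pptx file is an ordinary ZIP archive, so zipfile can open it
    # regardless of the file extension.
    with zipfile.ZipFile(path) as package:
        return package.namelist()


def read_part(path, part_name):
    """Return one part of the package (for example an XML file) as text."""
    with zipfile.ZipFile(path) as package:
        # Assumption: the part is UTF-8 encoded XML, which is the usual case.
        return package.read(part_name).decode("utf-8")
```

For a Word document, calling list_package_parts("report.docx") (a hypothetical file name) would typically show entries such as [Content_Types].xml and word/document.xml, and read_part can then print the raw XML of any of them.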


This is not new, and there are plenty of articles about it on the web.

The news here is that Microsoft has published the specifications for the Microsoft Office binary file formats (.doc, .xls, .ppt). It could be very interesting... But it could be boring too :) because, according to Joel Spolsky:

A normal programmer would conclude that Office’s binary file formats:

  • are deliberately obfuscated
  • are the product of a demented Borg mind
  • were created by insanely bad programmers
  • and are impossible to read or create correctly.

There is a good article by Joel Spolsky, a former PM on the Microsoft Excel team, that analyzes the specification.

Read it to find out how complex data was handled with the limited CPU power and memory available back in the days of the 80386 at 20 MHz.