5 Replies Latest reply: Apr 28, 2012 9:02 AM by Dale Gillard
AGagravarr Level 1 (0 points)

I've been hunting around the iWorks resources section (http://www.apple.com/iwork/resources/), but I can't find any documentation of the iWorks file formats anywhere. Does anyone know where I can find it?

 

(I'm trying to write something to read iWorks 09 files, such as Pages and Numbers documents, without needing a copy of iWorks. For the other office file formats, such as ODF/ODP, DOC/XLS/PPT and DOCX/XLSX/PPTX, you can get file format documentation that defines the file structure, what the individual parts and elements mean, which values and structures are allowed, etc. I'm after something similar for the iWorks family of formats, but I've thus far been unable to track it down.)


iWorks 09
  • 1. Re: iWorks File Format Documentation?
    Level 8 (41,760 points)

    (1) As there is no iWorks product, it's logical that you find nothing.

     

    (2) As far as I know, it's very rare to see Apple describing its file formats.

    Such a description was available for the very first Keynote.

     

    (3) The main component of an iWork document is the embedded index.xml file.

    I know somebody who deciphered the index files describing Numbers documents, but if he wants to give details, he can do so himself.

     

    Yvan KOENIG (VALLAURIS, France) Friday 27 April 2012

    iMac 21”5, i7, 2.8 GHz, 12 Gbytes, 1 Tbytes, mac OS X 10.6.8 and 10.7.3

    My Box account  is : http://www.box.com/s/00qnssoyeq2xvc22ra4k
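As a rough illustration of point (3) above: since the thread treats these documents as zip files, the index part can be located with an ordinary zip reader, without iWork installed. The Python sketch below is only that, a sketch. The entry names `index.xml` and `index.apxl`, the helper names, and the in-memory sample document are assumptions made for illustration and do not come from any Apple documentation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumption: the main XML part is called index.xml, or index.apxl for Keynote.
INDEX_NAMES = ("index.xml", "index.apxl")


def find_index_entry(source):
    """Return the name of the main XML part in a zipped document, or None."""
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
    for candidate in INDEX_NAMES:
        if candidate in names:
            return candidate
    return None


def root_tag(source):
    """Return the root element tag of the index part, or None if there is none."""
    entry = find_index_entry(source)
    if entry is None:
        return None
    with zipfile.ZipFile(source) as archive, archive.open(entry) as stream:
        # Stop at the first "start" event: only the root element is needed.
        for _event, element in ET.iterparse(stream, events=("start",)):
            return element.tag
    return None


def make_sample():
    """Build a tiny in-memory zip that stands in for a real document (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.xml", '<doc xmlns="urn:example:placeholder"><body/></doc>')
        archive.writestr("extra.bin", b"")
    buffer.seek(0)
    return buffer
```

Calling `root_tag(make_sample())` returns the namespaced tag of the root element. On a real file, the namespace and element name found there are the natural starting point for telling Pages, Numbers and Keynote documents apart.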

  • 2. Re: iWorks File Format Documentation?
    AGagravarr Level 1 (0 points)

    After not being able to find anything, I'd begun to suspect (2) as well. It'd be annoying and frustrating if there really isn't any documentation, as it'd make it very tricky to do anything with the files...

     

    Apache Tika (which is what I'm working on for this) has some support for the iWorks 09 formats. We can get some structured text out (e.g. turn a Numbers file into an XHTML table), but we're looking for more information to improve the support. Today's fun challenge was detecting password-protected files: they seem to use some entirely non-standard encryption at the zip file level.

     

    For anyone who's interested, the Apache Tika code for handling iWorks files is available at http://svn.apache.org/repos/asf/tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/iwork
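The password-protection problem described above can be approached with a heuristic, sketched here in Python. It assumes, without confirmation from this thread or from Apple, that a protected document still lists its index part but that a standard zip reader fails when it tries to read that part. The entry names, the result labels and the helper names are all hypothetical, and this is not the detection logic Apache Tika uses.

```python
import io
import zipfile
import zlib

# Assumption: the main XML part is called index.xml, or index.apxl for Keynote.
INDEX_NAMES = ("index.xml", "index.apxl")


def classify(source):
    """Return 'not-a-zip', 'no-index-part', 'possibly-protected' or 'readable'."""
    try:
        archive = zipfile.ZipFile(source)
    except zipfile.BadZipFile:
        return "not-a-zip"
    with archive:
        names = archive.namelist()
        entry = next((name for name in INDEX_NAMES if name in names), None)
        if entry is None:
            return "no-index-part"
        try:
            # Reading one byte is enough to trigger the zip reader's checks.
            with archive.open(entry) as stream:
                stream.read(1)
        except (RuntimeError, NotImplementedError, zipfile.BadZipFile, zlib.error):
            # Heuristic only: the reader refused the entry (encrypted flag,
            # unsupported feature, or corrupt data), so flag it for review.
            return "possibly-protected"
    return "readable"


def make_sample_zip():
    """Build a small unprotected stand-in document in memory (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("index.xml", "<doc/>")
    return buffer.getvalue()
```

`classify(io.BytesIO(make_sample_zip()))` reports the sample as readable. A result of `possibly-protected` only means the index part could not be read normally; it cannot distinguish a password-protected document from a damaged one.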

  • 3. Re: iWorks File Format Documentation?
    Tom Gewecke Level 9 (71,715 points)

    If you consider that iWork was first issued over 7 years ago, and in all that time apparently no one has bothered or been able to do what you are proposing.....

  • 4. Re: iWorks File Format Documentation?
    Level 8 (41,760 points)

    AGagravarr wrote:

     

    Today's fun challenge was detecting password protected files - they seem to use some entirely non-standard encryption at the zip file level.

    (1) I repeat that there is no iWorks product!

     

    (2) I repeat that it's very rare to get format specs from Apple.

    The AppleWorks file formats were never published. As far as I know, the iWork formats, which change with every major update, were never published either.

     

    (3) Encrypting with a non-standard scheme may be a good way to protect confidentiality, isn't it?

     

    Yvan KOENIG (VALLAURIS, France) Saturday 28 April 2012


  • 5. Re: iWorks File Format Documentation?
    Dale Gillard Level 5 (4,365 points)

    developer.apple.com is the place to look for technical information relating to Apple.

     

    Apple documented the file format for all the iWork apps when they were first released (2005?) and made it available on developer.apple.com. However, this documentation is no longer available as the file format has changed considerably.

     

    The file format was originally XML, but it's now binary. You'll need to do your own research if you want to reverse engineer the file format.