Today the W3C published the Second Public Working Draft of the Widgets 1.0 Specification. It’s been nearly a year since we published the first public working draft (11 Nov, 2006), and much has changed and been added to the spec (…and it still has a long, long way to go before it will be finished!). The most notable additions to this version of the spec are the attempt to standardize a subset of the Zip specification and support for digital signatures using XML Digital Signatures. Unfortunately, a lot of exciting things that are under discussion by those participating in the standardization effort have not made it into this latest draft. For example, we are still trying to work out a nice model for automatic updates, but we should have something drafted up fairly soon.
The main problem I’ve been working on over the last two months is trying to specify a subset of Zip that should be used by widgets. My goal has been to define a subset that is interoperable across all platforms and devices in such a way that it also ensures longevity. As you might imagine, this has proven to be quite a challenge…
The issues with Zip
The Zip file format is what is commonly referred to as a de facto standard: it is not formally specified by any standards body, but it is so widely implemented that it is interoperable across OSs and devices. This seems great on the surface, but when you try to standardize it, it becomes quite a nightmare. The main issues are these:
- There are competing Zip specifications and there are many versions of each of the Zip specifications.
- Different versions of the Zip specification are implemented across different platforms and OSs.
- There are many features in Zip that are desirable (e.g. UTF-8 support), but are not widely implemented.
- Zip is not an “open standard”; it is the property of PKWARE.
- Zip is periodically updated and PKWARE does not provide any links to previous versions of their specs.
Competing Zip specifications
There are essentially two Zip specifications that applications make use of: the “official” PKWARE Zip Application Notes and the “unofficial” Info-Zip Application Notes (mostly on Unix). The unofficial notes are basically whatever PKWARE has officially published, as modified or otherwise clarified by the guys at Info-Zip. In this sense, much of what one finds in the Info-Zip specs is identical to the PKWARE Zip spec. But, because PKWARE actually maintains the official spec, the PKWARE spec is always more up-to-date than what Info-Zip has on its website. For instance, the latest version of the Info-Zip notes covers version 6.2.0 of the official Zip spec (26 April 2004), while the latest version of the Zip spec is 6.3.2, which came out in September 2007, so Info-Zip is three years behind PKWARE!
Problem: the Info-Zip notes contain details that pertain to how Info-Zip works and may not be compatible/interoperable with the PKWARE Zip spec. For example, the Info-Zip notes contain details about how to handle Unix permissions, while PKWARE’s Zip spec does not. This might not make the file formats incompatible, but it does make them physically different. You can try this out yourself: zip up a file using Info-Zip’s zip implementation and then zip up the same file using Windows’ Compressed Folders. The results will be different, but you should still be able to decompress the Info-Zip file using Windows’ native Zip implementation.
Different versions of the Zip specification are implemented across different platforms, OSs, and specs
Another significant issue from a standardization perspective is that packaging formats are making use of either some Info-Zip spec or some PKWARE spec. Significant examples include:
- Java/JAR (including WAR and EAR): Info-ZIP Application Note 19970311
- Open Document Format (ODF): Info-ZIP Application Note 19970311
- Office Open XML – Open Packaging Conventions (OOXML-OPC): PKWARE Zip Application Note (version 6.2.1), but with a bunch of clarifications.
- OEBPS Container Format 1.0: PKWARE Zip Application Note (no explicit version, but at least version 2.0 needed to extract and version 4.5 needed to extract Zip64).
I still have little idea as to what version of the Zip specification is actually implemented on each OS, let alone on mobile devices (information that seems to be quite difficult to come by!). As a result, and after some discussion with Jon Ferraiolo of IBM, I decided to base the Widget Spec on the OEBPS-OCF’s conformance requirements for Zip packages. I was tempted to make the widgets specification conform to the OOXML-OPC spec (put away your tomatoes!) because, in my opinion, the container aspects and conformance requirements are well specified (even if the rest of OOXML is “evil”).
Desirable features in Zip (6.3.2)
There are a number of really cool features in Zip that would make specifying a container format for widgets much better. They include:
- Strong Encryption (using X.509 digital certificates): basically solves the digital signature problem, I think.
- UTF-8 support: solves a significant part of the internationalization problem.
- Zip64: future proofing.
To require widget engines to actually support these features puts a fair bit of strain on makers of widget engines. At this point, we have required that implementers support UTF-8 and Zip64.
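To give a feel for what checking these two features involves, here is a rough sketch using Python's standard `zipfile` module. It relies on two details of the PKWARE notes as I understand them: general purpose bit 11 (0x800) marks a file name as UTF-8, and a "version needed to extract" of 4.5 is what Zip64 entries declare. The function name `inspect_entries` and the sample file names are invented for the example, and the Zip64 check is only a hint, since other features can also raise the version number.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800     # general purpose bit 11: file name is stored as UTF-8
ZIP64_VERSION_NEEDED = 45  # "version needed to extract" 4.5, stored as 45

NON_ASCII_NAME = "r\u00e9sum\u00e9.html"  # hypothetical localized file name


def inspect_entries(archive: bytes) -> dict[str, tuple[bool, bool]]:
    """Map each entry name to (uses UTF-8 name flag, declares version >= 4.5)."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            uses_utf8 = bool(info.flag_bits & UTF8_NAME_FLAG)
            # Assumption: treated as a Zip64 hint only; other features
            # (for example some compression methods) also raise this number.
            needs_45 = info.extract_version >= ZIP64_VERSION_NEEDED
            result[info.filename] = (uses_utf8, needs_45)
    return result


# Build a tiny in-memory archive with one ASCII and one non-ASCII name.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("index.html", b"<html></html>")
    zf.writestr(NON_ASCII_NAME, b"<html></html>")

report = inspect_entries(buf.getvalue())
for name, (uses_utf8, needs_45) in report.items():
    print(name, "utf8-flag:", uses_utf8, "version>=4.5:", needs_45)
```

Python's writer only sets the UTF-8 flag when a name cannot be encoded as ASCII, so the flag tells you how a particular name was stored, not whether the tool that made the archive supports UTF-8 in general.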
Zip is not an open standard
The fact that Zip is proprietary might be something that comes back to bite us on the ass. I’m no lawyer, but there may be patent/IPR issues surrounding Zip. I’m also not sure how PKWARE will feel about WAF specifying a subset of their specification. I’ve emailed PKWARE, informed them of what we are doing, and requested that they review the spec. They have responded and said that they will look into it.
Where to from here…
Looking forward, I’d really like to get all the physical and logical packaging stuff done. That includes:
- Anything Zip related
- The inter-package addressing model
- How to handle decompression
- How to name files in ASCII and UTF-8
I’d also really like to nail down the auto-updates model and make sure that the manifest language we are specifying covers all the common use cases. The security model is the elephant in the room 🙂 No one wants to touch it at this point, but we know it’s a massive issue. Another massive issue is the APIs… but that’s not something I want to get into now. A big issue for me is internationalization. I’ve been blocked a number of times when I’ve proposed doing internationalization using folders… every widget engine except Opera does it, so I think we should do it too.