Hard disk crash

The hard drive in my laptop decided it had had enough today and crashed (with only two days before I depart for Boston for a W3C meeting!). Luckily I was able to recover all my PhD stuff and the work I had been doing today on Widgets. I’m currently in the process of reformatting my drive with Windows XP. I was able to recover almost everything using BartPE, which creates a CD-bootable, stripped-down version of Windows. BartPE is very useful as it allows you to map network drives. To get my data, I just copied all the stuff that I could onto one of our development servers. It took me about 2 hours, as BartPE kept crashing while trying to copy files.

Tomorrow I’ll have to waste time reinstalling all my apps and testing the system to see if it is stable enough to take to the US… otherwise, it’s “off to the shop” for a new hard drive 🙁 If all else fails, my girlfriend has offered to lend me her new MacBook, which I will happily take over my PC any day 🙂

Update: went to get a new 160GB hard drive, but once I started reinstalling Windows the installer kept crashing with IRQL_NOT_LESS_OR_EQUAL (a BSoD error I had not seen before). Did a Google search and all evidence pointed to either the RAM or the CPU overheating. One of the IT guys here at QUT ran a memory tester and we discovered that it was in fact one of the RAM chips that was fried. Sucks, as I only bought the new RAM about one week ago 🙁 . Anyway, all seems semi-stable now… currently reinstalling Windows XP. I made a 40GB partition to install Windows Vista so I can again play with Sidebar Gadgets. I previously uninstalled Vista because I found it so shockingly bad to use and unstable.

Annoying Word 2007 Citations Styles

Yesterday I started writing a paper for WWW2008 about widgets (and given the highly competitive nature of the WWW conferences, I doubt it will be accepted). Anyway, the conference mandates that citations conform to ACM’s referencing style (e.g. smith [1] says, “bla bla”), which is not currently supported by Microsoft Word. My immediate thought was, “Right! Word’s style files are just (OO)XML so it should just be a simple matter of changing some angled brackets to create the ACM style!”. My plan was to base the ACM style on the already supported ISO 690 style, which is similar except it uses parentheses “(1)” instead of brackets “[1]”. So I went into MS Word’s program files directory and located the bibliographical styles. To my shock, the reference style file was an impenetrable XSLT file (7093 lines long and completely uncommented!). I spent about 20 minutes trying to work out what the hell the file was doing… but eventually I gave up :(. I compared the ISO 690 XSLT style file to the ACM BibTeX style file. The BibTeX style file is only around 1700 lines long, and nicely commented I might add.

Widgets 1.0 (v2)

Today the W3C published the Second Public Working Draft of the Widgets 1.0 Specification. It’s been nearly a year since we published the first public working draft (11 Nov, 2006) and much has changed and been added to the spec (…and it still has a long long way to go yet before it will be finished!). The most notable additions to this version of the spec are the attempt to standardize a subset of the Zip specification and support for digital signatures using XML Digital Signatures. Unfortunately, a lot of exciting things that are under discussion by those participating in the standardization effort have not made it into this latest draft. For example, we are still trying to work out a nice model for automatic updates, but we should have something drafted up fairly soon.

The main problem I’ve been working on over the last two months is trying to specify a subset of Zip that should be used by widgets. My goal has been to define a subset that is interoperable across all platforms and devices in such a way that it also ensures longevity. As you might imagine, this has proven to be quite a challenge…

The issues with Zip

The Zip file format is what is commonly referred to as a de facto standard: it is not formally specified by any standards body, but it is so widely implemented that it is interoperable across OSs and devices. This seems great on the surface, but when you try to standardize it, it becomes quite a nightmare. The main issues are these:

  • There are competing Zip specifications and there are many versions of each of the Zip specifications.
  • Different versions of the Zip specification are implemented across different platforms and OSs.
  • There are many features in Zip that are desirable (e.g. UTF-8 support), but are not widely implemented.
  • Zip is not an “open standard”, it is the property of PKWARE.
  • Zip is periodically updated and PKWARE does not provide any links to previous versions of their specs.

Competing Zip specifications

There are essentially two Zip specifications that applications make use of: the “official” PKWARE Zip Application Notes and the “unofficial” Info-Zip Application Notes (mostly on Unix). The unofficial notes basically take whatever PKWARE has officially published, which then gets modified, or otherwise clarified, by the guys at Info-Zip. In this sense, much of what one finds in the Info-Zip specs is identical to the PKWARE Zip spec. But, because PKWARE actually maintains the official spec, the PKWARE spec is always more up-to-date than what Info-Zip has on its website. For instance, the latest version of the Info-Zip notes covers version 6.2.0 of the official Zip spec (26 April 2004), while the latest version of the Zip spec is 6.3.2, which came out in September 2007. So Info-Zip is three years behind PKWARE!

Problem: the Info-Zip notes contain details that pertain to how Info-Zip works and may not be compatible/interoperable with the PKWARE Zip spec. For example, Info-Zip contains details about how to handle Unix permissions, while PKWARE’s Zip spec does not. This might not make the file formats incompatible, but it does make them physically different. You can try this out yourself: zip up a file using Info-Zip’s zip implementation and then zip up the same file using Windows’ Compressed Folders. The results will be different, but you should still be able to decompress the Info-Zip file using Windows’ native Zip implementation.
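If you would rather not compare the raw bytes by hand, here is a minimal sketch in Python (my choice of tool, using only the standard zipfile module) that lists the per-entry metadata where archives from different implementations tend to differ: the host system recorded in "version made by" and the Unix permission bits tucked into the external attributes. The helper name describe_entries and the file name config.xml are made up for illustration, and the archive is built in memory rather than with Info-Zip or Compressed Folders.

```python
import io
import zipfile

def describe_entries(data: bytes) -> list[dict]:
    """Return per-entry metadata that can differ between Zip implementations."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [
            {
                "name": info.filename,
                # Host system from "version made by": 0 = MS-DOS/FAT, 3 = Unix
                "create_system": info.create_system,
                # Unix-made archives keep the file mode in the high 16 bits
                "unix_mode": oct(info.external_attr >> 16),
            }
            for info in zf.infolist()
        ]

# Build a small in-memory archive that records Unix permissions
# (hypothetical entry, standing in for a file zipped on Unix).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    info = zipfile.ZipInfo("config.xml")
    info.create_system = 3                # mark the entry as made on Unix
    info.external_attr = 0o100644 << 16   # regular file, rw-r--r--
    zf.writestr(info, "<widget/>")

entries = describe_entries(buf.getvalue())
for entry in entries:
    print(entry)
```

Running the same function over an archive made on Windows should show a different host system and no Unix mode, even though both archives decompress to the same file.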

Different versions of the Zip specification are implemented across different platforms, OSs, Specs

Another significant issue from a standardization perspective is that packaging formats are making use of either some Info-Zip spec or some PKWARE spec. Significant examples include:

Java/JAR (including WAR and EAR):
Info-ZIP Application Note 19970311
Open Document Format (ODF):
Info-ZIP Application Note 19970311
Office Open XML – Open Packaging Convention (OOXML-OPC):
PKWARE Zip Application Note (version 6.2.1), but with a bunch of clarifications.
OEBPS Container Format 1.0:
PKWARE Zip Application Note (no explicit version, but at least version 2.0 needed to extract and version 4.5 needed to extract Zip64).
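To make the "version needed to extract" idea concrete, here is a rough Python sketch (again the standard zipfile module, which is my assumption and not something any of these specs mandate). Each entry in a Zip archive records the minimum spec version a reader needs, stored as an integer where 20 means version 2.0 and 45 means version 4.5. The helpers versions_needed and extractable are hypothetical names.

```python
import io
import zipfile

def versions_needed(data: bytes) -> dict[str, int]:
    """Map each entry name to its "version needed to extract" (20 means 2.0)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {info.filename: info.extract_version for info in zf.infolist()}

def extractable(data: bytes, max_supported: int = 45) -> bool:
    """True if no entry asks for a newer Zip version than max_supported."""
    return all(v <= max_supported for v in versions_needed(data).values())

# Small in-memory archive standing in for a widget package (hypothetical).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("index.html", "<html></html>")

needed = versions_needed(buf.getvalue())
print(needed)
print("OK for a 4.5 reader:", extractable(buf.getvalue(), 45))
```

A conformance checker could use something like this to reject packages that rely on features newer than the version a spec allows.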

I still have little idea as to what version of the Zip specification is actually implemented on each OS, let alone on mobile devices (information that seems to be quite difficult to come by!). As a result, and after some discussion with Jon Ferraiolo of IBM, I decided to base the Widget Spec on the OEBPS-OCF’s conformance requirements for Zip packages. I was tempted to make the widgets specification conform to the OOXML-OPC spec (put away your tomatoes!) because, in my opinion, the container aspects and conformance requirements are well specified (even if the rest of OOXML is “evil”).

Desirable features in Zip (6.3.2)

There are a number of really cool features in Zip that would make specifying a container format for widgets much better. They include:

  • Strong Encryption (using X.509 digital certificates): basically solves the digital signature problem, I think.
  • UTF-8 support: solves a significant part of the internationalization problem.
  • Zip64: future proofing.

To require widget engines to actually support these features puts a fair bit of strain on makers of widget engines. At this point, we have required that implementers support UTF-8 and Zip64.
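As a rough illustration of what UTF-8 support means at the file level, here is a small Python sketch using the standard zipfile module (my choice, not something the Widgets spec prescribes). In the PKWARE notes, bit 11 of an entry's general purpose flags signals that the file name is encoded as UTF-8. The helper utf8_names and the file names are invented for the example; allowZip64=True simply permits Zip64 records if an archive ever grows large enough to need them.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: file name and comment are UTF-8

def utf8_names(data: bytes) -> dict[str, bool]:
    """Map each entry name to whether its UTF-8 flag is set."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {i.filename: bool(i.flag_bits & UTF8_FLAG) for i in zf.infolist()}

# In-memory archive with one ASCII name and one non-ASCII name (hypothetical).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", allowZip64=True) as zf:
    zf.writestr("index.html", "<html></html>")
    zf.writestr("café.html", "<html></html>")

flags = utf8_names(buf.getvalue())
print(flags)
```

Python only sets the flag when a name cannot be written as plain ASCII, which is one reason an engine that ignores the flag may appear to work until it meets a non-English file name.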

Zip is not an open standard

The fact that Zip is proprietary might be something that comes back to bite us on the ass. I’m no lawyer, but there may be patent/IPR issues surrounding Zip. I’m also not sure how PKWARE will feel about WAF specifying a subset of their specification. I’ve emailed PKWARE to inform them of what we are doing and requested that they review the spec. They have responded and said that they will look into it.

Where to from here…

Looking forward, I’d really like to get all the physical and logical packaging stuff done. That includes:

  • Anything Zip related
  • The inter-package addressing model
  • How to handle decompression
  • How to name files in ASCII and UTF-8

I’d also really like to nail down the auto-updates model and make sure that the manifest language we are specifying covers all the common use cases. The security model is the elephant in the room 🙂 No one wants to touch it at this point, but we know it’s a massive issue. Another massive issue is the APIs… but that’s not something I want to get into now. A big issue for me is internationalization. I’ve been blocked a number of times when I’ve proposed doing internationalization using folders… every widget engine except Opera does it, so I think we should do it too.