Zip files and Encoding – I hate you.

I’ve written about some of the issues with depending on zip as a packaging format in the past. As people know, Web Apps is depending on Zip as the packaging format for Widgets.

Zip the good

Zip has a lot going for it. It is ubiquitous and dependable… so long as you don’t want to share files across cultures.

Zip the bad

The Zip spec does not seem to know that there are normalization forms for Unicode, when there are actually four (NFC, NFD, NFKC and NFKD), or more, because there are some non-standard ones too! The Zip spec gives no guidance as to how file names inside zip files are to be normalized.

Consider: when a zip file is created on Linux, it just writes the bytes for the file name in the encoding of the underlying file system. So, if the file system is in ISO-8859-1, the bytes are written in ISO-8859-1. This may seem OK, but when you decompress the zip file on Windows, which runs on Windows-1252, the file names get all mangled. If the underlying encoding of the file system on Linux is something else, you won’t be able to share files with other systems at all. So in this case, it is not Windows’ fault.

The Zip spec says that the only supported encodings are CP437 and UTF-8, but everyone has ignored that. Implementers just encode file names however they want (usually byte for byte as they are in the OS… see table below).

It gets worse! Because Mac OS runs on some weird non-standard decomposed Unicode mode, you can only share zip files with other Mac OS users. According to this email, the LimeWire guys also ran into a similar problem with regard to encodings in Mac OS:

“for example a French, German or Spanish Windows user cannot exchange files that contain [file names with] French, German or Spanish accents with a French, German or Spanish Macintosh users”

The following table illustrates the problem:

Bytes that represent ñ in a Zip file (in hex)

File name   Zip in Windows                  Zip in Linux         Zip in Mac OS
ñ           A4 (Extended US-ASCII/CP437)    C3 B1 (UTF-8 NFC)    6E CC 83 (UTF-8 NFD)

Yes! Holy crap! Three different byte sequences, corresponding to different character encodings and normalization forms.
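The three byte sequences are easy to reproduce. The following is a minimal Python sketch that uses only the standard library; the variable names are mine and are not taken from any zip tool.

```python
import unicodedata

# ñ as a single precomposed code point (U+00F1)
name = "\u00f1"

# What a CP437-based zip tool would write
cp437_bytes = name.encode("cp437")

# UTF-8 with the composed form (NFC)
nfc_bytes = unicodedata.normalize("NFC", name).encode("utf-8")

# UTF-8 with the decomposed form (NFD): "n" followed by a combining tilde
nfd_bytes = unicodedata.normalize("NFD", name).encode("utf-8")

print(cp437_bytes.hex())  # a4
print(nfc_bytes.hex())    # c3b1
print(nfd_bytes.hex())    # 6ecc83
```

All three sequences stand for the same visible character, which is why a byte-for-byte comparison of file names across platforms fails.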

The only way around this would be a *special* custom-built widget zipping tool that normalizes file name strings to NFC. If the widget engine needs to decompress the widget to disk, then it would take the NFC names and convert them to the operating system’s native encoding (or store the files in memory, and reference them that way). This affects the URI scheme and DOM normalization of Widgets, so Web Apps will have to deal with it eventually… but I’m not sure exactly how.
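As a rough illustration of what such a tool might do, here is a minimal Python sketch that repackages a zip archive with every entry name normalized to NFC. The function name `normalize_zip_names` is hypothetical. The sketch also assumes that the names in the source archive were decoded correctly in the first place, which, given the mess described above, is a big assumption.

```python
import io
import unicodedata
import zipfile


def normalize_zip_names(src_bytes: bytes) -> bytes:
    """Return a copy of the archive with all entry names in NFC.

    Assumption: Python's zipfile has already decoded the entry names
    correctly (it uses UTF-8 when the archive flags it, CP437 otherwise).
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            nfc_name = unicodedata.normalize("NFC", info.filename)
            # Non-ASCII names are written as UTF-8 by zipfile
            dst.writestr(nfc_name, src.read(info))
    return out.getvalue()
```

A widget engine could then compare names in NFC only, and convert to whatever the local file system wants at extraction time.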

Position on Widget Security

I recently submitted a paper to the W3C’s Workshop on Security for Access to Device APIs from the Web. The workshop is being held in London on the 10th of December, 2008. The website for the workshop informs us that:

With the emergence of the Web as a compelling alternative to locally installed applications, security issues are an increasing obstacle for realizing the full potential of the Web, in particular when Web applications developers need to get access to features not traditionally available in the browsing environment: cameras, GPS systems, connectivity and battery levels, external applications launch, access to personal data (e.g. calendar or addressbook), etc.

The goal of this workshop is to bring together people from a wide variety of backgrounds (API designers, security experts, usability experts, …) to discuss the security challenges involved in allowing Web applications and widgets to access the APIs that allow to control these features, and to advise the W3C on appropriate next steps for any gap that needs to be addressed with new technical work.

As all the papers of the workshop are now available on-line, I thought I would republish my position paper here. Being a child of the social networking era, my paper focuses on securing widgets using a community-driven approach to control rogue software. I propose that control of security policies be handed over to trusted communities. Anyway, I won’t spoil the fun by rewriting the paper here. I also include the comments I got back from the panel reviewers at the bottom of the post.

Towards a community-controlled security policy for Widgets
Marcos Caceres, W3C Invited Expert

Traditional software security models have had limited success in
halting the spread of malware. Some studies suggest that roughly 1/10
of all computers connected to the Internet are running some sort of
malware. Software security firms have effectively given up trying to
keep track of the number of infected computers.

The Web Apps working group is currently working towards standardizing
a class of client-side Web application colloquially referred to as a
"Widget". At the time of writing, widgets lack a standardized
security model that would make users less susceptible to widgets as
malware. A new standardized approach is needed to protect users and
limit widgets from further exacerbating the software security
problem. This is particularly important on mobile devices (one of the
target platforms for Widgets), which already outnumber PCs; a number
that is only set to increase as prices on mobile devices continue to
decrease and the convergence between Internet and device capabilities
increasingly adds value to the social dimensions of the Web.

The mobile industry has largely avoided the software security problems
that plague PCs through (a) strict control over what software can go
on devices, or (b) by requiring that developers have their software
digitally signed by a vendor or some other trusted third party to
access device features. However, the stringent controls have resulted in:

 o Lack of innovation within the mobile software application space,
where only those who can afford to pay for a digital certificate are
allowed to access APIs on devices.
 o High barrier of entry for developers, because of the cost of buying
a code signing certificate.
 o Developers circumventing security policies and distribution models
(e.g. jail-breaking the iPhone).
 o The creation of an extremely closed and anti-competitive
environment, where a single vendor can "kill" applications for any
arbitrary reason (as Apple has done a number of times on the iPhone).

At the same time as malware has grown exponentially, a different
phenomenon has shown itself resilient to vandalism and attacks on the
Web. Namely, Wikipedia has sustained a high level of quality of
content by leveraging a community's interest to keep the quality of
content high.

In this paper, I take the position that a different approach to
software security is needed for widgets: one that builds on
traditional ways of keeping software secure, but also attempts to
leverage the community of users that use widgets. In other words, a
community-driven security model for widgets whereby the social layer
of the Web forms "trusted authorities" that have the ability, through
existing Web protocols, to determine and dynamically adapt the security
privileges of Widgets running on end users' devices (or even recommend
the removal of Widgets from an end user's device, if the trusted
authority deems that a widget has become malicious).

Standardization of Widgets through Web Apps Working Group
Widgets are client-side applications that make use of Web
technologies, such as HTML, CSS, and JavaScript, instead of compiled
programming languages such as Java or C#.  Although not formally
specified at the time of writing, the security model that underpins
widgets is one of almost total lockdown: by default, a widget is only
allowed to access resources within its own package (widgets are
always packaged in a Zip file). To get access to the network, a widget
must explicitly request it through a configuration document. Widgets,
however, rely on the limited functionality provided by browsers' APIs
in order to do anything remotely useful. For example, widgets from
various vendors rely on the XMLHttpRequest object to make asynchronous
requests to fetch data from the Web. The Web Applications Working
Group seeks to change this by allowing widget authors to have access
to APIs beyond those provided by today's Web browsers. To this effect,
the Working Group formulated the following requirement in the Widgets
1.0 Requirements document [1].

"R21. Feature Access Declarations
A conforming specification MUST specify or recommend a means to allow
authors to declare that an instantiated widget will require access to
device specific standardized features or proprietary features (e.g. a
proprietary API to access the camera on a device). A conforming
specification MUST be specified in such a way that fallback
relationships can be declared so that if one feature is unavailable,
another can be declared as a possible substitute. In addition, a
conforming specification MUST provide authors with a means of stating
which features are optional and which features are mandatory for a
widget to run."

Feature access, in this context, generally refers to accessing APIs to
device capabilities. The current proposal for standardization of
feature access is to declare, within a widget's configuration
document, a <feature> element. The feature element, as specified at
the time of writing, has two attributes: name and required. Name is a
URI that identifies the feature, and required means that the feature
is required for the widget to run. Feature elements can be nested,
forming a fallback relationship. So, if the outermost feature is
unavailable, then the widget user agent will attempt to use the next
inner feature like so:

<widget xmlns="">
   <feature name="uri:tryMeFirst">
       <feature name="uri:tryMeSecond"/>
   </feature>
</widget>
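The fallback behaviour described above can be sketched in a few lines of
Python. This is only an illustration under my own assumptions: the
function name resolve_feature and the "available" set are hypothetical,
and the namespace declaration is left out of the sample because its
value is not given here.

```python
import xml.etree.ElementTree as ET

# Sample configuration document; the xmlns value is omitted on purpose
CONFIG = """
<widget>
   <feature name="uri:tryMeFirst">
       <feature name="uri:tryMeSecond"/>
   </feature>
</widget>
"""


def resolve_feature(feature, available):
    """Walk from the outermost feature inwards and return the name of
    the first one the (hypothetical) widget engine supports, or None."""
    while feature is not None:
        name = feature.get("name")
        if name in available:
            return name
        # Fall back to the next inner feature, if any
        feature = feature.find("feature")
    return None


root = ET.fromstring(CONFIG)
outer = root.find("feature")

# Engine that only knows the second feature: the fallback is used
print(resolve_feature(outer, {"uri:tryMeSecond"}))
```

If neither feature is available the function returns None, and it would
then be up to the widget engine to decide, based on the required
attribute, whether the widget can run at all.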

It also needs to be stated at this point that widgets can be digitally
signed. Creating a digital signature for widgets involves hashing all
the resources inside a widget package to produce a digital signature
resource. This resource is stored inside the widget as
"signature.xml". Multiple signatures by different vendors may be
included in a widget package.
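
The hashing step could look roughly like the Python sketch below. It
only computes one digest per resource and does not produce an actual
signature. The choice of SHA-256, the function name resource_digests,
and the decision to skip "signature.xml" itself are my assumptions for
the sake of the example.

```python
import hashlib
import io
import zipfile


def resource_digests(package_bytes: bytes) -> dict:
    """Map each resource in a widget package to a hex digest.

    Assumptions: SHA-256 as the hash, and the signature resource
    itself is excluded because it cannot contain its own digest.
    """
    digests = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        for info in pkg.infolist():
            if info.is_dir() or info.filename == "signature.xml":
                continue
            digests[info.filename] = hashlib.sha256(pkg.read(info)).hexdigest()
    return digests
```

A signer would then sign the collected digests, and a widget engine
would recompute them at install time to check that nothing in the
package has changed.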

Securing feature access
Consider the following hypothetical example. A developer has created a
widget that requires the use of the W3C's Geolocation API. In the
example below, the Geolocation API is identified by URI, which the
widget engine is able to recognize.

<widget xmlns="">
   <feature name="" required="yes" />
</widget>

Then, at runtime, there are essentially four ways that a widget can
be allowed to access the feature:

1. The end user is prompted for permission to use the feature.
2. If the widget was digitally signed by a trusted source, a vendor
may grant the widget access to the feature.
3. The widget was packaged and digitally signed with a resource that
grants it permissions.
4. The widget engine acquires a list of features that a widget is
allowed to access from one or more trusted sources on the Web.

In the case of 1, prompting the user, it is generally known that
end-users will click "yes" without fully understanding the
consequences of what they are doing. Hence, leaving security decisions
solely to end-users does not generally help with security. Case 2,
relying on digital certificates, is costly for both
vendors/publishers and developers because it requires that every
widget be checked and signed by a single source authority. It also
requires that the root certificate of the authorizing signer be on
every device, which is economically infeasible and likely technically
impossible. Case 3, including signed permissions, suffers from the
same problems as 2. It also suffers in that once the widget has been
released into the wild, the permissions are effectively baked into the
widget package. Case 4, dynamically acquiring permissions to
access features from a trusted source, overcomes many of the problems
with 2 and 3, but requires an infrastructure that would offload the
quality assurance and privacy management and device capability access
to some trusted authority on the Web. However, I would argue that
model 4 is a natural evolution of community run Widget
galleries/Software review sites on the Web, which already provide
ratings for widgets by communities of users.  What are currently
lacking are the mechanisms that would allow such a model to emerge.

Of course, there are problems with the community driven security
approach. For instance, the permissions server may be unavailable or
the policy delivered for a widget may be incorrect. The system could
be circumvented by members of the community to grant a widget access
to more features. However, as with Wikipedia, community protocols can
be established to limit such things from happening. Another problem
with this approach is that it requires the developer to have their
widget verified by as many trusted authorities as possible, to make
sure that their widget runs on as many devices as possible. Despite
its limitations, I would still argue that, together with traditional
software security models, this proposed model may further assist in
reducing the risk of widgets becoming malware.

How it would work in practice
1. A developer creates a widget and submits it to one or more trusted
authorities for review. As part of the widget, they declare the
features they require at runtime.
2. The trusted authority may grant the widget access to the features it
requested by providing some sort of downloadable permissions file.
3. If the widget starts misbehaving, members of the community of the
trusted authority may reduce the feature privileges of a widget, or
even send a warning to end-users that the widget has been deemed
malicious.
4. The widget engine periodically verifies each widget with one or more
trusted authorities, dynamically adjusting the security policies as
needed.
5. If a malicious developer submits a widget, and it is granted feature
rights, the widget can only do limited damage before it is discovered
and disabled by the community.
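
The periodic check in the steps above could be sketched as follows.
Everything here is hypothetical: the Policy and Widget shapes, the
function name refresh_permissions, the rule that several authorities
are combined by intersection, and the choice to keep the last known
state when no authority can be reached are all assumptions of mine, not
part of any specification.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Policy:
    """One trusted authority's verdict on a widget (hypothetical shape)."""
    allowed: Set[str]
    malicious: bool = False


@dataclass
class Widget:
    requested: Set[str]
    granted: Set[str] = field(default_factory=set)
    disabled: bool = False


def refresh_permissions(widget: Widget,
                        policies: Iterable[Optional[Policy]]) -> None:
    """Re-evaluate a widget against the latest policies.

    A None entry stands for an authority that could not be reached.
    """
    reachable = [p for p in policies if p is not None]
    if not reachable:
        # Assumption: keep the last known state if nobody answers
        return
    if any(p.malicious for p in reachable):
        # Any authority flagging the widget disables it outright
        widget.disabled = True
        widget.granted = set()
        return
    # Assumption: grant only what every reachable authority allows
    allowed = set.intersection(*(p.allowed for p in reachable))
    widget.disabled = False
    widget.granted = widget.requested & allowed
```

A widget engine would call something like this on a timer, fetching a
fresh policy from each authority the user or operator has chosen to
trust.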

And the panel said:

Reviewer 1:
The notion of declaring in advance what feature access is needed is a good one. It’s similar to Necula’s notion of proof-carrying code. This is a good idea, regardless of how the authorization is done.

W00T! +1 to me! 🙂

I’m less enthusiastic about the multiple authorities notion. The paper suggests that developers pick the authorities, a notion that is ripe with potential for abuse. Even if all of the authorities are honest — itself quite an assumption — users may have different values. Some, for example, may trust the EFF’s notion of privacy and legitimate monitoring; others might prefer the RIAA. Why should the developer decide?

No no no! Users or operators can pick the authority! As if I would let the developer pick the authority, that is stupid. The developer is the hacker so of course they don’t pick the authority!

I’m also leery of community-based control of automated decisions. Wikipedia works (when it does work; the abuses are, of course, well known) because people read it and update it. What is to stop a botnet from flooding an authority with bogus “yes” votes for a malicious widget? At the least, this question should be addressed.

“When it works?!” WTF? It works just great as far as I can tell. What’s to stop the bogus bots? The same mechanisms and interests that stop the bots from destroying Wikipedia. It’s also the same mechanism that stops P2P networks from falling apart: the community. The community filters the content. And yeah, some bad stuff gets through, but it’s quickly filtered out before it does any damage. Ask the movie and media industry: they’ve made no progress in stopping piracy of movies and TV shows through BitTorrent. Now apply the same to widgets.

Reviewer 2
It’s pretty notional, and the author doesn’t really seem to understand how the analogy between wikipedia and web trust breaks down in a number of ways. But the discussion could be fun and good for brainstorming.

I wish the reviewer had elaborated on what I don’t get.

Get off the soap box and dive on how this would really work in the face of attempts to subvert it. That’s the hard problem. Right now, the paper has a bit too much hand waving.

Maybe this reviewer missed the “how it works in practice” section.

Remember how little time it takes for a spammer to get what they want. How will you deal with time lag exposures? (for example)

I proposed polling white/black lists. It’s better than the crap security models we have at the moment. It’s how anti-virus software currently does it. I’m proposing a simple solution. I don’t think there is some magic notification system that could be implemented.

Reviewer 3
This is an interesting paper but I think it introduces more problems than it solves. Nevertheless, I think this would be a good topic of discussion for the workshop.

It would also be nice if reviewer three had listed what those problems were. Guess I’ll hear about them at the workshop.

Anyway, lots of great papers have been submitted to the workshop. I’ve not read all of them, but from what I’ve read so far, I recommend the following papers:

Climbing Mt. Kilimanjaro and Mt. Meru

Made it! Uhuru Peak!
Marcos and Arusha, top of Kili

I recently had the opportunity to climb Mt. Kilimanjaro and Mt. Meru with my friend Anne and his dad. For Kilimanjaro, we hiked the Machame route, also known as the Whisky route. The hike took six days, and we successfully reached the summit! I want to share some thoughts about the experience here with regard to equipment, the mountains, Tanzania and human matters. However, before I get to that, I thought of some general rules for climbing Kilimanjaro.

Five General Rules

  1. Don’t skimp on equipment: buying cheap equipment will likely come back and bite you hard. Equipment is intended to keep you alive, so if you value your health, then get good equipment (see my equipment list below).
  2. The mountain will try to kill you: OK, this is a little overdramatic, but if you pretend it is, you will have much more fun. The mountain does, in fact, kill a lot of people per year. According to Wikipedia, nearly 35 people die on average on the mountain per annum (around 15 tourists, 20 porters).
  3. Always carry your wet weather gear with you because nothing dries on the mountain: Kili is a massive mountain surrounded by tropical rainforest. It creates its own weather system, which is extremely unpredictable. Expect rain, snow, high winds and blistering hot sun – all in one day! Despite all the crazy weather, there is a general pattern: nights and mornings are usually clear and quite cold, particularly the higher you get. As the day heats up, the forests below release moisture and soon you will find yourself in cloud. By mid afternoon it will likely be raining. When it rains, you need to have your wet weather gear handy. Don’t be an idiot and leave it in your backpack, as it is unlikely you will see your porter throughout the day. You and your equipment will get wet. But how wet you get is up to you. If you don’t protect your backpack and your sleeping bag gets wet, you risk pneumonia or worse. If your boots get wet, you risk blisters and infection. Also, you need to be prepared to drink 3-4 liters of water a day to keep hydrated. And, on the final ascent, protect your water from the cold (which can be extreme, we had at least -10 °C to -15 °C). If exposed, your water will freeze or you will end up drinking extremely cold water, which will cool your body temperature and make you feel unwell.
  4. “Polé Polé,” is the key: Polé Polé means something like “slowly slowly”, which is the key to getting up the mountain (but seems to be a way of life in Tanzania). If you go too quickly, you will quickly be overcome with altitude sickness and you’ll find yourself with your head between your legs vomiting up your lunch and with a massive altitude headache. Not nice. Quite a few keen hikers found themselves in this predicament and promptly had to turn back. Also, don’t let anyone rush you. Go at your own pace, particularly as you get closer to the summit. People experience hallucinations as they get closer to 5000 meters. One of the groups we were with started seeing hands on their shoulders and had the feeling they were being followed by people that were not there… perhaps Kili is haunted by the hundreds of people that have died there over the years; alas, as an atheist, I’m not allowed to believe in such things 😉
  5. Don’t be an asshole, you are there as a team. Porters are not your slaves! Just because you pay the tour company to provide you with porters, it does not mean you should not help them carry things. Remember, porters get paid what basically amounts to less than US$5 per day. The first day is the hardest for everyone, so do everyone a favor and carry your own pack up the mountain. If you see a porter struggling to carry food, then help them! Don’t just stand there like an asshole saying “oh! they should not carry so much stuff! that is really sad.” If it starts raining, don’t be an asshole and just stand there while the porters get wet. Help them put up the tent. Help them with the dishes. Help them with whatever you can. And give them a good tip (see end of this post for tipping info) 🙂

What we paid

We went with a company called Victoria Tours. The team they assigned to us were OK for Mt. Meru and excellent for Kilimanjaro. More on that later. We paid US$4000 for both walks for 3 people, excluding tips for porters and guides (which totaled around 10%). The money also paid for 3 nights accommodation at a 1 star hotel (they cost about TZ$15,000 per night). We stayed at the Mt. Meru House, where Victoria Tours’ office is located.

Mt. Meru

Mt Meru

I strongly recommend you hike Mt. Meru before you hike Kilimanjaro.

Not only is Mt. Meru more challenging than Kili, it also offers a great opportunity to see wildlife (particularly on day 1 – we saw all the usual suspects: giraffes, zebra, buffalo, gnus, baboons, etc. which you don’t see on Kili) and to get fit and acclimatized, and it provides some amazing views of Kilimanjaro, which just makes you want to climb it more.

Equipment for Kilimanjaro

Prior to going on the trip, I spent a lot of time trying to find the appropriate equipment to take. I spent nearly 1000 Euros (AU$2000) on new equipment, which complemented some old equipment I had. If you are considering getting into hiking, you probably need close to 2000 Euros worth of equipment (see list below). Yes! Hiking equipment is expensive, but there is a good reason for that: its purpose is to keep you alive in extreme conditions. This is not so relevant on Kilimanjaro, where you are usually hiking with a lot of people; but more so if you are planning to hike at other locations alone (as I sometimes like to do).

If you intend to climb Kilimanjaro, tour companies will tell you that they will provide you with the equipment you need. DON’T USE THEIR EQUIPMENT, IT’S MOSTLY CRAP! You SHOULD take your own equipment. The equipment that the companies in Tanzania have is mostly unsuitable for Kilimanjaro and will likely fail you. By fail, I mean, for instance, that any waterproof jacket they give you will not be waterproof or the tent they give you will leak, etc.

Equipment (MUST take, and by MUST I mean MUST in the RFC2119 sense!)

  • day pack (~20-35 liter backpack)
  • Plastic (or synthetic) pack cover for day pack and for backpack.
  • 1 set of thermal underwear (top & bottom)
  • 1 sleeping bag (rating 0 C or four seasons)
  • 1 silk liner for sleeping bag
  • 1 warm jersey/sweater
  • 1 sleeping mat
  • 1 pair of track suit top & bottom (for sleeping)
  • 1 light towel
  • 1 polar fleece/down vest
  • 2 pair of light loose fitting cordura nylon (quick dry) trousers
  • 1 waterproof jacket and pants
  • 1 short sleeves shirt (quick drying synthetic material, no cotton – you will wear this for 3 days!)
  • 1 long sleeves shirt (quick drying synthetic material, no cotton – you will wear this for 3 days!)
  • 4-6 pairs of good quality hiking socks
  • 1 t-shirt (spare, cotton ok)
  • 1 pair of hiking boots (waterproof/goretex)
  • 1 pair of sneakers (or sandals)
  • 1 pair of warm heavy weight gloves/mittens
  • 1 pair of gaiters
  • 1 pair of light weight gloves (inner gloves)
  • 1 pair of cycling gloves
  • 1 bandana (good for first day, and when it gets hot)
  • 1 wide brim hat
  • 1 pair of sunglasses
  • 1 balaclava
  • 1 wool hat
  • 1 warm scarf
  • 2 x 1.5 liter water bottles or camel bag (4 liters)
  • 1 head lamp (plus spare batteries & bulb)
  • 1 pair of walking poles
  • 1 pocket swiss army knife (or better)
  • 1 travel pillow (optional)
  • 1 small first aid kit (it is unlikely that your guide will carry a first aid kit)
  • 1 toiletries bag (what to put in it is below!)
  • 3 Large black garbage bags
  • 8 small plastic bags

In toiletries bag you MUST bring:

  • antibacterial soap
  • antiseptic cream
  • deodorant (roll on)
  • sunscreen (35+, but 50+ preferred as you get burned really easily at high altitude)
  • anti-fungal talc powder (for feet and to stop any crotch-rot)
  • 12 Imodium tablets (Loperamide HCl BP 2mg) – diarrhea pills
  • Malaria pills
  • Moleskin (or some kind of blister protection)
  • Insect repellent (only for first day)
  • Leukoplast – Natural rubber adhesive medical tape (2.5cm x 5meters)
  • toothpaste/toothbrush/dental floss
  • water purification tablets
  • antibiotics (if you have trouble getting them from your GP, just get ’em in Tanzania at any pharmacy; they are happy to sell you anything there! 🙂 )
  • 40 panadol/aspirin/ibuprofen (even if you don’t need them, there is always someone who does!)

Equipment you MAY want to bring

  • Tent (4 season): We had a tent from the company. It leaked and generally sucked. Don’t go bringing a 5 kg, 10 person tent! Get a light tent that weighs no more than 2 kilos!
  • Purely Optional

    • mobile phone – There is reception on the whole mountain. Txt your loved ones and let them know your progress and that you are OK.
    • mp3 player – loaded with music and audio books: like all hiking, it is a good time to do some soul searching and pondering while listening to your favorite tunes. It’s also a great time to get in some reading. My favorite book to listen to on walks is Neal Stephenson’s Snow Crash. The book made new sense to me in Tanzania: where the rules of the society and commerce are governed by monetary corruption and turbo capitalism, which has led to widespread poverty as a result of IMF/World Bank imposed deregulation and the selling off of state assets to foreign interests. Essentially, Tanzania has no industry so the general economy seems to be made up of people just selling little bits of food, clothing, and daily necessities to each other. In a lot of ways,
    • gps – always fun to know how high you are, especially if you don’t take drugs.

    Some useful suggestions for when you are hiking

    • Wear two pairs of socks all the time: this will stop you from getting blisters.
    • Always wear pants: shorts suck.
    • Always wear your gaiters: this will stop little stones and sticks getting into your boots. It will also protect you if it suddenly starts to rain or snow.
    • Clean your water bottles every few days as they may become septic (you will know this because they will smell bad). If bottles become septic, it’s your fault. Don’t blame your porters. Your mouth is full of bacteria and other nasty stuff, which, if you are not careful, will contaminate your bottle and can make you sick.

    Tipping on the walks

    Tip 10% of what your group paid. But at the same time, if you can, give a little more. The unemployment rate in Tanzania is over 60%. In the areas that supply the porters it can be as high as 80%. Remember, these poor dudes get about US$5 a day to carry all your shit up the mountain. That’s what most westerners on the mountain make in like 20 minutes of work per day. The least you can do is honestly ask yourself, “how much would they have to pay me to carry my own stuff up the mountain?”. Ask yourself that on the last day, when you are up above 4000 meters, and then you will get a sense of how hard-core being a porter or a guide is.

    Having mentioned tipping, I really hate tipping. I think that Tanzanian companies should just include the tip in the price, standardize their prices and compete on features, etc. I think the whole way Tanzanians do business is really fucking backwards and really fucking stupid (I have no kinder words for it). All the bullshit about not having standardized prices reinforces the corruption in the society. All the bargaining for everything is an absolute waste of time and seems to be motivated by infantile greed. I’m sure it can be shown to be universally detrimental to the economy as a whole.