Why manifests and permissions don’t mix

I often get requests to add fine-grained API permissioning to the Web Manifest format. Fine-grained permissioning would be something like: “I want to access contacts, but I only need read-only access”. The opposite is coarse-grained permissioning: “I want to access contacts… and do all the things!” (i.e., read, write, update, and delete).

There are quite a few problems with relying on a manifest to grant permissions – and we can easily see what they are by looking at platforms that rely on manifests for permissioning (e.g., Android, iOS, and Firefox OS). As iOS’s permissioning closely mirrors the Web’s API permissions model, I’m not going to discuss it (irrespective of apps being vetted through an App Store, which weeds out most bad actors). The problems with Android’s up-front permissioning model are well known (i.e., few people read or understand the permissions they are asked to grant), so we don’t even need to go there. It’s bad. End of story.

That leaves Firefox OS (FxOS). In this post, I’ll critically examine the issues that come up with FxOS’s use of a manifest to enable permissions and the somewhat sad implications of deciding to do permissioning in a manifest. FxOS is also a good platform to focus on because it is the platform that, despite being proprietary, most closely resembles the Web Platform: hence, we can learn a lot about what the implications would be if we brought the same permissioning model to the Web.

As a full disclaimer, I work for Mozilla, which makes FxOS (but I don’t work directly on FxOS!). I’m also one of the key contributors to the Web Manifest spec and I’m implementing the Web Manifest spec in Gecko.

FxOS’s permissioning

FxOS supports both coarse-grained and fine-grained permissioning in the manifest. Beyond the API capabilities afforded by the Web Platform, web apps targeted at FxOS can only enable permissions once they are explicitly “installed” by the user. FxOS exposes a proprietary API to enable installing a web site.
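To make the fine-grained versus coarse-grained distinction concrete, here is a minimal sketch of how a manifest-declared permission might be evaluated. It is loosely modelled on the shape of FxOS-style permission entries, but every name here (`AppManifest`, `PermissionEntry`, `isAllowed`, and the two sample manifests) is hypothetical and chosen for illustration, not taken from any real implementation.

```typescript
// Hypothetical model of manifest-declared permissions (illustrative names only).
type Access = "readonly" | "readwrite";
type Operation = "read" | "write";

interface PermissionEntry {
  description: string;
  access?: Access; // omitted = coarse-grained: every operation is allowed
}

interface AppManifest {
  name: string;
  permissions: Record<string, PermissionEntry>;
}

// Decide whether an app may perform `op` on `api`, based only on its manifest.
function isAllowed(manifest: AppManifest, api: string, op: Operation): boolean {
  if (!Object.prototype.hasOwnProperty.call(manifest.permissions, api)) {
    return false; // the API was never requested
  }
  const entry = manifest.permissions[api];
  if (entry.access === undefined) return true; // coarse-grained grant
  if (entry.access === "readonly") return op === "read"; // fine-grained grant
  return true; // "readwrite"
}

// Fine-grained: "I want to access contacts, but I only need read-only access".
const fineGrained: AppManifest = {
  name: "Address book viewer",
  permissions: {
    contacts: { description: "Show your contacts", access: "readonly" },
  },
};

// Coarse-grained: "I want to access contacts... and do all the things!"
const coarseGrained: AppManifest = {
  name: "Address book editor",
  permissions: {
    contacts: { description: "Manage your contacts" },
  },
};

console.log(isAllowed(fineGrained, "contacts", "write"));   // false
console.log(isAllowed(coarseGrained, "contacts", "write")); // true
```

Note that the whole decision hinges on the manifest being available before the API is first used, which is exactly the dependency discussed in the rest of this post.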

The implication is that web apps are restricted in functionality until users complete an installation process. However, because users are not able to experience the full application until they install it, developers may be forced to nag users to install their app (e.g., by showing big “Install me!” buttons). This takes users out of context and forces them to do something that they might otherwise not want to do (install an app) to complete some task.

To overcome the “install me!” button problem, FxOS relies on having a centralized app store (Firefox marketplace), which is quite similar to Apple’s App Store or Google’s Play Store in how it operates. More on this below.

Security/Privacy sensitive Device API access

FxOS restricts the APIs a Web App can access to those exposed by the Web Platform plus a tiny handful of other proprietary APIs. Web apps targeting these FxOS-proprietary capabilities are known as “hosted apps”.

To allow access to more powerful APIs, FxOS relies on a proprietary packaged application format, which bundles a digital signature and the resources of an application (CSS, HTML, JS) into a Zip file. And, of course, it then relies on the Firefox marketplace to do the distribution. Packaged apps created by the average developer are called “privileged apps”. An app is “privileged” in that Mozilla checks it and, if it meets a set of criteria, it gets a stamp of approval in the form of a digital signature (a lightweight form of DRM).

Packaged applications rely on a manifest being available within a (zip) package – and because of its immediate availability, FxOS can use it as a means to enable fine-grained permissions. However, this comes with fairly severe scalability issues – and brings all the issues of traditional software distribution with it:

Developers must submit their application to a marketplace – leading to centralization.
Submitting an application to an App Store triggers a lengthy review process – and access to device/storage APIs is checked by a human reviewer. However, what we are essentially doing here is making the decision to allow apps to use certain APIs – and giving users an assurance that this is ok (i.e., at runtime, we don’t prompt for security permissions – only for privacy ones like geo and accessing camera/microphone, etc.).
PKI infrastructure must be established, because APIs are only enabled if Mozilla’s signature is in the package.
Packaged apps don’t work well with the way the web is architected – and this has led Mozilla to introduce fairly insecure APIs to overcome these limitations. For example, FxOS allows developers to request a “SystemXHR” API, which essentially allows packaged apps to contact any server, ignoring CORS in the process.
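To illustrate why ignoring CORS is a concern, here is a deliberately simplified sketch contrasting a CORS-style check with a request path that skips it. It is not how FxOS or any browser implements this: it models only the `Access-Control-Allow-Origin` comparison for a non-credentialed request, and `ResponseInfo`, `corsAllowsRead`, and `systemXhrAllowsRead` are hypothetical names.

```typescript
// Simplified model: only the Access-Control-Allow-Origin comparison for a
// non-credentialed request is shown. Real CORS has many more rules.
interface ResponseInfo {
  allowOrigin?: string; // value of Access-Control-Allow-Origin, if the server sent one
}

// With CORS, the server gets a say in which origins may read its responses.
function corsAllowsRead(
  requestOrigin: string,
  targetOrigin: string,
  res: ResponseInfo
): boolean {
  if (requestOrigin === targetOrigin) return true; // same-origin: always readable
  return res.allowOrigin === "*" || res.allowOrigin === requestOrigin;
}

// A request path that ignores CORS never consults the server's policy at all.
function systemXhrAllowsRead(): boolean {
  return true;
}

const app = "https://app.example";
const bank = "https://bank.example";

console.log(corsAllowsRead(app, bank, {}));                    // false: server did not opt in
console.log(corsAllowsRead(app, bank, { allowOrigin: app }));  // true: server opted in
console.log(systemXhrAllowsRead());                            // true: server is never asked
```

In the second model the only thing standing between the app and any server is the review that happened before the app was signed.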

How the Web Manifest differs

FxOS’s approach is in stark contrast with the way we (the community working on standardizing the manifest) are approaching the development and standardization of the Web Manifest format. The manifest is designed to be a non-critical and low-priority resource: the manifest should only be downloaded “when needed” – and it should be treated like a “progressive enhancement”. That is to say, web apps should not depend on it being either present or supported… it’s just “the cherry on top” of a web app’s ice cream.

Effectively, this means that only when the user attempts to “add to home screen” will the user agent initiate the download of the manifest. This naturally limits the utility of the manifest as a place to put permission policies for APIs – as the APIs could only become available after the manifest loads … leading to some bad race conditions.

Currently, the Web Manifest spec explicitly says not to delay the load event of the document. So, this means that there are limitations in detecting when the manifest actually loads (this is by design).
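The race described above can be sketched as follows. This is a toy model under stated assumptions, not user agent code: `ManifestGatedApi`, `applyManifest`, and `tryUse` are hypothetical names, and the ordering of events is simulated by plain sequential calls rather than real network activity.

```typescript
// Toy model of an API whose availability depends on a manifest that loads
// independently of the document (all names here are hypothetical).
class ManifestGatedApi {
  private granted = new Set<string>();

  // Called by the (imagined) user agent once the manifest has been fetched
  // and parsed, which may be long after the page's load event, or never.
  applyManifest(permissions: string[]): void {
    for (const p of permissions) this.granted.add(p);
  }

  use(api: string): string {
    if (!this.granted.has(api)) {
      throw new Error(`${api}: permission not (yet) granted`);
    }
    return `${api}: ok`;
  }
}

// Helper that reports the outcome instead of throwing.
function tryUse(gate: ManifestGatedApi, api: string): string {
  try {
    return gate.use(api);
  } catch (e) {
    return (e as Error).message;
  }
}

const gate = new ManifestGatedApi();

// 1. The page's load event fires; the manifest has not been fetched,
//    because fetching it does not delay the load event.
const atLoadEvent = tryUse(gate, "contacts");

// 2. Some time later the manifest arrives.
gate.applyManifest(["contacts"]);
const afterManifest = tryUse(gate, "contacts");

console.log(atLoadEvent);   // "contacts: permission not (yet) granted"
console.log(afterManifest); // "contacts: ok"
```

The same call succeeds or fails depending purely on timing, and the page has no reliable signal for when it becomes safe to retry.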

Additionally, in Gecko, we intend to support two ways of actually loading a manifest:

1. <link rel="manifest">: non-blocking, but could fire an onload event to signal readiness.
2. HTTP’s Link: header. As it’s outside the Document, it doesn’t interface with it (though it could, but it would be a bit of a layering violation).

The implication is that, as a dev, one would need to wait for the manifest to be available and ready before API permissions could be granted. This is not ideal, from my perspective, because it introduces significant delay and a dependency on a resource that may never become available. A security implication of option 2 (the Link header) is that a proxy can insert a header (though a proxy could also change the payload, which would affect option 1, the link element, as well).

Hopefully this explains why there has been reluctance (from me, at least) to add security-critical permissions to the manifest, and instead continue to rely on the Web APIs themselves to handle permissioning.

So what’s the alternative?

A different approach would be to simply limit the usage of an API when it is initialized – or limit access to an API to the same origin, or to the top-level browsing context, and maybe even require TLS. This is similar to what Service Workers and requestAutoComplete() are doing – they both require TLS. Doing so overcomes many of the issues I raised above: if security/context conditions are met, APIs can be used immediately without having to wait on some other external resource that dictates the policy of usage. This also removes fragility, in that policy is tightly bound to the API at the point of usage.
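As a rough sketch of what “policy bound to the API at the point of usage” could look like, the following checks the calling context directly instead of consulting a manifest. The types and names (`CallContext`, `ApiPolicy`, `mayUse`) are hypothetical; a real user agent would derive these facts from the browsing context itself rather than from a plain object.

```typescript
// Hypothetical description of the context an API is being called from.
interface CallContext {
  scheme: "http" | "https";
  callerOrigin: string;   // origin of the document making the call
  topLevelOrigin: string; // origin of the top-level browsing context
  isTopLevel: boolean;    // false when the caller is inside an iframe
}

// Hypothetical per-API policy, defined alongside the API itself.
interface ApiPolicy {
  requireTls: boolean;
  requireSameOrigin: boolean;
  requireTopLevel: boolean;
}

// Everything needed for the decision is known at call time:
// there is no external resource to wait for.
function mayUse(policy: ApiPolicy, ctx: CallContext): boolean {
  if (policy.requireTls && ctx.scheme !== "https") return false;
  if (policy.requireSameOrigin && ctx.callerOrigin !== ctx.topLevelOrigin) return false;
  if (policy.requireTopLevel && !ctx.isTopLevel) return false;
  return true;
}

const strictPolicy: ApiPolicy = {
  requireTls: true,
  requireSameOrigin: true,
  requireTopLevel: true,
};

const securePage: CallContext = {
  scheme: "https",
  callerOrigin: "https://app.example",
  topLevelOrigin: "https://app.example",
  isTopLevel: true,
};

const insecurePage: CallContext = { ...securePage, scheme: "http" };

const embeddedFrame: CallContext = {
  scheme: "https",
  callerOrigin: "https://ads.example",
  topLevelOrigin: "https://app.example",
  isTopLevel: false,
};

console.log(mayUse(strictPolicy, securePage));    // true
console.log(mayUse(strictPolicy, insecurePage));  // false: no TLS
console.log(mayUse(strictPolicy, embeddedFrame)); // false: cross-origin iframe
```

Because the check is synchronous and local, the result is the same no matter when the call happens, which avoids the timing problems of a manifest-driven policy.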

It also removes the need for app stores – but again forces API and UX designers to deal with the permission problems (instead of centralizing everything and having other humans review the app before users can use it).

Bottom line: let’s use the Web’s security model + TLS to do API permissioning.