Why manifests and permissions don’t mix

I often get requests to add fine-grained API permissioning to the Web Manifest format. Fine-grained permissioning would be something like: “I want to access contacts, but I only need read-only access”. The opposite is coarse-grained permissioning: “I want to access contacts… and do all the things!”.
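To make the distinction concrete, here is a minimal sketch of how a manifest might declare the two styles. The field names (`permissions`, `access`) and the `isAllowed` helper are assumptions for illustration, loosely inspired by app-manifest formats, and are not taken from the Web Manifest spec.

```typescript
// Hypothetical shapes: an entry without "access" is coarse-grained
// ("do all the things"); an entry with "access" is fine-grained.
type Access = "readonly" | "readwrite";

interface PermissionEntry {
  description: string;
  access?: Access;
}

interface AppManifest {
  name: string;
  permissions: Record<string, PermissionEntry | undefined>;
}

// Decide whether a manifest permits a read or a write on a given API.
function isAllowed(manifest: AppManifest, api: string, wantsWrite: boolean): boolean {
  const entry = manifest.permissions[api];
  if (entry === undefined) return false;       // API never requested
  if (entry.access === undefined) return true; // coarse-grained: everything
  return wantsWrite ? entry.access === "readwrite" : true;
}

const fineGrained: AppManifest = {
  name: "Address book viewer",
  permissions: {
    contacts: { description: "Show your contacts", access: "readonly" },
  },
};

const coarseGrained: AppManifest = {
  name: "Address book editor",
  permissions: {
    contacts: { description: "Manage your contacts" },
  },
};
```

With these two manifests, a write to contacts is refused for the fine-grained app and accepted for the coarse-grained one.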

There are quite a few problems with trying to rely on a manifest to grant permissioning – and we can easily see what they are by looking at platforms that rely on manifests for permissioning (e.g., Android, iOS, and Firefox OS). As iOS’s permissioning closely mirrors the Web’s API permissions model, I’m not going to discuss it (irrespective of apps being vetted through an App Store which weeds out most bad actors). The problems with Android’s up front permissioning model are well known (i.e., few people read/understand them), so we don’t even need to go there. It’s bad. End of story.

That leaves Firefox OS (FxOS). In this post, I’ll critically examine the issues that come up with FxOS’s use of a manifest to enable permissions and the somewhat sad implications of deciding to do permissioning in a manifest. FxOS is also a good platform to focus on because it is the platform that, despite being proprietary, most closely resembles the Web Platform: hence, we can learn a lot about what the implications would be if we brought the same permissioning model to the Web.

As a full disclaimer, I work for Mozilla who make FxOS (but I don’t work directly on FxOS!). I’m also one of the key contributors to the Web Manifest spec and I’m implementing the Web Manifest spec in Gecko.

FxOS’s permissioning

FxOS supports both coarse-grained and fine-grained permissioning in the manifest. Beyond the API capabilities afforded by the Web Platform, web apps targeted at FxOS can only enable permissions once they are explicitly “installed” by the user. FxOS exposes a proprietary API to enable installing a web site.

The implication is that web apps are restricted in functionality until users complete an installation process. However, because users are not able to experience the full application until they install it, developers may be forced to nag users to install their app (e.g., by showing big “Install me!” buttons). This takes users out of context and forces them to do something that they might otherwise not want to do (install an app) to complete some task.

To overcome the “install me!” button problem, FxOS relies on having a centralized app store (Firefox marketplace), which is quite similar to Apple’s App Store or Google’s Play Store in how it operates. More on this below.

Security/Privacy sensitive Device API access

FxOS restricts the APIs a Web App can access to those exposed by the Web Platform plus a tiny handful of other proprietary APIs. Web apps targeting these FxOS-proprietary capabilities are known as “hosted apps”.

To allow access to more powerful APIs, FxOS relies on a proprietary packaged application format, which bundles a digital signature and the resources of an application (CSS, HTML, JS) into a Zip file. And, of course, it then relies on the Firefox marketplace to do the distribution. Packaged apps created by the average developer are called “certified apps”. An app is “certified” in that Mozilla checks it and, if it meets a set of criteria, it gets a stamp of approval in the form of a digital signature (a lightweight form of DRM).

Packaged applications rely on a manifest being available within a (zip) package – and because of its immediate availability, FxOS can use it as a means to enable fine-grained permissions. However, this comes with fairly severe scalability issues – and brings all the issues of traditional software distribution with it:

  • Developers must submit their application to a market place – leading to centralization.
  • Submitting an application to an App Store triggers a lengthy review process – and access to device/storage APIs is checked by a human reviewer. However, what we are essentially doing here is making the decision to allow apps to use certain APIs – and giving users an assurance that this is ok (i.e., at runtime, we don’t prompt for security permissions – only for privacy ones like geo and accessing camera/microphone, etc.).
  • PKI infrastructure must be established, because APIs are only enabled if Mozilla’s signature is in the package.
  • Packaged apps don’t work well with the way the web is architected – and this has led Mozilla to introduce fairly insecure APIs to overcome these limitations. For example, FxOS allows developers to request a “SystemXHR” API, which essentially allows packaged apps to contact any server, ignoring CORS in the process.

How the Web Manifest differs

FxOS’s approach is in stark contrast with the way we (the community working on standardizing the manifest) are approaching the development and standardization of the Web Manifest format. The manifest is designed to be a non-critical and low priority resource: the manifest should only be downloaded “when needed” – and it should be treated like a “progressive enhancement”. That is to say, web apps should not depend on it being either present or supported… it’s just “the cherry on top” of a web app’s ice cream.

Effectively, this means that only when the user attempts to “add to home screen” will the user agent initiate the download of the manifest. This naturally limits the utility of the manifest as a place to put permission policies for APIs – as the APIs could only become available after the manifest loads… leading to some bad race conditions.

Currently, the Web Manifest spec explicitly says not to delay the load event of the document. So, this means that there are limitations in detecting when the manifest actually loads (this is by design).

Additionally, in Gecko, we intend to support two ways of actually loading a manifest:

  1. <link rel="manifest">: non-blocking, but could fire an onload event to signal readiness.
  2. HTTP’s Link: header. As it’s outside the Document, it doesn’t interface with it (though it could, but it would be a bit of a layering violation).
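As a rough illustration of the second route, the sketch below pulls a manifest URL out of an HTTP Link header value. The `findManifestLink` helper is hypothetical, and the parsing is deliberately naive (it splits on commas, so it would mishandle a URL that contains one); it is not how Gecko does it.

```typescript
// Return the target of the first link whose rel includes "manifest",
// or null if the header carries no such link.
function findManifestLink(linkHeader: string): string | null {
  for (const part of linkHeader.split(",")) {
    const target = /<([^>]*)>/.exec(part);
    const rel = /;\s*rel\s*=\s*"?([^";]*)"?/i.exec(part);
    if (target === null || rel === null) continue;
    const url = target[1];
    const relValue = rel[1];
    if (url === undefined || relValue === undefined) continue;
    // rel may hold several space-separated relation types.
    const relTypes = relValue.trim().toLowerCase().split(/\s+/);
    if (relTypes.indexOf("manifest") !== -1) return url;
  }
  return null;
}

const found = findManifestLink(
  '</style.css>; rel="stylesheet", </manifest.json>; rel="manifest"'
);
const missing = findManifestLink('</page2>; rel="next"');
```

Note that nothing in the document itself changes when the header is present, which is the layering point made in item 2.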

The implication is that, as a dev, one would need to wait for the manifest to be available and ready before granting API permissions. This is not ideal, from my perspective, because it introduces significant delay and dependency on a resource that may never become available. A security implication of 2 is that a proxy can insert a header (though a proxy could also change the payload, so also affecting 1).
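The timing problem can be sketched with a toy model. Everything here (`ManifestGatedApi`, `manifestLoaded`, `request`) is hypothetical and only simulates a user agent that grants permissions once a lazily loaded manifest arrives.

```typescript
// Toy model: permissions only exist once the manifest has been loaded.
type ManifestState = "not-requested" | "loaded" | "failed";

class ManifestGatedApi {
  private state: ManifestState = "not-requested";
  private granted: string[] = [];

  // Called by the (imaginary) user agent if and when the manifest arrives.
  manifestLoaded(permissions: string[]): void {
    this.state = "loaded";
    this.granted = permissions;
  }

  // Called if the fetch fails; to the app this looks the same as a
  // manifest that simply has not loaded yet.
  manifestFailed(): void {
    this.state = "failed";
  }

  // The answer depends on timing the app does not control.
  request(permission: string): "granted" | "denied" {
    if (this.state !== "loaded") return "denied";
    return this.granted.indexOf(permission) !== -1 ? "granted" : "denied";
  }
}

const api = new ManifestGatedApi();
const beforeLoad = api.request("contacts"); // page script runs first
api.manifestLoaded(["contacts"]);
const afterLoad = api.request("contacts");  // same call, different answer
```

The same call is denied before the manifest loads and granted afterwards, and if the manifest never arrives the app stays locked out.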

Hopefully this explains why there has been reluctance (from me, at least) to add security critical permissions to the manifest, and instead continue to rely on the Web APIs themselves to handle permissioning.

So what’s the alternative

A different approach would be to simply limit the usage of an API when it is initialized – or limit access to an API from same origin, or the top-level browsing context, and maybe even requiring TLS. This is similar to what Service Workers and requestAutoComplete() are doing – they both require TLS. Doing so overcomes many of the issues I raised above: If security/context conditions are met, APIs can be used immediately without having to wait on some other external resources that dictates the policy of usage. This also removes fragility, in that policy is tightly bound to the API at point of usage.
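A minimal sketch of what binding the policy to the API at point of use could look like follows. The `ApiPolicy` and `CallingContext` shapes and the `mayUse` function are assumptions made up for this example; real user agents perform these checks internally.

```typescript
// Hypothetical description of where a call is coming from.
interface CallingContext {
  scheme: "http" | "https";
  origin: string;         // origin of the calling document
  topLevelOrigin: string; // origin of the top-level browsing context
}

// Hypothetical per-API policy, shipped with the API itself.
interface ApiPolicy {
  requireTls: boolean;
  requireSameOriginAsTopLevel: boolean;
}

// The decision needs only the policy and the current context:
// no external resource has to load first.
function mayUse(policy: ApiPolicy, ctx: CallingContext): boolean {
  if (policy.requireTls && ctx.scheme !== "https") return false;
  if (policy.requireSameOriginAsTopLevel && ctx.origin !== ctx.topLevelOrigin) {
    return false;
  }
  return true;
}

const strictPolicy: ApiPolicy = { requireTls: true, requireSameOriginAsTopLevel: true };

const securePage: CallingContext = {
  scheme: "https",
  origin: "https://app.example",
  topLevelOrigin: "https://app.example",
};

const embeddedFrame: CallingContext = {
  scheme: "https",
  origin: "https://ads.example",
  topLevelOrigin: "https://app.example",
};
```

Here `mayUse(strictPolicy, securePage)` succeeds immediately, while the cross-origin frame is refused, without either call waiting on a manifest.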

It also removes the need for app stores – but again forces API and UX designers to deal with the permission problems (instead of centralizing everything and having to have other humans do a review of the app before users can use it).

Bottom line: let’s use the Web’s security model + TLS to do API permissioning.

Web 2024 – A response to Robin Berjon’s post

I read Robin Berjon’s “Web 2024” and was kinda surprised how much our views of the future of the Web differ – though I agree with many things, especially with books turning into “apps” and the TV industry just doing it wrong. I think Robin was probably trying to drum up support for an exciting and somewhat positive vision, while sending a warning to others that if they don’t start to “get it”, they will go the way of the dodo (or Nokia).

This is my take on where we could be in 2024 and response to Robin’s write up. My vision is not pretty and isn’t what I want to happen, but what I feel will likely happen unless there is a radical shift in the way we build and standardize the platform.

Be warned, I’m a “the glass is half empty!” kinda guy.

Before presenting my history, some key things I fundamentally disagree with Robin about:

  • The rise of single page apps just ain’t gonna happen. Single page apps are unicorns. I proved that statistically already and I don’t see it ever becoming mainstream. If we fix page transitions (so as to avoid the flash of unstyled content as you navigate from one page to another), then single page apps are unnecessary. Yeah, it’s that simple, Robin! :)
  • JSON will die way before 2024. It’s a shitty standard and the lack of support for comments and trailing commas makes it double shitty – it’s even worse than XML in that respect. It’s tremendously difficult to maintain and write. Something better will undoubtedly replace it way before 2024 (or browsers will start being more liberal about how they treat common errors leading to a new standard).
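As a quick illustration of the strictness being complained about, the sketch below feeds a few inputs to the standard `JSON.parse`. The `parsesAsJson` helper is just a name made up for this example.

```typescript
// Report whether the built-in JSON parser accepts the given text.
function parsesAsJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch (_err) {
    return false;
  }
}

const plain = parsesAsJson('{"name": "app"}');
const trailingComma = parsesAsJson('{"name": "app",}');
const lineComment = parsesAsJson('{"name": "app"} // why not?');
const blockComment = parsesAsJson('{ /* note */ "name": "app" }');
const trailingWhitespace = parsesAsJson('{"name": "app"}   ');
```

Only the plain object and the one followed by whitespace are accepted; the trailing comma and both comment styles are rejected outright.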

A history of the Web from 2014 to 2024

In the run up to 2024, a few attempts were made at making a browser in JavaScript but they all failed early on (around 2016 and then again in 2020). Engines like Gecko and Blink tried this (from before 2014) and were not able to implement as many features of the browser as they wanted in JS, because JS can’t access the things C++ code can, and it was not possible to implement APIs using JS for Workers (not making this up! this is a limitation of Gecko today that is not going away). JavaScript, despite its many advances, was still too far behind the curve of other languages to be competitive – it just lacked too many features, speed, and niceties when compared to the likes of Swift, Rust, Go, and even newer modern languages that emerged in the 2018-2022 period. Coupled with Apple’s marketing machine, and Swift’s ease of use over Java, Objective-C, and JavaScript, many developers quickly became iOS converts, leaving the Web out of frustration.

JavaScript

TC-39 felt the threat and tried to adapt (this time for real, having laughed off Dart into total irrelevance in 2013 despite Google’s fake/marketing-driven “standardization” of it in 2014 through ECMA); but the pace at which TC-39 standardized new features, and at which those features became available to the dependent native platforms, was too slow. Unfortunately, in light of advances made in Swift and Rust and even crappy C++, JS just couldn’t hold its own in the app space. There was just too much legacy and browser “magic” baggage there: the inability, for instance, to use Object.observe() on host objects both confused and annoyed developers. And nobody got the whole “proxies” thing. Even the darling Node.js faded in favour of Go and other new emerging technologies.

JavaScript, of course, didn’t die or anything: it remained the lingua franca of the Web that it was in 2014 – but it was only in 2022 that it gained interesting features like enums, protocols, or generics. Interestingly, JS classes did become available in mid 2017 across all browsers but, lacking generics and protocols, using classes didn’t really take off. Getting a module system did help tho, and it became quite widely used by 2020.

Web Components

JavaScript aside, things were looking up for the Web. After 2014’s Google IO the Web Components revolution finally began – and Service Workers were coming down the pipe and became usable in apps by 2017. Chrome’s Service Workers implementation landed in late 2015, closely followed by Mozilla’s in 2016. Microsoft came soon after, but Apple held off till 2018 so no one could realistically use SWs in their apps till Apple finally supported them… and yeah, the iPhone 9 is pretty awesome, but I’m not allowed to talk about it.

Having Web Components was great, because it meant that HTML as a language was more or less done and developers were finally free to focus on creating their own elements that best represented the needs of their applications… except when they hit problems: mainly, this was to do with the preload scanner and other predictive magic the browser was trying to perform. Web components simply couldn’t explain the platform (or HTML elements) in the way its designers had hoped. The RICG had hit this problem early in 2012-2013 and warned the Web Components people about it. But there was nothing that could really be done without exposing more of the guts of the browser (which required a lot of reengineering that browser vendors were not willing to do). So, web components were fairly successful but with some limitations. Thankfully, Client Hints started getting added to browsers in 2016 and it helped with many of the blockers around web components. Again, Apple held off supporting Web Components till 2017 so realistically they could not be used in production (not without needing to download a ton of polyfill code that just kept on growing).

Demise of the W3C

An interesting side effect of “finishing” HTML in 2014 was the W3C’s slow demise into irrelevance. The writing had been on the wall for a long time, as the W3C continued to pursue an increase in member participation instead of providing technical leadership. It also couldn’t really compete with the WHATWG and other community driven projects to add new features to browsers. The W3C’s inability to adapt its process to cater for living standards left more and more participants disillusioned and further pushed browser makers to do their standardization work at the WHATWG and new emerging community driven efforts. The W3C shifted focus and became a place to “standardize” formats and other mostly irrelevant XMLy and other research/academic projects (so sad right now!). The last hold-out was the CSS Working Group, but it too eventually broke up as a means of adding CSS features became available to developers (i.e., a form of Web Components, but for CSS). By 2020, the browser vendors had all but abandoned the W3C – those that remained only stayed there for marketing reasons but didn’t contribute anything technically.

Bit more about Service Workers

As mentioned, the parallel Moz/Google development of service workers brought a great deal of innovation to the web platform. We could finally create apps that reliably worked offline – and jQuery 4 and new versions of Angular made this a breeze to set up and use. The missing bit was having the ability to “install” a web app as one installs a real native application. Another great win was Mozilla and Google forcing Service Workers to be exclusively used over HTTPS. This really helped against pervasive monitoring (much to the annoyance of the NSA, CIA, and advertisers as they could not spy as freely as before on many new Web apps).

The WebOS killed our last chance at interop

Despite Mozilla’s and Google’s attempts to standardize a manifest format to allow installable web apps, it never really took off. Under the noses of everyone back in 2014, the web was already undergoing major fragmentation. A lot of people knew this, but chose to pretend that it was actually a good thing for the Web… and maybe it was in hindsight. It really started with Chrome OS, but was rapidly legitimized by the growth and popularity of Firefox OS. By 2018, Firefox OS had taken a foothold in the lower end of the market, capturing around 3% of global market share (the increase could have been greater, but very cheap Android phones put a lot of pressure on Mozilla which caused its growth to slow). By 2016, it was too late to turn back. Mozilla had invested very heavily into their proprietary platform (“Web” APIs, dev tools, docs, marketplace, etc.) and ditching it all for lousy/half-baked W3C alternatives didn’t make economic sense (even if they were royalty free). Additionally, it would have been too expensive to deprecate and rewrite the Firefox OS platform to make use of standardized APIs… so they didn’t bother and just kept going with Firefox OS as it was, even if no other platform supported the APIs. But what the hell, it was the “open” alternative.

The fragmentation problems were twofold:

  1. The steady rise of the “embedded web view”.
  2. A large and deliberate attempt to fragment the web into silos: Chrome OS, Firefox OS, and Windows 8 apps.

Rise of Servo and the embedded web view

Embedded web views were something the clever people who started PhoneGap tuned into early on, and Adobe bought PhoneGap early on, realizing its potential. Despite the goal of having PhoneGap become irrelevant (by pushing the Web to provide the functionality it provides) – it actually went the other way! PhoneGap/Cordova continued to grow in relevance and popularity. In 2014-2015, Mozilla had also jumped on board along with Google, Microsoft, etc. and the Cordova APIs became the de facto standard by 2016 without undergoing W3C standardization.

To get into this war for the embeddable engine, Mozilla ramped up development of Servo. Making Servo into an embeddable platform made it a great drop-in replacement for WebKit (after losing more contributors to Blink, Apple eventually forked WebKit and made it an internal project in 2016). Servo’s embedding API and use of Rust meant it quickly became a serious contender against WebKit. Tools like PhantomJS ditched WebKit. By 2019, Gecko was finally killed off and Servo was put in its place. The combination of Servo’s reliance on Rust and WebIDL proved to be a winner here. It meant that developers could quickly add new custom features to the platform using a relatively easy to use language (Rust). This made it even more attractive than Blink as an embeddable browser engine.

Lastly, as Firefox OS moved to using Servo instead of Gecko, it started to support native applications written in Rust. This followed the trend set in 2014, where Google started supporting Android (Java) applications in Chrome OS. By 2024, most apps are written in Rust and a lot make use of Servo as a WebView component. This is great for viewing web content – but most serious app development is now done with either Rust, Java, Foopy (it’s awesome, wait till you play with it!) or Swift. The Web remains mostly a publishing medium.

I got in early on the whole Foopy thing. Made a mint and retired to Portugal and I now have a small fig farm (mostly sell jams and cakes… it’s nice).

W3C TAG elections

I recently learned through reading Alex Russell’s blog that Google had nominated him as a candidate in the upcoming W3C Technical Architecture Group elections. I thought this was great, as I more often than not find myself in violent agreement with Alex on how browsers should expose their guts to developers (amongst other things). As Alex put it:

I’m running to try to turn the TAG into an organization that has something to say about the important problems facing devs building apps today; particularly how new specs either address or exacerbate those challenges.

I thought it would be great to finally have someone who cares about the challenges that Web developers face represented on the TAG. So it then came to me as a bit of a humbling surprise that I had also been nominated (by Nokia) and asked to run by Robin Berjon. Admittedly, I was hesitant (and I still am) as I don’t know much about the TAG.

To us humble outsiders, the TAG has always been the Ivory Tower of the old guard of the Web: you know, where the guys that started it all pontificate about the nuances of URIs, httpRange-14 (don’t worry, I have no idea what the hell that is either!), and the mythical semantic web.

Because of the somewhat obscure range of topics, the TAG’s discussions have been the butt of many jokes on the Web (e.g., the fake tag) and humorous pictures on W3C Memes. It has also become synonymous with architecture astronautism. This is a shame, because, as Alex points out, it could be a force for the greater good, but its interaction with other working groups has generally been limited (and certainly does not appear to be focused on pursuing the issues relevant to Web developers who end up using the stuff coming out of the W3C).

Given the negative perception of the TAG, I basically share Alex’s “goal of turning the TAG into an advocacy organization for the interests of webdevs.” If elected, I want to work with other “reformist-minded candidates” (namely, Anne van Kesteren and Yehuda Katz) towards making that happen.

What and how?

Some proactive things that could be done by the TAG to meet the goal above include:

  • Take the discussion to where developers are (Google+, Twitter, GitHub, etc.) – ask them what the TAG should focus on (or make the case to developers to show that there is value in the TAG).
  • Talk to developers and find out what their pressing issues are. Do this by attending actual developer conferences and similar forums. See if we can make the TAG something cool and respected again!
  • Instead of publishing findings at the W3C, publish findings in the popular developer press (e.g., A List Apart, Smashing magazine, .Net magazine, HTML5 Rocks, or similar) – i.e., where developers can actually read the findings, and in a common voice. Make TAG members available for interviews to media.
  • Make time available to talk to developers on a regular basis (e.g., bi-monthly Q&A sessions on Google+).
  • Help developer-based Community Groups (e.g., the RICG and the Extensible Web CG) with navigating the process of adding things to the Web platform.
  • Work more closely with WebApps WG, System Apps, HTMLWG/WHATWG to make sure their API designs stay in sync and don’t cause developers unnecessary pain.
  • Advocate to W3C Working Groups for clearer specs that meet the needs of developers as well as implementers.

If you have more/better ideas of what could be done to make the TAG more relevant to developers, please let me know in the comments.

How to vote

Unfortunately, voting is open to W3C members only. If you are a member, you need your AC rep to nominate a candidate (instructions).

What’s my pitch

This is what I submitted to the W3C as my pitch to get votes:

Over the last 6 years, Marcos’ background in interaction design has brought a unique perspective to Web standards. Long before there was the “Native Apps vs Web Apps” debate, Marcos was leading the charge to standardise installable web applications at the W3C through the Widgets family of specifications. Until recently, Marcos worked as a software architect at Opera Software, where he led the team that created Opera Widgets and Extensions platforms. Aside from his work on installable web applications, Marcos has been involved in numerous efforts to bring device APIs to the Web. To the TAG, Marcos can bring hands-on experience dealing with the architectural challenges that come from designing, deploying, and running installable web apps – and how those apps can safely interact with device APIs. For more information about Marcos’ qualifications, please see Marcos at LinkedIn.

The W3C has also published the list of other candidates.