Position on Widget Security

I recently submitted a paper to the W3C’s Workshop on Security for Access to Device APIs from the Web. The workshop is being held in London on the 10th of December, 2008. The website for the workshop informs us that:

With the emergence of the Web as a compelling alternative to locally installed applications, security issues are an increasing obstacle for realizing the full potential of the Web, in particular when Web applications developers need to get access to features not traditionally available in the browsing environment: cameras, GPS systems, connectivity and battery levels, external applications launch, access to personal data (e.g. calendar or addressbook), etc.

The goal of this workshop is to bring together people from a wide variety of backgrounds (API designers, security experts, usability experts, …) to discuss the security challenges involved in allowing Web applications and widgets to access the APIs that allow to control these features, and to advise the W3C on appropriate next steps for any gap that needs to be addressed with new technical work.

As all the papers of the workshop are now available on-line, I thought I would republish my position paper here. As a child of the social networking era, I focus the paper on securing widgets using a community-driven approach to control rogue software. I propose that control of security policies be handed over to trusted communities. Anyway, I won’t spoil the fun by rewriting the paper here. I also include the comments I got back from the panel reviewers at the bottom of the post.

Towards a community-controlled security policy for Widgets
Marcos Caceres, W3C Invited Expert

Traditional software security models have had limited success in
halting the spread of malware. Some studies suggest that roughly 1/10
of all computers connected to the Internet are running some sort of
malware. Software security firms have effectively given up trying to
keep track of the number of infected computers.

The Web Apps working group is currently working towards standardizing
a class of client-side Web application colloquially referred to as a
"Widget". At the time of writing, widgets lack a standardized
security model that would make users less susceptible to widgets as
malware. A new standardized approach is needed to protect users and
to keep widgets from further exacerbating the software security
problem. This is particularly important on mobile devices (one of the
target platforms for Widgets), which already outnumber PCs, a number
that is only set to increase as prices on mobile devices continue to
decrease and as the convergence between Internet and device
capabilities increasingly adds value to the social dimensions of
the Web.

The mobile industry has largely avoided the software security problems
that plague PCs through (a) strict control over what software can go
on devices, or (b) by requiring that developers have their software
digitally signed by a vendor or some other trusted third party to
access device features. However, the stringent controls have resulted
in:

 o Lack of innovation within the mobile software application space,
where only those who can afford to pay for a digital certificate are
allowed to access APIs on devices.
 o High barrier of entry for developers, because of the cost of buying
a code signing certificate.
 o Developers circumventing security policies and distribution models
(e.g. jail-breaking the iPhone).
 o The creation of an extremely closed and anti-competitive
environment, where a single vendor can "kill" applications for any
arbitrary reason (as Apple has done a number of times on the iPhone).

At the same time as malware has grown exponentially, a different
phenomenon has shown itself resilient to vandalism and attacks on the
Web. Namely, Wikipedia has sustained a high level of quality of
content by leveraging a community's interest to keep the quality of
content high.

In this paper, I take the position that a different approach to
software security is needed for widgets: one that builds on
traditional ways of keeping software secure, but also attempts to
leverage the community of users that use widgets. In other words, a
community-driven security model for widgets whereby the social layer
of the Web forms "trusted authorities" that have the ability, through
existing Web protocols, to determine and dynamically adapt the security
privileges of Widgets running on end-users' devices (or even recommend
the removal of Widgets from an end-user's device, if the trusted
authority deems that a widget has become malicious).

Standardization of Widgets through Web Apps Working Group
Widgets are client-side applications that make use of Web
technologies, such as HTML, CSS, and JavaScript instead of compiled
programming languages such as Java or C#.  Although not formally
specified at the time of writing, the security model that underpins
widgets is one of almost total lockdown: by default, a widget is only
allowed to access resources within its own package (widgets are
always packaged in a Zip file). To get access to the network, a widget
must explicitly request it through a configuration document. Widgets,
however, rely on the limited functionality provided by browsers' APIs
in order to do anything remotely useful. For example, widgets from
various vendors rely on the XMLHttpRequest object to make asynchronous
requests to fetch data from the Web. The Web Applications Working
Group seeks to change this by allowing widget authors to have access
to APIs beyond those provided by today's Web browsers. To this effect,
the Working Group formulated the following requirement in the Widgets
1.0 Requirements document [1].

"R21. Feature Access Declarations
A conforming specification MUST specify or recommend a means to allow
authors to declare that an instantiated widget will require access to
device specific standardized features or proprietary features (e.g. a
proprietary API to access the camera on a device). A conforming
specification MUST be specified in such a way that fallback
relationships can be declared so that if one feature is unavailable,
another can be declared as a possible substitute. In addition, a
conforming specification MUST provide authors with a means of stating
which features are optional and which features are mandatory for a
widget to run."

Feature access, in this context, generally refers to accessing APIs to
device capabilities. The current proposal for standardization of
feature access is to declare, within a widget's configuration
document, a <feature> element. The feature element, as specified at
the time of writing, has two attributes: name and required. Name is a
URI that identifies the feature, and required means that the feature
is required for the widget to run. Feature elements can be nested,
forming a fallback relationship. So, if the outermost feature is
unavailable, then the widget user agent will attempt to use the next
inner feature like so:

<widget xmlns="http://www.w3.org/ns/widget">
   <feature name="uri:tryMeFirst">
       <feature name="uri:tryMeSecond"/>
   </feature>
</widget>
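
As a rough illustration of the fallback behaviour described above,
the following Python sketch walks a nested feature chain from the
outside in and picks the first feature the engine supports. The
function names and the set of "available" features are hypothetical
and only for illustration; they are not part of any widget
specification.

```python
import xml.etree.ElementTree as ET

# Namespace of the widget configuration document, as used in the example.
NS = "{http://www.w3.org/ns/widget}"

# A well-formed version of the nested fallback example.
CONFIG = """\
<widget xmlns="http://www.w3.org/ns/widget">
   <feature name="uri:tryMeFirst">
       <feature name="uri:tryMeSecond"/>
   </feature>
</widget>
"""


def resolve_feature(feature, available):
    """Return the first feature name in a nested chain that is available.

    Starts at the outermost <feature> and moves inwards. Returns None
    when no feature in the chain is supported (hypothetical behaviour).
    """
    while feature is not None:
        name = feature.get("name")
        if name in available:
            return name
        # Move to the next inner <feature>, i.e. the declared fallback.
        feature = feature.find(NS + "feature")
    return None


def resolve_all(config_xml, available):
    """Resolve every top-level <feature> chain in a configuration document."""
    root = ET.fromstring(config_xml)
    return [resolve_feature(f, available) for f in root.findall(NS + "feature")]
```

For instance, an engine that only supports "uri:tryMeSecond" would
fall back to it, while an engine that supports neither would end up
with no feature for that chain.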

It also needs to be stated at this point that widgets can be digitally
signed. Creating a digital signature for widgets involves hashing all
the resources inside a widget package to produce a digital signature
resource. This resource is stored inside the widget as
"signature.xml". Multiple signatures by different vendors may be
included in a widget package.
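
To make the hashing step a little more concrete, here is a minimal
Python sketch that computes a digest for every resource in a widget
package. It does not produce a real "signature.xml". The choice of
SHA-256, the function names, and the demo package contents are all
my assumptions for illustration.

```python
import hashlib
import io
import zipfile


def digest_resources(package_bytes, skip=("signature.xml",)):
    """Return {resource name: SHA-256 hex digest} for a Zip widget package.

    Resources named in `skip` (by default the signature resource itself)
    are left out. SHA-256 is an assumed algorithm, not one named in the paper.
    """
    digests = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as pkg:
        for info in pkg.infolist():
            if info.is_dir() or info.filename in skip:
                continue
            data = pkg.read(info.filename)
            digests[info.filename] = hashlib.sha256(data).hexdigest()
    return digests


def build_demo_package():
    """Build a tiny in-memory widget package (hypothetical contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as pkg:
        pkg.writestr("config.xml", "<widget/>")
        pkg.writestr("index.html", "<html></html>")
    return buf.getvalue()
```

A signer would then sign the resulting digests, so that changing any
resource after signing invalidates the signature.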

Securing feature access
Consider the following hypothetical example. A developer has created a
widget that requires the use of the W3C's Geolocation API. In the
example below, the Geolocation API is identified by a URI, which the
widget engine is able to recognize.

<widget xmlns="http://www.w3.org/ns/widget">
   <feature name="http://www.w3.org/api/geo" required="yes" />
</widget>

Then, at runtime, there are essentially four ways that a widget can
be allowed to access the feature:

1. The end user is prompted for permission to use the feature.
2. If the widget was digitally signed by a trusted source, a vendor
may grant the widget access to the feature.
3. The widget was packaged and digitally signed with a resource that
grants it permissions.
4. The widget engine acquires a list of features that a widget is
allowed to access from one or more trusted sources on the Web.

In the case of 1, prompting the user, it is generally known that
end-users will click "yes" without fully understanding the
consequences of what they are doing. Hence, leaving security decisions
solely to end-users does not generally help with security. Case 2,
relying on digital certificates, is costly for both
vendors/publishers and developers because it requires that every
widget be checked and signed by a single source authority. It also
requires that the root certificate of the authorizing signer be on
every device, which is economically infeasible and likely technically
impossible. Case 3, including signed permissions, suffers from the
same problems as 2. It also suffers in that once the widget has been
released into the wild, the permissions are effectively baked into the
widget package. Case 4, dynamically acquiring permissions to
access features from a trusted source, overcomes many of the problems
with 2 and 3, but requires an infrastructure that would offload the
quality assurance, privacy management, and device capability access
to some trusted authority on the Web. However, I would argue that
model 4 is a natural evolution of community-run Widget
galleries/Software review sites on the Web, which already provide
ratings for widgets by communities of users.  What is currently
lacking are the mechanisms that would allow such a model to emerge.
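
One way to picture model 4 is as a lookup that combines what a widget
requests with what the user's trusted authorities allow. The Python
sketch below is only that, a sketch: the function name, the use of
None for an unreachable authority, and the rule that one approving
authority is enough are all assumptions of mine, not something the
working group has specified.

```python
from typing import Iterable, Optional, Set


def granted_features(requested: Set[str],
                     authority_policies: Iterable[Optional[Set[str]]]) -> Set[str]:
    """Return the requested features that some trusted authority allows.

    Each entry in `authority_policies` is the set of feature URIs one
    authority allows for this widget, or None if that authority could
    not be reached. An unreachable authority grants nothing.
    """
    allowed: Set[str] = set()
    for policy in authority_policies:
        if policy is not None:
            allowed |= policy
    # Only features the widget asked for AND an authority allows are granted.
    return requested & allowed
```

Under these assumptions a widget never gains a feature it did not
declare, and a widget whose authorities are all unreachable gets no
device features at all.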

Of course, there are problems with the community-driven security
approach. For instance, the permissions server may be unavailable or
the policy delivered for a widget may be incorrect. The system could
be circumvented by members of the community to grant a widget access
to more features. However, as with Wikipedia, community protocols can
be established to limit such things from happening. Another problem
with this approach is that it requires the developer to have their
widget verified by as many trusted authorities as possible, to make
sure that their widget runs on as many devices as possible. Despite
its limitations, I would still argue that, together with traditional
software security models, this proposed model may further help to
reduce the risk of widgets becoming malware.

How it would work in practice
1. A developer creates a widget and submits it to one or more trusted
authorities for review. As part of the widget, they declare the
features they require at runtime.
2. The trusted authority may grant the widget access to the features it
requested by providing some sort of downloadable permissions file.
3. If the widget starts misbehaving, members of the community of the
trusted authority may reduce the feature privileges of a widget, or
even send a warning to end-users that the widget has been deemed
malicious.
4. The widget engine periodically verifies each widget with one or more
trusted authorities, dynamically adjusting the security policies as
needed.
5. If a malicious developer submits a widget, and it is granted feature
rights, the widget can only do limited damage before it is discovered
and disabled by the community.
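
The life cycle above can be sketched in a few lines of Python. This
is a toy, in-memory model under my own assumptions: the class and
method names are hypothetical, there is no real network protocol or
permissions file format behind it, and a single authority stands in
for what would be one or more community-run services.

```python
class TrustedAuthority:
    """Hypothetical in-memory stand-in for a community-run policy server."""

    def __init__(self):
        self.policies = {}    # widget id -> set of granted feature URIs
        self.flagged = set()  # widget ids the community deems malicious

    def review(self, widget_id, requested):
        """Grant the features a submitted widget declared."""
        self.policies[widget_id] = set(requested)

    def reduce(self, widget_id, feature):
        """Community members withdraw one feature from a widget."""
        self.policies[widget_id].discard(feature)

    def flag(self, widget_id):
        """Community members deem the widget malicious: revoke everything."""
        self.flagged.add(widget_id)
        self.policies[widget_id] = set()

    def lookup(self, widget_id):
        """Return (current features, flagged?) for a widget."""
        return set(self.policies.get(widget_id, set())), widget_id in self.flagged


class WidgetEngine:
    """Toy widget engine that re-verifies widgets against one authority."""

    def __init__(self, authority):
        self.authority = authority
        self.granted = {}   # widget id -> features currently allowed
        self.warnings = []  # widget ids the user has been warned about

    def verify(self, widget_id):
        """One polling round: refresh privileges and record any warning."""
        features, flagged = self.authority.lookup(widget_id)
        self.granted[widget_id] = features
        if flagged and widget_id not in self.warnings:
            self.warnings.append(widget_id)
        return features
```

Each call to verify() stands for one polling round, so the window in
which a malicious widget can do damage is bounded by how often the
engine polls.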

And the panel said:

Reviewer 1:
The notion of declaring in advance what feature access is needed is a good one. It’s similar to Necula’s notion of proof-carrying code (http://raw.cs.berkeley.edu/pcc.html). This is a good idea, regardless of how the authorization is done.

W00T! +1 to me! 🙂

I’m less enthusiastic about the multiple authorities notion. The paper suggests that developers pick the authorities, a notion that is ripe with potential for abuse. Even if all of the authorities are honest — itself quite an assumption — users may have different values. Some, for example, may trust the EFF’s notion of privacy and legitimate monitoring; others might prefer the RIAA. Why should the developer decide?

No no no! Users or operators can pick the authority! As if I would let the developer pick the authority, that is stupid. The developer is the hacker so of course they don’t pick the authority!

I’m also leery of community-based control of automated decisions. Wikipedia works (when it does work; the abuses are, of course, well known) because people read it and update it. What is to stop a botnet from flooding an authority with bogus “yes” votes for a malicious widget? At the least, this question should be addressed.

“When it works?!” WTF? It works just great as far as I can tell. What’s to stop the bogus bot? The same mechanisms and interests that stop the bots from destroying Wikipedia. It’s also the same mechanism that stops P2P networks from falling apart: the community. The community filters the content. And yeah, some bad stuff gets through, but it’s quickly filtered out before it does any damage. Ask the movie and media industry: they’ve made no progress in stopping piracy of movies and TV shows through BitTorrent. Now apply the same to widgets.

Reviewer 2
It’s pretty notional, and the author doesn’t really seem to understand how the analogy between wikipedia and web trust breaks down in a number of ways. But the discussion could be fun and good for brainstorming.

I wish the reviewer had elaborated on what I don’t get.

Get off the soap box and dive on how this would really work in the face of attempts to subvert it. That’s the hard problem. Right now, the paper has a bit too much hand waving.

Maybe this reviewer missed the “how it works in practice” section.

Remember how little time it takes for a spammer to get what they want. How will you deal with time lag exposures? (for example)

I proposed polling white/black lists. It’s better than the crap security models we have at the moment. It’s how anti-virus software currently does it. I’m proposing a simple solution. I don’t think there is some magic notification system that could be implemented.

Reviewer 3
This is an interesting paper but I think it introduces more problems than it solves. Nevertheless, I think this would be a good topic of discussion for the workshop.

It would be nice also if reviewer three had listed what those problems were. Guess I’ll hear about them at the workshop.

Anyway, lots of great papers have been submitted to the workshop. I’ve not read all of them, but from what I’ve read so far, I recommend the following papers:

Yay for Text-to-Speech!

Thanks to the Opera browser, I recently (re)discovered the joys of Text-to-Speech technology. In particular, how good it is for proofreading. I was looking into VoiceXML stuff and voice activated browsing on the Opera website, and downloaded the speech expansion pack. If you haven’t tried it out, you should because it is pretty rad. Anyway, I got it to read back the W3C Note I am working on and to my [not so] great surprise, it started skipping words, and sentences made no sense. Obviously, this was not a reflection on the quality of the text-to-speech engine, but on my own abilities as a writer. It seems that I suffer from some form of dyslexia.

I had also had someone else read over the document, and they had not found as many errors as I found when proofreading with the text-to-speech engine. So it saved me a bit of embarrassment when submitting the document to be more formally reviewed by WAF-WG. I’ve now become a bit obsessed and decided to buy a more sophisticated text-to-speech engine that can integrate a bit more with Office and other stuff that I use. I opted for NaturalReader, which came with only one voice (Mike). NaturalReader uses the AT&T Truevoice technology, which I know nothing about. However, judging by the 500 MB voice file I had to download, I assume that it is not actually synthesising the voice, but probably has some sort of lookup table for words or word parts.

Anyway, I’ve only been using it for about 4 hours and so far it has been pretty good. I don’t think it is as good as what Apple is about to come out with, but I believe it to be comparable. Maybe I will put up a sample, but in the meantime you can listen to Mike on the NaturalReader sample page.

So! No more rereading over emails, word docs, and entries that I write 1000 times over. I just get the machine to read it back once or twice, make sure it makes sense, and send! I’m a happy chap.