W3C stops standardization of the declarative format for application and user interfaces (about time!)

Yay! The W3C has canned the work on the Declarative Format for Applications and User Interfaces (DFAUI), putting an end to something that had no way of ever finishing. Of course, you have probably never heard of the DFAUI because the WAF WG never published any documents about it. The idea was to standardize an XML language similar to XAML or OpenLaszlo... but instead, what the WAF WG got was an input from Nexaweb called XAL. Anyway, the people who were supposed to be editing the document never got very far, and as far as I am concerned, the work they produced was of fairly low quality (that’s not to say my work doesn’t suck!).

These are my random thoughts on how I think the DFAUI should have been standardized... and why it failed.

Foundations for a Declarative Language for User Interfaces

Although it is possible to describe user interfaces and the interaction design styles in a colloquial language, the disciplines of Interaction Design and User Interface Design have well-understood theoretical and practical foundations including:

Cybernetics: stemming from the mathematical understanding of communication theory and feedback and control theory,
Visual Design: stemming from print design, color theory, aesthetics, semiotics (semantics and communication) theory,
Multimedia/Hypermedia Design: stemming from literary theory (hypertext), film theory, animation, sound design and motion graphics, and electronic publishing,
Sound Design: stemming from music and audio composition,
Architecture: which provides structure, theories of form and function, and design patterns,
Software engineering: the actual means through which user interfaces can be realized,
Ergonomic Studies: sometimes also applied, as is the case with, for example, Fitts’s law (“the time to acquire a target is a function of the distance to and size of the target” [TUG]).
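
Of everything in this list, Fitts's law is the one item that comes with a formula, so here is a rough sketch of it. It assumes the Shannon formulation commonly used in HCI work, MT = a + b * log2(D/W + 1). The function name is my own, and the constants a and b are made-up illustrative values (in practice they are fitted from user trials for a given device).

```python
import math


def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted time (in seconds) to acquire a target.

    Shannon formulation: MT = a + b * log2(distance / width + 1).
    a and b are device-specific constants; the defaults here are
    illustrative assumptions only, not measured values.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    index_of_difficulty = math.log2(distance / width + 1)
    return a + b * index_of_difficulty


# A small, far-away button should take longer to hit than a big, close one.
far_small = fitts_movement_time(distance=800, width=20)
near_large = fitts_movement_time(distance=100, width=100)
print(f"far/small: {far_small:.3f}s, near/large: {near_large:.3f}s")
```

The practical upshot for interface design is the usual one: make frequently used targets bigger and put them closer to where the pointer already is.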

The effectiveness of a user interface is generally evaluated through the fields of Usability and User-Experience Design; however, both Usability [Nielsen] and User-Experience Design [Garrett] lack a ‘design language’ with which to design and build multi-modal user interfaces. The fields that make possible the practice of user interface design employ design processes that are iterative and user-centered so as to maximize communication and make user interfaces as usable (or effective) as possible. In addition, these design practices provide designers with the freedom of self expression and the means (tools and language) to effectively achieve their communicative goals. In this context, communication is “the full process by which the behavior of one goal-seeking entity comes to be affected by that of another through the reciprocal exchange of messages or signs over some mediating physical channel” [Mullet].

It is generally accepted that there has been continuous growth in the market for portable and multi-modal devices, and as a result new interface paradigms are emerging that challenge the traditional means of interacting with data. The modes of interaction afforded by these new devices often challenge the traditional user interface metaphors that users experience in desktop applications. These new interface paradigms offer innovative modes and modalities through which users interact with information. For example, Apple’s iPod has a tactile wheel control that also provides auditory feedback, the upcoming Nintendo Wii has a gesture recognition system, and the Nintendo DS makes advanced use of direct manipulation through a pointing device. Note that the focus here on entertainment and gaming devices is intentional, as they often represent the state of the art in effective user interface design by providing highly efficient user experiences that are both engaging and aesthetically pleasing. The sophistication of these user interfaces is achieved through the skillfully designed, fine-grained coordination of multiple modalities over time. In other words, images and sounds are skillfully coordinated over time, in response to the user’s actions, to create a successful multimodal experience.

I now provide a brief overview of the fields listed above and explain their relationship to interface design in more detail.

Cybernetics

Cybernetic theory has its foundations in the formal mathematical model of communication as defined by Shannon and Weaver. The field deals specifically with the concepts of feedback and control. The mathematical model of feedback and control is easily understood: an agent attempts to change the state of a system by manipulating some variable(s), and the amount of change that the agent desires is commonly referred to as the ‘Goal’. As the agent changes a variable, the system provides feedback to the agent, allowing him or her to ‘Control’ the level of change.

A simple example of cybernetic theory in action

A person is driving a car at 50km/h (let speed be variable x = 50). Her goal is for the car to accelerate to 60km/h, so she presses down on the accelerator pedal (while(x < 60){ x++ }). As the pedal moves downwards, more fuel is injected into the car’s engine, causing a greater power output that changes the speed of the car (hence x++). However, as the car accelerates it produces feedback through a number of modalities: audibly she can hear the car accelerating, visually she can see the car’s speedometer changing, and kinesthetically she can feel the inertia and gravitational forces on her body as the car accelerates. If the car over-accelerates (ie. x > 60), she can hear, sense, and see the changes through the various modalities, so she can then ‘control’ the car’s speed by easing off the accelerator pedal, or by applying the brake pedal (while(x > 60){ x-- }). In simple terms, the agent reaches a desired goal by systematically controlling a system through actions that produce perceivable feedback in the system.
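
The two while loops in the driving example can be folded into a single feedback loop. This is only a toy sketch: the function name, the one-km/h step size, and the step limit are my own assumptions, and a real car obviously does not change speed in neat integer steps.

```python
def drive_to_goal(speed, goal, max_steps=100):
    """Adjust speed one km/h at a time until it matches the goal.

    Each pass through the loop is one feedback cycle: perceive the
    current speed, compare it with the goal, then act on the pedals.
    Returns the list of speeds observed along the way.
    """
    history = [speed]
    for _ in range(max_steps):
        error = goal - speed      # feedback: how far off are we?
        if error == 0:
            break                 # goal reached, nothing left to do
        if error > 0:
            speed += 1            # press the accelerator (x++)
        else:
            speed -= 1            # ease off or brake (x--)
        history.append(speed)
    return history


print(drive_to_goal(50, 60))  # accelerating from 50 up to 60
print(drive_to_goal(63, 60))  # [63, 62, 61, 60]
```

The point of the sketch is that the agent never sets the speed directly; she only acts, observes the result, and corrects.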

The same feedback and control mechanisms apply when moving a mouse around the screen. When a user wants to click on a button, he pushes the mouse in the direction of the button. The computer interactively updates the display to reflect the changes in the mouse position in relation to the physical movement of the mouse on a surface. As the user watches the feedback that is the representation of the mouse on the screen, he is able to “control” the speed of the mouse by decreasing the amount of physical force applied. Eventually, the user reaches his goal. However, if feedback from the system is too slow and the representation of the

Visual Design

Mullet states that “Visual Design attempts to solve communication problems in a way that is at once functionally effective and aesthetically pleasing.”

Semiotics is the general study of signs, of which Semantics is a branch (Seead). Semiotics forms the foundation on which communication can be understood. The basic unit of communication is the sign, which has two parts: the signifier (the presentation of the sign on some medium) and the signified (what the sign communicates to the reader). Signs are always found on some medium with the purpose of communicating something. Signs are composed of three distinct parts:

the medium: or the ‘channel of communication’, for example marks on paper rely on the interaction between light and the contrast with the markings made by the author,
the modes: the communicative choices the author has when using a particular medium for communication. For example, for plain paper, available modes include colors, lines, markings and pictures, textures, symbols, and even smell. However, temporal or auditory modes of communication are unavailable to the medium of paper.
the modalities: the sensory modalities that the reader uses to perceive the sign (eg, visually through the eyes, through touch as would be the case with Braille, or through olfaction if the paper had been perfumed).

Because personal computers primarily use screens (emitted light) as their medium of communication, they must rely on the representation of other mediums to convey their messages (eg, photographic, three-dimensionality, document-based). The medium chosen for communication affects the way the message is understood by a reader. In semiotics, the combination of signs (media, modes, and modalities) is formally known as a ‘text’. Things get more technical from there and are beyond the scope of this document.

The field of Semiotics has its roots in formal logic and the philosophy of language, but encompasses all forms of communication. Semiotics also provides a means to understand how time can be used in a communication process. Semiotics, however, cannot be used independently to create communication as it lacks a design language; it can only be used as a means to study and predict how people will interpret the semantics of signs. Thus, semiotics can be used to iteratively evaluate a communication process to make communication more effective, particularly because of semiotics’ reliance on understanding the socio-cultural contexts in which signs are created and consumed (what we commonly refer to as internationalization and localization of designs). Semiotics has a more sophisticated theoretical foundation than both Usability and User-Experience Design for describing human-computer interactions and has been widely applied in the literature of Human-Computer Interaction (Caceres, User interface design dudes).

Graphic Design

Just as written and spoken languages have a generally understood syntax and grammar, so does the visual arrangement of elements on some medium (a graphic). Graphic design theory is a set of principles whose aim is to maximize communication through the structured arrangement of semiotic signs. The arrangement forms both a syntax and a grammar that have been studied for thousands of years, but particularly since the invention of print and later with the establishment of electronic publishing in the previous century (see in particular Knuth).

Architecture

The applications of Architecture are generally well understood. The idea of design patterns was originally coined by architect Christopher Alexander et al. in the 1970s. The idea of a design pattern is to catalog a proven design solution to a particular design problem in such a way that the description of the solution is applicable to many situations without being overly prescriptive or restrictive (as can be the case with guidelines or templates).

Interaction Styles

Command: command given as direct input into the computer
Forms
Direct manipulation: directly manipulating the representation of an object (eg. drag and drop, icons)
Menu

The need for a Critical Language

Although it is possible to describe the elements that make up a graphical interface by the function that a group of controls plays in an interface, such an approach is limited because it does not fully describe the interaction process. In other words, simply identifying interface elements (menu, button, radio button) does not fully capture the human-computer interaction in terms of input behavior, binding to a data model, eventing, presentation, styling, metadata and navigation (Raman, 2003). To fully describe the interaction process, it is necessary to ask at least:

Input behavior:
- what means of interaction are presented to the user to change the model?
- or what means of interaction are afforded by the device? (eg, direct manipulation, menu, command line input)
- how does the input behave as it interacts with the user? (for example, does it stop receiving input after a certain number of characters have been entered?)
- what means of control does the system afford the user in relation to feedback? (eg. speed of the mouse and the rate at which the screen updates, time before another key stroke is displayed on the screen while the user is holding down a key).
Binding to a data model:
- what is the goal of the user or the motivation for interaction? ie, what part of the data model do they wish to change or capture? (eg. save a file)
- what is the conceptual model that the user interface is allowing the user to change? (eg an address book, the set of spells that a character in a game can cast when under attack, etc)
- what is the actual model that the user interface is allowing the user to change? (eg. a database table, a document model)
- what is the computation model that the user is changing? (eg, a thread informs the renderer to update portions of a screen buffer as a user “scrolls” down a page)
Eventing:
- what events are triggered by user interaction? (eg. onclick)
- what events are triggered by the system? (eg, onload)
- how do those behaviors affect the data model? (eg, event phases and level of granularity or propagation)
Presentation:
- what medium, modes and modalities are utilized by the device to provide feedback to the user? (eg, a sonic chime when the file is saved successfully, the sound of paper being crushed when the recycle bin is emptied)
- what declared stylistic rules go into forming the representation of the interface component? (eg, css rules or XSL:FO for print)
- what are the stylistic restrictions imposed by the limitations of a particular device? (eg, monochrome display does not allow the use of color as a mode of communication, but may allow the use of shades and texture instead)
- what are the stylistic restrictions imposed by a particular device because of security concerns? (eg, always displaying ‘*****’ in place of passwords, or not reading passwords out loud as is the case when interacting with ATMs or password fields in html forms)
Metadata:
- what semantics can be extracted from the underlying model and content? (eg. the data type for this input field is of type date)
- what semantics can be extracted from the interaction layer? (eg, element can be ‘dropped’ onto other elements, but does not accept textual input)
- and from the presentational layer? (eg. in a localized Australian context, speak the month after the day: “today is the 20th of October”, but not in a USA context: “today is October 20th”.)
Navigation:
- has the author imposed a navigational structure on the interface? (eg. steps 1 and 2 must be completed one after the other)
- does the interface provide the user with a non-linear interaction structure? (for example, by using the Accordion interface design pattern?)
- when a user interacts with an interface element, must particular conditions be met before feedback is displayed? (eg. searching for a contact in an address book)

From an interaction design perspective, the above questions form what in computer science is commonly referred to as the ‘Model-View-Controller’ (MVC) design pattern (Gamma et al.) (see also Wikipedia for a detailed description of the MVC pattern). MVC is well understood within the Working Group, and MVC forms the foundation of the XForms specification. XForms describes the interaction process in terms of inputs, outputs, a model and triggers that manipulate some given abstract data model, and mechanisms to transmit data to and from the server.
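
For anyone who has not run into MVC before, here is a bare-bones sketch of the separation it describes. This is not XForms and not anyone's actual API; the class names, the address book example and the 20-character limit are all assumptions made up for illustration.

```python
class AddressBookModel:
    """Model: holds the data and notifies listeners when it changes."""

    def __init__(self):
        self.contacts = []
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def add_contact(self, name):
        self.contacts.append(name)
        for listener in self._listeners:
            listener(self)


class TextView:
    """View: one possible presentation of the model (plain text)."""

    def __init__(self):
        self.rendered = ""

    def render(self, model):
        self.rendered = ", ".join(model.contacts) or "(empty)"


class Controller:
    """Controller: turns user input into changes on the model."""

    MAX_LENGTH = 20  # input behavior: reject overly long names

    def __init__(self, model):
        self.model = model

    def submit(self, raw_text):
        name = raw_text.strip()
        if not name or len(name) > self.MAX_LENGTH:
            return False  # invalid input never reaches the model
        self.model.add_contact(name)
        return True


model = AddressBookModel()
view = TextView()
model.subscribe(view.render)  # eventing: a model change re-renders the view
controller = Controller(model)

controller.submit("  Ada  ")  # accepted after trimming whitespace
controller.submit("")         # rejected, the model is untouched
print(view.rendered)
```

Because the view only ever hears about the model through the subscription, a speech or Braille view could be swapped in without touching the model or the controller, which is the device-independence argument in miniature.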

XForms also describes the behavior of input types in relation to the model through interaction events, data-typing and error checking. XForms also defines a mechanism by which inputs can behave as outputs. Hence, XForms provides the elemental primitives with which complex interactions with dynamic data models can be declared by an author. XForms’ power is in its flexibility and high degree of abstraction, accessible design, and device independence. XForms also defines structuring mechanisms for user interfaces that can help structure how the user navigates a user interface. T. V. Raman describes the work done on XForms in the following way:

“The primary goal as we overhauled HTML forms from 1993 was to make the next generation Web forms technology robust with respect to accessibility, internationalization, and the ability to deliver user interaction to a variety of modalities and end-user devices. Looking forward to a ubiquitous Web that would be accessible using devices and modalities best suited to a user’s needs and abilities at a given time meant that we had to step back from commonly held notions of visual interaction and create an abstract user interface vocabulary capable of withstanding the test of time.”
“Addressing the need to create an abstract user interface vocabulary had the advantage of contributing directly to our primary goals of accessibility. Once we freed the user interface vocabulary from fixed ideas driven purely by visual interaction, we were able to leverage the separation of the XForms model from the user interface to the fullest degree in creating a set of controls that could be easily re-targeted to different user interaction environments. Cascading Style Sheets (CSS) is now a mature Web technology, and this has made it significantly easier for the XForms working group to defer stylistic and presentational issues to the CSS layer. Having separated out presentation by using CSS, we used the XML binding to DOM Events provided by XML Events to factor out interaction behavior of the various controls. Thus, the XForms user interface design separates content, presentation, and interaction to create an XML vocabulary that lends itself to intent-based authoring of user interaction.”

Although logical from an engineering standpoint, the explicit separation of content, presentation, and interaction is seen by some in the web community as counterintuitive and confusing. The traditional approach for authors has been to consider the user interface elements available in HTML, collect the required data from users through these interface components, and then have the user post the data back to the server for processing. In addition, XForms has also been criticized for:

its reliance on XML for the data model
being overly declarative
developers considering XML to be ‘bloaty’
DOM interfaces being hard to work with (hence the creation of workarounds like the handy innerHTML property).

Regardless of these criticisms, XForms provides a foundation built on a proven design pattern (MVC) that has been used in practice by software engineers for approximately 30 years. XForms (together with its dependent technologies and pluggable design) provides the base primitives and structuring mechanisms needed by authors to realize almost any kind of user interface.

Given that we now have a language for describing user interfaces, we can begin an analysis of the interface components of various applications. It is not possible in most cases to understand the underlying computational models of software applications, but it should be relatively easy to understand the conceptual models from a user’s perspective.

Understanding the UI space: the competition to a DFAUI

XAML
Macromedia MXML (Flex 2)
XForms
XUL
AJAX Frameworks (Prototype, Dojo, script.aculo.us)

Directions away from HTML forms

XAML Approach:
- generated XML spaghetti nightmare from hell.
- Tremendously difficult to read or write and probably maintain (requires an IDE).
- Does not separate content, structure, style, behavior, data binding (all mashed together)
- Defines its own styling language
- But apparently, one can make pretty powerful stuff because it uses the .NET API
- Provides ready made desktop UI components (slider, progress bar, checkbox, etc)
Macromedia Approach:
- XML Elements map to classes in an API (Flex and ActionScript3), or user defined classes
- Heavily dependent on programming
- Leverages CSS (although implementation is fairly poor) and ECMAScript
- Powerful yet simple binding and extension mechanisms
- Provides ready-made interface components for re-skinning (button, accordion, input fields, etc)
- Very powerful network and hypermedia capabilities and ready made interface components
XForms approach:
- abstracts and separates UI logic, model, presentation, and behavior
- Clean and simple (when compared to XAML and XUL) declarative language
- Pluggable and leverages CSS, XPath, XMLEvents, Namespaces
- RESTful approach
- Device independence
- Lacks API and higher level interface abstractions
XUL:
- Geared towards desktop application development
- Leverages ECMAScript and CSS
- Provides an extensive set of ready made interface components and associated APIs

DFAUI – our current state

XAL proposal seems to want to compete with established UI languages, but is already years behind in implementation and sophistication
TID-Mobile is better but kinda sits in the same boat
DFAUI use cases seem to be following XAL by re-inventing the wheel
So far, we only have really self evident requirements:
- must use CSS
- work with events
- must be device independent
the requirements and use cases are simply leading to requirements that will be identical to XForms

Why DFAUI Failed:

Monumental task of defining all possible UI elements: consider how long it took to build all the competing languages (even XAL has been undergoing development for years).
lack of interest in WG
lack of interest from implementors (they’ve already got way better solutions and have invested lots of $$$,$$$,$$$ in them)
not really doing anything new

If we want our stuff to be adopted, we obviously need to make use of what has already been implemented; it should slot in with as little effort as possible; and it should be at least comparable to, if not an order of magnitude better than, what is already out there.

Building on XForms: Standing on the Shoulders of Giants

XForms already resolved complex interface modeling problems through MVC
Leads developers to an effective software engineering solution (MVC)
but MVC may not lead to an effective interface design solution:
- comes with some UI design patterns built in: hint, help, error
- comes with grouping patterns
- comes with sequential patterns
- comes with data typing entry control (pattern)

Why build on top of XForms instead of other languages

“Pemberton blasted the scripting approach taken in Web Forms 2.0, saying it doesn’t scale well, is harder to maintain, doesn’t address industry requirements and use cases, and doesn’t provide the ability to take snapshots of each step in a forms-based process for sensitive industrial or governmental applications…” (Festa, 2005)

“XForms is not a Web standard,” said Brendan Eich, a founding member of Mozilla in charge of technical direction, the creator of JavaScript and a member of WHAT-WG. “It’s a relatively new spec seeing early-adopter use in intranets” (Festa, 2005)

Maybe they both suck?
XForms MAY be the way forward (with the useful bits from Web Forms 2.0 added in).
However, XForms is still very high.
It will be hard to compete with established frameworks like Dojo, Prototype, etc.
XForms is more sophisticated than these frameworks, but not (yet) a widely available web standard.

DFAUI – what could it actually be?

Consider the emergent growth of open-source Ajax frameworks as the case study for the DFAUI:

“The DOM and JavaScript have enabled a huge amount of experimentation that have helped us discover useful design and interaction patterns on the Web; however, if we now fail to move these understood concepts to a declarative means that obviate programming, I would assert that we will largely fail to discover the next set of abstractions.” (comment made by T.V. Raman on Mark Birbeck’s blog)

Web API WG is already covering the non-declarable aspects.
We could investigate what user interface design patterns are emerging (or have already been defined) and attempt to capture them in a standardized way.
Yahoo! UI is already doing this, but for Ajax.
Martijn van Welie and others have been capturing UI patterns for years

Towards an eXtensible User Interface Patterns language (XUIPL)

Build higher-level, user-interface-centric patterns using XForms primitives, but allowing them to maintain their device independence, accessibility, and adherence to MVC
Make the language extensible so developers can define and share their own patterns (à la semantic web ontologies)
Overcome some of the complexity of XForms by providing ready-made, real-world solutions to UI problems
Builds on XForms as a foundation, reinforcing its applicability as a suitable language to build powerful and usable web applications
Through XUIPL, patterns can be ‘natively’ supported by XForms thus making them easier for novices and professionals to use
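
To make the list above a bit more concrete, here is a purely hypothetical sketch of what "patterns built from primitives" could look like. None of this is XUIPL syntax (there is none) and none of it is an XForms API: the registry, the decorator and the dictionary layout are invented for illustration. Only the control names group, input and trigger are borrowed from XForms.

```python
# Hypothetical registry: pattern name -> function that expands into primitives.
PATTERNS = {}


def pattern(name):
    """Decorator that registers an expansion function under a pattern name."""
    def register(func):
        PATTERNS[name] = func
        return func
    return register


@pattern("wizard")
def wizard(steps):
    """Expand a step-by-step 'wizard' into grouped inputs plus triggers."""
    primitives = []
    for number, fields in enumerate(steps, start=1):
        children = [{"control": "input", "ref": field} for field in fields]
        primitives.append(
            {"control": "group", "label": f"Step {number}", "children": children}
        )
        # The last step gets a "Finish" trigger, every other step a "Next".
        label = "Next" if number < len(steps) else "Finish"
        primitives.append({"control": "trigger", "label": label})
    return primitives


def expand(name, *args):
    """Look up a registered pattern and expand it into primitives."""
    if name not in PATTERNS:
        raise KeyError(f"no pattern registered as {name!r}")
    return PATTERNS[name](*args)


ui = expand("wizard", [["name", "email"], ["address"]])
for item in ui:
    print(item["control"], item["label"])
```

Anyone could register a new pattern the same way the wizard is registered here, which is the extensibility point: the set of patterns is open, while the set of primitives they expand into stays small and device independent.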

XUIPL: A solution-driven language for User Interfaces, based on proven UI design patterns that lead to better user interface designs and that run on a proven software engineering foundation.

The approach I am proposing gives us a new noble direction that empowers UI designers, not just programmers
Has the potential to make a language that is an order of magnitude more usable and applicable than XForms alone
Is open and extensible, allowing new patterns to emerge and be used by the web community... hopefully speeding up the adoption of XForms
Does not interfere with the development of XForms, but provides more use cases for XForms and for CSS

From an ‘Application’ development perspective, other minor things that XUIPL could also add to make it more attractive:

Unit testing and logging module, making it easier to build and test web apps
applications could still be packaged, with the aid of the widget 1.0 spec

References

Mark Birbeck. “On Adobe and XForms via Declarative Programming, Wizards and Aspects”. September 04, 2005. Available at: http://internet-apps.blogspot.com/2005/09/on-adobe-and-xforms-via-declarative.html

Paul Festa, “Fight over ‘forms’ clouds future of Net applications” ZDNet News: February 17, 2005. Available at: http://news.zdnet.com/2100-9588_22-5581106-4.html?tag=st.num

See also: http://www.welie.com/patterns/literature.html for a good listing of literature related to UI patterns.

UI Pattern Collections

Notes

TV Raman says:

* In this context, the key to XForms and XHTML2 is *not* to either obviate or decry imperative/script-based programming; rather, it’s primary goal is to liberate today’s script authors from the minutiae of writing yesterday’s code to focusing on the next level of innovation.

* The DOM and JavaScript have enabled a huge amount of experimentation that have helped us discover useful design and interaction patterns on the Web; however, if we now fail to move these understood concepts to a declarative means that obviate programming, I would assert that we will largely fail to discover the next set of abstractions.

2 thoughts on “W3C stops standardization of the declarative format for application and user interfaces (about time!)”

ProDFAUI says:

September 20, 2007 at 6:25 pm

Marcos, you know what happened in the WAF WG, and how we experienced the boicot from people … like you, that sold yourself to the HTML5 stuff and browser vendors.

The only reason why the DFAUI progress in W3C were browser vendors, and its lack of resources for building new technology … don’t confuse people.

And by the way the work you mention as low quality was never finished or published, so you don’t have the right to criticize it and moreover, you were part of the WG that tried to produce something.

Marcos Caceres says:

September 24, 2007 at 2:47 pm

Yes, admittedly, I am a HTML-5 sellout:) And yes, to be fair, there were only two “resources” working on the DFAUI. Nevertheless, I still think the approach taken to develop the technology was wrong.
