Concrete UI Musings (my longest yet)

Wed Feb 19 08:04:03 CET 2003

[To get to the meat, skim down to "Level 0:"]

While the recent discussions on the UI are interesting, there
are so many abstract viewpoints I had difficulty relating them
to each other.  I thought about it a little and realized that a lot
of what I have done and am doing is really UI but I've not
realized it since I'm ostensibly building a "server".  The
discussion below is a feeble description of something concrete.
Tell me why it is wrong, what it can't do, and how you would do
it differently.

Here are the things I'm involved in that could do with a new UI:

1) A static website with instructional info
2) A calendar of events website which changes monthly and
includes photos
3) A dynamic user registration system with optional
demographic forms, validation and password login
4) A graphical applet for playing dominoes / cards
and chatting (IM)

Other things I'm thinking about building:

1) Weblogs for individuals to use from their browser
2) Poetry website for individual contributors, but others
can be editorial commentators
3) Web service server that extracts and merges content
from other websites on demand
4) Palm PDA-like functionality in a flexible database for
a browser or handheld
5) Integration of browser, email, chat so there is one place
for all

All these features are assumed to be provided by a server
(either my remote server, or the user's own PC / handheld
used as a server) which directs what to display and how to
display it.  I've been focussing on the database and the
server and all the mungeable bits, while building ad hoc
HTML or applet front ends for everything.

What I realized from the beginning was that how I store the
data drives all the capabilities that I can provide easily.  What
I didn't realize until recently is that my thinking has been
colored (collared?) by the available UIs, shaping the data into
ways that are slightly less than ideal.

Ideal:

1) Data entered in snippets with no formatting and
little labelling
2) Data strung together in bits to produce documents
or similar structured content
3) Websites strung together from documents and / or data
4) #2 and #3 being static, dynamic, or both
5) Style sheets and other descriptive ways of formatting 1-4
6) Dynamically differing views of the same data: summary,
detail, graphical, other
7) Editing and publication of #1-6 with automatic page
generation on future date / time
8) Recordings of email / chat or other exchanges with
search and visualization
9) Simple 2D / 3D graphical construction of multi-user,
multi-modal interactions (game rooms)
10) Active mouse / keyboard interaction that does not
have to use existing windowing look and feel
11) Variable size display without dramatic effort in
reformatting content
12) Brandable surfaces / spaces for advertising sponsorships
13) Animation via pre-caching of content while user
is occupied
14) Lightweight network transport
15) Prefer no software installation (I can hedge this one)

Things I have not considered (and probably won't
for a few years):

1) Printing documents, photos or other things
2) User-customizable surfaces / spaces (although #11
above may allow this)
3) Allowing programmers to build application interfaces
4) Serious interactive UIs such as "twitch" games or
scientific visualization

Most of my effort in the past has gone into building
content for websites and developing applets without
using any of the Swing or windowing system.  I
desperately want cut and paste, drop and forget,
contribution of content to websites that automatically
organize and publish themselves according to a style
that I define ahead of time.  For example, I want to
copy an email about a concert date and drop it in a
form.  The data automatically appears on the website
calendar on the future date linked into the structure
properly.  This would be based on text snippets plus
dynamic stylesheets and some hand markup for
special links, etc.  This is the same capability that
bloggers have when they start discussions about links
or just drop a personal note in an HTML form.
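
A minimal sketch of the "drop and forget" idea, in Python purely for
illustration.  The Calendar class, its drop/page methods and the sample
concert text are all hypothetical names invented for this example; the
only point is that a snippet carries its own publication date and the
page is regenerated from whatever is due.

```python
from datetime import date


class Calendar:
    """Hypothetical store of text snippets that publish themselves later."""

    def __init__(self):
        self._snippets = []  # list of (publish_on, event_date, text)

    def drop(self, text, event_date, publish_on):
        # "Drop and forget": store the raw snippet with no formatting.
        self._snippets.append((publish_on, event_date, text.strip()))

    def page(self, today):
        # Regenerate the calendar page from every snippet that is due.
        visible = [
            (event_date, text)
            for publish_on, event_date, text in self._snippets
            if publish_on <= today
        ]
        return ["%s: %s" % (d.isoformat(), t) for d, t in sorted(visible)]


cal = Calendar()
cal.drop("  Concert at the town hall\n", date(2003, 3, 15), date(2003, 3, 1))
print(cal.page(date(2003, 2, 19)))  # nothing is due yet
print(cal.page(date(2003, 3, 1)))   # the concert entry is now listed
```

Formatting is deliberately left out; in the scheme described above it
would come from a stylesheet applied when the page is generated.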

On the applet front I wanted small applets (I have
compromised with 100K size right now) with interactive,
graphical and dead easy UIs.  I achieved this just using
a bare canvas, making rectangular layers and tracking
the mouse over areas which activated from offscreen
buffers.  It performs reliably on multiple machine types
without the enormous lag or unpredictability of the normal
Java UI (this has much improved with DSL and 2GHz
chips, but the lack of uniformity with MS Java VM will
soon be a very big problem).

My thinking on an erlang UI splits the capability into levels,
attempting to simplify each of them.  I know nothing of the
capabilities of graphics cards so this is all based on old
technology and the Java animation techniques that have
so far served me OK:

Level 0: Display bitmaps

This is the hardcore low-level mapping of memory to screen.
There may be some optimization based on hardware, but
for the most part it provides only raw memory, offscreen
and onscreen buffers, and a mechanism for switching
among them.

Level 1: Merging of bitmaps

This is where Z-layers are created to obscure each other,
where alpha-blending and other graphical tricks occur, and
where clipping regions and other such niceties get worked
out before blitting to Level 0.  This may be coupled with
Level 0 to take advantage of hardware acceleration
intelligently.
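
A rough sketch of what Level 1 does, in Python purely for illustration.
Everything here is an assumption made for the example: a bitmap is a
list of rows of (r, g, b, a) tuples with alpha between 0.0 and 1.0, and
the function names (blank, blend, composite) are invented.  Layers are
sorted by Z, clipped to the frame, and blended with the usual "source
over" rule before the result would be handed to Level 0.

```python
def blank(width, height, pixel=(0, 0, 0, 0.0)):
    """Create an offscreen buffer filled with one pixel value."""
    return [[pixel for _ in range(width)] for _ in range(height)]


def blend(src, dst):
    """'Source over' alpha blend of pixel src on top of pixel dst."""
    sr, sg, sb, sa = src
    dr, dg, db, da = dst
    out_a = sa + da * (1.0 - sa)
    if out_a == 0.0:
        return (0, 0, 0, 0.0)

    def channel(s, d):
        return round((s * sa + d * da * (1.0 - sa)) / out_a)

    return (channel(sr, dr), channel(sg, dg), channel(sb, db), out_a)


def composite(width, height, layers):
    """Merge layers into one frame.

    layers is a list of (z, x, y, bitmap); lower z is drawn first, so
    higher layers obscure lower ones.
    """
    frame = blank(width, height)
    for _z, ox, oy, bitmap in sorted(layers, key=lambda layer: layer[0]):
        for row_index, row in enumerate(bitmap):
            y = oy + row_index
            if not 0 <= y < height:
                continue  # clip rows that fall outside the frame
            for col_index, pixel in enumerate(row):
                x = ox + col_index
                if 0 <= x < width:  # clip columns outside the frame
                    frame[y][x] = blend(pixel, frame[y][x])
    return frame


red = blank(2, 2, (255, 0, 0, 1.0))
blue = blank(2, 2, (0, 0, 255, 1.0))
# The blue layer sits above the red one and is partly off screen.
frame = composite(2, 2, [(1, 1, 1, blue), (0, 0, 0, red)])
print(frame[0][0], frame[1][1])
```

A real implementation would do this in the low-level language and on
whatever hardware is available; the sketch only shows the ordering,
clipping and blending steps.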

Level 2: I/O facilities

Mouse events, keyboard events, pen taps, beeps, and the
like are built here.  Any low-level physical buttons or serial
comms are handled here too.

Level 3: Display <==> User interaction

This layer provides a mapping between low-level display
buffers and user devices.  Regions of bitmaps are designated
as "active" and mapped to particular interaction devices.  This
layer is directly controllable by the application and is at a
sufficient level to build simple graphical mouse-active screens,
or focusable regions for keyboard input.  This layer is strictly
for linking events, not for display of characters.  Think of it as an
empty behaviour that allows the connection of discrete events
to discrete display areas.

Level 4: Raster Image Processing

Display of characters, fonts, graphical elements, and so on.
This is where typical drawing occurs, as in draw-string, draw-arc,
shade area, etc.  This level should allow any variety of Modules
to plug-in and provide a basic RIP capability.  It should be a
library which requires a higher-level scripting or orchestration
to do complex layout.

Level 5: RIP Library Modules

Users can supply Modules, each of which defines a reasonably
high-level interface to the underlying RIP capability.  This is
essentially equivalent to a widget library.

Level 6: GUI languages

A collection of Modules might define an iconography, a page
description language, a specially rendered consistent environment.

Level 7: Windowing Systems, Skins, Scripting environments

Manipulation of GUI languages to develop coherent systems,
user or programmer configurable spaces, or dynamic reactive
applications.

Levels 0-2 are about on equal ground, implemented in a
low-level fast language with destructive capabilities for efficient
use of mapped buffers and the like.  Think of these levels the
way you think of an ets table -- an efficient tool to have access
to, who cares what language as long as the interface is
erlang native.

Level 3 provides the protocol for user / screen interaction; a
framework that allows the programmer to drop in erlang
processes or data structures to manage the connections.  To
draw a button use Levels 0 & 1; to make the button mouseover
active or clickable with sound feedback, create a structure
linking Level 2 actions to Level 0 & 1 regions.  Define the
user interactions independent of the content; get the server /
client protocol right without worrying about the messages
exchanged.  Most of Level 3 could be user-provided, whereas
Levels 0-2 are "manufacturer" provided.  gen_fsm or
gen_event type functionality might be useful here.
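
The linking structure described for Level 3 might look something like
the sketch below.  It is written in Python only for illustration, and
Region, InteractionMap, bind and dispatch are hypothetical names, not
part of any existing library.  Note that nothing here draws anything:
it only connects an event at a screen position to whatever handler
owns the topmost active region under that position.

```python
class Region:
    """A rectangular "active" area of a bitmap on some Z-layer."""

    def __init__(self, name, x, y, width, height, z=0):
        self.name = name
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.z = z

    def contains(self, px, py):
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


class InteractionMap:
    """Links Level 2 events to Level 0/1 regions; does no drawing."""

    def __init__(self):
        self._bindings = []  # list of (region, event_type, handler)

    def bind(self, region, event_type, handler):
        self._bindings.append((region, event_type, handler))

    def dispatch(self, event_type, px, py):
        hits = [
            (region, handler)
            for region, bound_type, handler in self._bindings
            if bound_type == event_type and region.contains(px, py)
        ]
        if not hits:
            return None  # the event landed on an inactive area
        # The region on the highest Z-layer receives the event.
        region, handler = max(hits, key=lambda hit: hit[0].z)
        return handler(region, px, py)


ui = InteractionMap()
ok_button = Region("ok", 10, 10, 40, 20)
ui.bind(ok_button, "click", lambda region, x, y: "clicked " + region.name)
ui.bind(ok_button, "mouseover", lambda region, x, y: "highlight " + region.name)

print(ui.dispatch("click", 15, 15))      # clicked ok
print(ui.dispatch("mouseover", 15, 15))  # highlight ok
print(ui.dispatch("click", 0, 0))        # None
```

In the erlang setting the handlers would more naturally be processes
receiving messages than callbacks; the callbacks here just keep the
sketch short.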

Level 4 allows the building of form elements, button labels, etc.
At this level you can create the behaviours that describe a
text-input box, a beep-when-it-clicks-button, and other complex
screen controls.  You have to do some work to worry about all
the interactions, but you have complete control over liveness,
response, and other features in a way that windowing systems
don't allow.  Datatype checking could be enforced here by
correlating user events (e.g., keystrokes) with context (e.g., the
receiving process being a numeric text box).  gen_server is
more the model for generic behaviour at this level.
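
As a small illustration of the datatype checking mentioned above, here
is a sketch of a numeric text box that holds its own state and decides
per keystroke whether to accept it.  It is Python rather than erlang,
and the class name, the key method and the backspace convention are
all assumptions made for the example.

```python
class NumericTextBox:
    """Keeps its own text and rejects keystrokes that break the type."""

    ALLOWED = "0123456789."

    def __init__(self, max_length=10):
        self.text = ""
        self.max_length = max_length

    def key(self, char):
        """Handle one keystroke; return True if it was accepted."""
        if char == "\b":  # backspace always succeeds
            self.text = self.text[:-1]
            return True
        candidate = self.text + char
        if len(candidate) > self.max_length:
            return False
        if not self._is_numeric_prefix(candidate):
            return False  # the caller could beep here
        self.text = candidate
        return True

    @classmethod
    def _is_numeric_prefix(cls, text):
        # Optional leading minus, digits, and at most one decimal point.
        body = text[1:] if text.startswith("-") else text
        return body.count(".") <= 1 and all(c in cls.ALLOWED for c in body)


box = NumericTextBox()
accepted = [box.key(c) for c in "12a.5."]
print(accepted)  # [True, True, False, True, True, False]
print(box.text)  # 12.5
```

The state-plus-message-handler shape is loosely what a gen_server
callback module would provide, with each keystroke arriving as a call.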

Level 5 is the first level where you might be tempted to define an
object (and probably the main place where objects of any kind
would reside).  Raw windowing system elements are collected
and defined in a Module to use at this level.

Level 6 is where PDF, PostScript, or Tcl/Tk might occur.  Roll your
own language and use Module:Functions supplied in Level 5.

Level 7 is the top where IDEs and other WYSIWYG tools are
built.

Avoiding "objects" until Level 5 is both good and bad.  The difficulty
that object hierarchies solve is the efficient organizing of related
screen elements so that they can move together over the screen
(move the root node and everything automatically comes along
without computation -- however, screen redraw is a relative process).
Here, by avoiding the hierarchy, we end up with cooperating
processes that can interact in more complex ways than the rigid
hierarchy enforced by a menu object, for example, at the expense
of blitting bitmap layers as collections of related graphical elements
(in essence flattening the object hierarchy into a bitmap on a Z-layer).

The stack can be replaced at the top or at the bottom.  If you want
to output to paper, swap what the RIP talks to.  When I want to
generate an HTML interface I can have the RIP Library for forms
and pages output HTML rather than graphics commands.  Likewise
I can define a gameroom Module at Level 6 that I can use across
several types of games but still retain a similar look-and-feel.
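
The swap described here can be sketched as one form description fed to
two interchangeable back ends, one emitting HTML and one emitting
drawing commands.  This is Python for illustration only; FORM, the two
backend classes, the render function and the command tuples such as
("draw_string", ...) are all made up for the example.

```python
import html

# One form description, independent of how it ends up being shown.
FORM = [("label", "Name"), ("text_input", "name"), ("button", "Send")]


class HtmlBackend:
    """Form library that outputs HTML instead of graphics commands."""

    def label(self, text):
        return "<label>%s</label>" % html.escape(text)

    def text_input(self, name):
        return '<input type="text" name="%s">' % html.escape(name)

    def button(self, text):
        return "<button>%s</button>" % html.escape(text)


class DrawCommandBackend:
    """Form library that outputs hypothetical Level 4 drawing commands."""

    ROW_HEIGHT = 20

    def __init__(self):
        self._y = 0

    def _next_row(self):
        y = self._y
        self._y += self.ROW_HEIGHT
        return y

    def label(self, text):
        return ("draw_string", 0, self._next_row(), text)

    def text_input(self, name):
        return ("draw_rect", 0, self._next_row(), 120, 16, name)

    def button(self, text):
        return ("draw_button", 0, self._next_row(), text)


def render(form, backend):
    """Walk the form and let the chosen backend produce the output."""
    return [getattr(backend, kind)(arg) for kind, arg in form]


print(render(FORM, HtmlBackend()))
print(render(FORM, DrawCommandBackend()))
```

The form description never changes; only the object passed to render
does, which is the sense in which the stack is replaceable at the
bottom.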

jay