Reforming the GUI monoculture

DISCLAIMER: This site is a mirror of the original, which was once available at http://iki.fi/~tuomov/b/

This article is a continuation of my treatise on the past, the status quo and an unlikely future of computer user interfaces, begun in "Smashing the monopoly" and to some degree in "Vanhassa vara parempi".

While "Smashing the monopoly" discussed some ways to improve the terminal and shell and to close the gap between the text terminal and graphical user interfaces, this story takes the opposite approach. It concentrates on closing the gap from the other direction, and on other reforms of the WIMP GUI paradigm that would make it more keyboard-operable and, as a consequence, less cluttered for mouse operation as well.

Perhaps the biggest problem with the WIMP paradigm is that it tends to produce what I'd call, for lack of a better word, highly nonlinear designs. To demonstrate this, take almost any dialog that has more than two widgets in almost any WIMP program. Look up the active widget, supposing it has one and the program supports even a limited form of keyboard navigation. Now pick what you consider the "next" widget after it. Hit "Tab". Did the widget you guessed become activated? Unless you were familiar with the peculiarities of that particular dialog, I'd wager it didn't. Simple one-dimensional "linear" keyboard navigation just isn't good enough for a two-dimensional layout. To make keyboard navigation predictable and thus usable, you either have to linearise the layout by making it one-dimensional, or you have to add an extra dimension to keyboard navigation.

However, navigation keys for all four primary directions aren't enough if the layout doesn't make it possible to decide relatively unambiguously which widget, or group of widgets, should be considered to be to the left of, to the right of, above or below another. That is, the layout has to consist of linear slices in both coordinates, for example columns of vertically stacked widgets. The primary example of a highly nonlinear and cluttered layout is, of course, the conventional window management scheme, or the desktop paradigm. Ion, on the other hand, enables managing windows in a more linear layout. It does allow for what would seem very unpredictable layouts as well, but in Ion such layouts are entirely the user's decision, not the programs'.
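The idea of a layout made of columns of vertically stacked widgets, navigated with four direction keys, can be sketched roughly as follows. This is only an illustration under my own assumptions: the class name `ColumnLayout`, the widget names and the rule of clamping the row when moving to a shorter column are all hypothetical, and are not taken from Ion or any toolkit.

```python
from dataclasses import dataclass


@dataclass
class ColumnLayout:
    """A layout made of columns, each a vertical stack of widget names."""
    columns: list
    col: int = 0  # index of the focused column
    row: int = 0  # index of the focused widget within that column

    def focused(self):
        return self.columns[self.col][self.row]

    def move(self, direction):
        """Move focus one step; focus stays put at the edges of the layout."""
        if direction == "up":
            self.row = max(self.row - 1, 0)
        elif direction == "down":
            self.row = min(self.row + 1, len(self.columns[self.col]) - 1)
        elif direction in ("left", "right"):
            step = -1 if direction == "left" else 1
            new_col = min(max(self.col + step, 0), len(self.columns) - 1)
            # Clamp the row, since the neighbouring column may be shorter.
            self.row = min(self.row, len(self.columns[new_col]) - 1)
            self.col = new_col
        else:
            raise ValueError(f"unknown direction: {direction}")
        return self.focused()


# Hypothetical dialog: a column of input fields and a column of buttons.
layout = ColumnLayout([["name", "email", "phone"], ["ok", "cancel"]])
print(layout.move("down"))   # focus moves within the first column
print(layout.move("right"))  # focus moves to the neighbouring column
```

Because every widget belongs to exactly one column, "left", "right", "up" and "down" each have at most one sensible target, which is what makes the navigation predictable.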

In some cases, however, such a stacked-columns approach in two dimensions does not quite work. Sometimes you have to do the column-stacking in the third dimension. Such is the case with the menubar. To begin listing the problems of the menubar widget: first of all, the menubar is navigated with the same keys as the submenus, but there's no way to specifically return to the main menubar. Instead, the menubar can only be navigated when a non-submenu entry is selected in a menu. This, however, is not directly related to the unsuitability of the stacked-columns approach in 2D, and could be fixed within the confines of the menubar approach. I, however, think that there should not be two types of menus, the menubar and drop-down menus. It suffices to drop the menubar, stack the menus in the Z dimension, and simply provide a key or button to display the main menu. This is what many programs did in the DOS era, and what games still often do, although the layout is not always so linear anymore. Ion also uses this approach, although recently menus were experimentally replaced with queries (textual input of the same old menu entries as commands). In fact, a combination of menus and queries would be an interesting and likely fruitful experiment: a sort of deep typeahead find for menus.
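A "deep typeahead find for menus" could look roughly like the following sketch. It assumes the menu is a nested dictionary whose leaves are command names; the menu contents and the function names `flatten` and `typeahead` are made up for illustration and do not describe how Ion implements its menus or queries.

```python
def flatten(menu, path=()):
    """Yield (path, command) pairs for every leaf entry of a nested menu."""
    for label, entry in menu.items():
        if isinstance(entry, dict):   # a submenu: descend into it
            yield from flatten(entry, path + (label,))
        else:                         # a leaf: entry is the command name
            yield path + (label,), entry


def typeahead(menu, typed):
    """Return the paths of all entries, at any depth, whose label starts with the typed text."""
    typed = typed.lower()
    return [" > ".join(path) for path, _ in flatten(menu)
            if path[-1].lower().startswith(typed)]


# Hypothetical menu tree.
MENU = {
    "File": {"Open": "open", "Save": "save",
             "Export": {"PDF": "export_pdf", "PNG": "export_png"}},
    "Edit": {"Paste": "paste", "Preferences": "prefs"},
}

for match in typeahead(MENU, "p"):
    print(match)
```

Typing more characters narrows the list, so an entry buried several submenus deep can be reached without navigating through the menus above it.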

The menubar is not so usable with the mouse either, because the menu items tend to be small, and aiming at small widgets is one of the worst parts of mouse use. The purportedly Fitts's law based approach of MacOS, which places the focused window's menubar at the top of the screen, isn't essentially any better, since horizontally the menu entries are still small (or take a lot of screen space). If, on the other hand, some corners of the screen simply contained a menu-activation button or area, it would be quite easy to hit, and the actual menu entries could be much bigger.
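To put rough numbers on the effect of target size, here is a small sketch using the common Shannon formulation of Fitts's law, in which the index of difficulty is log2(D / W + 1) for a target of width W at distance D. The pixel values are invented for illustration, and the formula only considers the width along the direction of movement.

```python
import math


def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts's law: ID = log2(D / W + 1), in bits."""
    return math.log2(distance / width + 1)


# Hypothetical sizes in pixels; the pointer starts 600 px from the target.
small_menubar_item = index_of_difficulty(600, 40)
large_menu_entry = index_of_difficulty(600, 200)

print(small_menubar_item, large_menu_entry)
```

Under these assumed numbers, the 40 px menubar item is twice as "difficult" in bits as the 200 px entry. Predicted movement time grows linearly with this index, with constants that depend on the device and the user.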

Let us now return to the mention of textual input of commands, as there's a lot to be said about that. You see, I don't think keyboard usability is just about having keyboard shortcuts to everything. Even the name "shortcut" implies that there's something fundamentally wrong with the primary way the UI was intended to be used. The linearisation of the layout that was already discussed is, of course, part of such a design. But simply being able to predictably navigate deeply-nested dialogs and menus is not always enough. Neither are shortcuts, as they can be difficult to remember, and there can be a lot of them. I think command names can often be the easiest to remember (easier than the location of a widget or a key binding) for things that are used relatively often, but not all the time. After all, we think in terms of verbs and commands. A command may also be easier to type than a complex escape-meta-alt-control-shift key combination, or than looking up a widget for the action, by mouse or by keyboard. That's one of the reasons I prefer LaTeX over word processors. (The others are that I don't like WYSIWYG, instead preferring a more what-you-see-is-what-you-mean approach; and all the same reasons that apply to almost any WIMP program.)

Indeed, although it uses the WYSIWYG paradigm, TeXmacs is quite interesting in that it supports LaTeX-like commands in the input stream. Simply type '\' (backslash) followed by the name of the command and enter, and it will query for the parameter of the command. (Unfortunately in a WYSIWYG manner. I'd prefer to type the whole command before it is rendered, and to use space instead of enter.) For example, typing \section followed by enter will start a new section, and what is typed next will be the name of that section. I'd like such an approach to be used much more.
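The variant preferred above, where the whole command is typed first and a space rather than enter separates the command name from its parameter, might be sketched like this. It is not how TeXmacs works internally; the function `parse_line` and the command set are hypothetical.

```python
def parse_line(line, commands):
    """Classify one typed line as ('command', name, argument) or ('text', line)."""
    if line.startswith("\\"):
        # The first space ends the command name; the rest is its parameter.
        name, _, argument = line[1:].partition(" ")
        if name in commands:
            return ("command", name, argument.strip())
    # Anything else, including unknown commands, is ordinary text.
    return ("text", line)


COMMANDS = {"section", "emph"}  # hypothetical set of known commands

print(parse_line("\\section Introduction", COMMANDS))
print(parse_line("Just some ordinary text.", COMMANDS))
```

Nothing is rendered until the whole line has been entered, so the user never leaves the typing position or the plain input stream.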

Regarding key bindings for often-used actions, there's also the fact that the bindings of most WIMP GUIs tend to be quite limited and do not make good use of the easily accessible Control+Key combinations. Instead they require one to use the arrow, home, end and other keys far away from the typing position of the hands. In fact, WIMP GUIs tend to waste perfectly useful key combinations on something used very rarely, such as Control-P for print (instead of doing the same as the up arrow), and often do not even provide convenient ways to configure anything.

Finally, to make keyboard use more attractive, programs should advertise their key bindings more, so that people can easily learn them, or look them up if they don't. I don't mean a manual or anything of the kind. It should be possible to keep the currently relevant bindings always visible. Needless to say, good old DOS programs tended to do this more than programs do today. This is not only a problem of WIMP GUI apps, but also of more traditional Unix terminal programs. When I started using Linux back in the mid-90s, the only powerful editor that I wasn't overwhelmed by was joe, and the number one reason for this was that it has a very convenient help screen, and the binding that toggles it on is advertised on the statusbar.
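Keeping the currently relevant bindings visible does not take much. The following sketch assumes bindings are stored in a plain table per mode, which also makes them easy to reconfigure; the modes, keys and action names are invented for illustration and are not joe's actual bindings.

```python
# Hypothetical binding tables, one per mode; users could edit these freely.
BINDINGS = {
    "edit": {"C-p": "previous-line", "C-n": "next-line", "C-s": "save"},
    "menu": {"C-p": "previous-entry", "C-n": "next-entry", "Enter": "activate"},
}


def status_line(mode, bindings=BINDINGS):
    """Render the bindings relevant to the current mode as one status line."""
    return "  ".join(f"{key} {action}" for key, action in bindings[mode].items())


print(status_line("edit"))
print(status_line("menu"))
```

Because the status line is generated from the same table that drives the key handling, it cannot drift out of date when the user rebinds a key.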

Of course, one can ask why such extensive keyboard support is needed at all. Isn't the keyboard an outdated method of input, with mice and, say, voice recognition being the future? As for voice recognition, well, you can't use it everywhere, so clearly other methods of input would still be needed. I also don't think it is applicable to everything. Neither are the keyboard and the mouse applicable to every application. You can't conveniently type this story with a mouse, and some applications simply need some kind of pointing device, programs that are used to create graphics in particular. However, even in such cases a clever combination of the keyboard and the pointing device can often be the best (known) approach, as evidenced by e.g. Blender and the prevailing input systems of first-person shooter games.

I think the two most important points speaking for keyboard-oriented design are:

Extensive mouse usage tends, in my personal experience, to be more wearing on the wrists than keyboard usage.
The inconvenience of switching the input device when working with textual data (such as this story, computer code and so on), and the efficient access to almost all functionality that the keyboard provides.

Some people may define efficiency here in terms of productivity: fast keyboard access to all the functionality they need enables them to get more work done. But I'm not one to endorse such a definition. Instead, I define an interface as efficient if it minimises the time I have to spend interacting with it, and in the long term, no less. I think keyboard-orientedness as outlined above best provides such efficiency in most applications.
