Smashing the monopoly

DISCLAIMER: This site is a mirror of the original, which was once available at http://iki.fi/~tuomov/b/

This article will continue on the path begun in "Vanhassa vara parempi", but instead of simply praising old things, we will look into the past for inspiration for the future. More specifically, we will look at ways to fuse the good old command line with graphical user interfaces, on which the defective WIMP paradigm currently has a monopoly, although it need not be so.

So, consider a shell, such as bash, running in a terminal emulator, such as xterm. All you see is text and perhaps some special characters. But wouldn't it be nice to display graphics for certain things? What if uptime or a similar command could, in addition to the load averages, display a graph of the CPU usage for the past few minutes? If dd could display a graph or bar depicting disk usage? If it were possible to display caution sign images for error conditions? Or, more generally, to cat an image to view it? And all this conveniently within the "terminal stream" instead of popping up new windows.

This is not a new idea, I should note. Some variants of the w3m and Links web browsers in fact have kludges to display images over the terminal, but since this isn't coordinated with the terminal emulator, it is not difficult to guess that the result isn't very good. XMLterm can also potentially support all kinds of graphics within the terminal. I do think it takes the wrong approach, however: a bloated and over-complicated one that isn't even general enough. The terminal software shouldn't support complex XML languages or even HTML.

Instead, I propose that in the case of X11, the terminal emulator should simply enable displaying arbitrary XIDs (X windows and bitmaps/pixmaps) within the terminal stream, in two modes. These would be the inline and display modes, to borrow TeX math mode terminology. In inline mode, the XID could take the height of a single line and a variable number of columns (but obviously not wrap to another line). XIDs in display mode would take multiple lines.
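To make the two modes a little more concrete, here is a minimal Python sketch of how a terminal emulator might work out how many character cells an insert occupies. Everything in it is an assumption made for illustration: the names CellSize, inline_columns and display_cells are hypothetical, as is the choice to clamp an inline insert to the columns left on the line rather than let it wrap.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CellSize:
    """Pixel size of one character cell (hypothetical helper type)."""
    width: int
    height: int


def inline_columns(px_width: int, cell: CellSize, columns_left: int) -> int:
    """Columns used by an inline insert: one line high, never wrapping."""
    wanted = max(1, math.ceil(px_width / cell.width))
    # Assumption: an insert wider than the rest of the line is clamped.
    return min(wanted, columns_left)


def display_cells(px_width: int, px_height: int, cell: CellSize) -> tuple[int, int]:
    """(columns, rows) used by a display-mode insert spanning several lines."""
    cols = max(1, math.ceil(px_width / cell.width))
    rows = max(1, math.ceil(px_height / cell.height))
    return cols, rows
```

With an 8x16 pixel cell, a 100 pixel wide inline insert would round up to 13 columns, and a 100x40 pixel display insert would take 13 columns and 3 lines.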

The benefit of this approach is, obviously, that the terminal emulator can be kept quite simple and still support all the old escape code functionality without much trouble, yet the terminal can display arbitrary content that isn't limited by the capabilities of any particular data format. Even interactive content is easily possible, because the actual drawing is handled by external programs, and the terminal simply keeps track of these "inserts". The protocol for doing this can itself be quite simple too. Just a few new terminal escape codes are needed: one code to query terminal font size and colour information, et cetera, and another to request inserting a given XID. The rest can simply be the relevant parts (resize/move, close) of the ICCCM.
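As a rough illustration of how small the "insert this XID" request could be, the Python sketch below encodes and decodes one such escape code. No such code exists in any terminal emulator that I know of: the OSC number 7777, the field layout and the function names are all invented for this example, and the query code for font size and colours is left out.

```python
ESC = "\x1b"
BEL = "\x07"
OSC_INSERT = 7777  # hypothetical OSC number, not an existing standard
MODES = ("inline", "display")


def encode_insert(xid: int, mode: str) -> str:
    """Build the (invented) escape sequence a program would write to the tty."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"{ESC}]{OSC_INSERT};insert;{mode};{xid:#x}{BEL}"


def decode_insert(seq: str):
    """Terminal side: return (mode, xid), or None if seq is not an insert request."""
    prefix = f"{ESC}]{OSC_INSERT};insert;"
    if not (seq.startswith(prefix) and seq.endswith(BEL)):
        return None
    mode, _, xid_text = seq[len(prefix):-1].partition(";")
    if mode not in MODES:
        return None
    try:
        return mode, int(xid_text, 16)
    except ValueError:
        return None
```

A program that has created a window with ID 0x2a00007 would write encode_insert(0x2a00007, "inline") to its terminal; the emulator would then place that window into the stream and track it as an insert.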

Things get even more exciting if we combine the above ideas with a shell and tools with typed pipes (and arguments). Something like MSH, although I would do all the details completely differently. Immutable ADTs (belonging to various type classes), not objects, are more suitable for piping. Objects have mutable state and identity; ADTs as such don't. The .NET API is horrible bloat; I want to stick to something more unixy (i.e. simple and compact) yet also more functional for the basics. Shell piping is, after all, in many ways similar to lazy functional programming, and to Haskell's IO monad as well. H4sh is indeed a nice hack bringing FP tools to the *nix shell. And, yeah, speaking of Haskell, I want strong static typing with (parametric) polymorphism for the shell as well.
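The flavour of piping immutable values through lazy stages can be hinted at even in Python, although it has none of the static guarantees wished for above. In this sketch frozen dataclasses stand in for ADTs and generators stand in for lazy pipes; LoadSample, samples, where and select are hypothetical names, and the load figures are made up.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class LoadSample:
    """An immutable value: equality is by content, there is no mutable state."""
    minute: int
    load: float


def samples() -> Iterator[LoadSample]:
    """A 'command' producing typed records instead of text (made-up data)."""
    for minute, load in enumerate([0.2, 1.5, 0.9, 2.4]):
        yield LoadSample(minute, load)


def where(pred: Callable[[A], bool], items: Iterable[A]) -> Iterator[A]:
    """Lazy filter stage, like grep but on typed values."""
    return (x for x in items if pred(x))


def select(fn: Callable[[A], B], items: Iterable[A]) -> Iterator[B]:
    """Lazy map stage, like cut but on typed values."""
    return (fn(x) for x in items)


# Rough equivalent of: samples | where load > 1.0 | select minute
busy_minutes = select(lambda s: s.minute, where(lambda s: s.load > 1.0, samples()))
```

Nothing is computed until busy_minutes is consumed, much as a shell pipeline only does work as its reader pulls data through.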

In any case, supposing such a shell, upon receiving typed data from a command the shell should check whether the data type implements a graphical display routine (i.e. belongs to a suitable type class) and execute the routine (possibly another command on the file system) for displaying it graphically in the terminal. Otherwise it should check if the type implements a textual display routine, and so on.
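The lookup order described above can be sketched as follows. Python's runtime-checkable protocols are only a loose stand-in for type classes, and the check here happens at run time rather than statically; GraphicalDisplay, TextualDisplay, present and the two example types are hypothetical names, and the XID is a placeholder value.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class GraphicalDisplay(Protocol):
    def display_graphical(self) -> int:
        """Draw the value somewhere and return the XID to insert."""
        ...


@runtime_checkable
class TextualDisplay(Protocol):
    def display_text(self) -> str:
        ...


def present(value: object) -> str:
    """What the shell would do with a value arriving at the end of a pipe."""
    if isinstance(value, GraphicalDisplay):
        return f"insert XID {value.display_graphical():#x}"
    if isinstance(value, TextualDisplay):
        return value.display_text()
    return repr(value)  # last resort: a generic textual dump


@dataclass(frozen=True)
class Image:
    path: str

    def display_graphical(self) -> int:
        return 0x1C00002  # placeholder; a real routine would create a window

    def display_text(self) -> str:
        return f"[image {self.path}]"


@dataclass(frozen=True)
class Uptime:
    load: float

    def display_text(self) -> str:
        return f"load average: {self.load:.2f}"
```

Here an Image would be shown graphically even though it also has a textual form, an Uptime would fall back to text, and a plain integer would fall through to the generic dump.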

I'm not done yet. There's one more thing I want to discuss. In "Vanhassa vara parempi" I mentioned that there's a lot to be learned from old DOS software. One of those things is related to all of the above. In DOS, executing an application, even a graphical one, from the command prompt would take over the command prompt, instead of popping up a new window somewhere around the screen. So do many text-based curses and other similar programs in *nix terminal emulators. Additionally, these curses programs allow switching back to the shell by putting them in the background with ^Z or another binding. I like this behaviour very much. It's why I prefer using text editors that run in the terminal emulator to X text editors that might otherwise have better clipboard support and so on. I'd like applications in X to behave in this manner: to take over the terminal emulator, but allow switching back to it with a simple binding. I'd like all such interactive applications (as opposed to the "terminal stream" applications discussed above) to think they were the only program running on the screen, like those old DOS programs did. Only now the screen would be the terminal emulator, and it could be resized on the fly and so on.

Of course, Ion to a degree emulates this behaviour by putting new windows in the currently active frame, but it isn't perfect. It doesn't really know from which terminal (if any) a window originated, so it can't switch back to it or always put the window in the right frame. The FDO startup notification specification may be a small remedy in the future. But there's another big problem as well: too many programs use multiple windows (per "document") instead of the document-window-as-screen approach. Additionally, shell integration would be useful. For example, the shell's fg command could switch to a window of the program being brought to the foreground.

As we can see, there are many ways in which the distinction between graphical user interfaces and the command line can be removed, and therefore both improved. WIMP does not have to have a monopoly on graphics.
