I've written previously about the general problems with the application-centric view of software. Here I'm going to discuss a specific example. First, some background, from my earlier post:
Applications are bad enough in that they trap potentially useful building blocks for larger program ideas behind artificial barriers, but they fail at even their stated purpose of providing an 'intuitive' interface to whatever fixed set of actions and functionality its creators have imagined. Here is why: the problem is that for all but the simplest applications, there are multiple contexts within the application and there needs to be a cohesive story for how to present only 'appropriate' actions to the user and prevent nonsensical combinations based on context. This becomes serious business as the total number of actions offered by an application grows and the set of possible actions and contexts grows. As an example, if I just have selected a message in my inbox (this is a 'context'), the 'send' action should not be available, but if I am editing a draft of a message it should be. Likewise, if I have just selected some text, the 'apply Kodachrome style retro filter' action should not be available, since that only makes sense applied to a picture of some sort.
These are just silly examples, but real applications will have many more actions to organize and present to users in a context-sensitive way. Unfortunately, the way 'applications' tend to do this is with various ad hoc approaches that don't scale very well as more functionality is added--generally, they allow only a fixed set of contexts, and they hardcode what actions are allowed in each context. ('Oh, the send function isn't available from the inbox screen? Okay, I won't add that option to this static menu'; 'Oh, only an integer is allowed here? Okay, I'll add some error checking to this text input') Hence the paradox: applications never seem to do everything we want (because by design they can only support a fixed set of contexts and because how to handle each context must be explicitly hardcoded), and yet we also can't seem to easily find the functionality they do support (because the set of contexts and allowed actions is arbitrary and unguessable in a complex application).
Today I was forced to edit a Microsoft Word document, containing comments (made by me, and by others) and tracked changes. I found myself wanting to delete all comments, and accept all tracked changes. It took a few minutes to figure out, and I very quickly gave up trying to actually discover the functionality within Word's actual UI and resorted to using Google. God help me if I wanted to, say, delete only comments made by me within the last ten days.
This problem isn't at all unique to Word, Word just happens to be a large application, and like all large apps it has no cohesive story for organizing all its functionality. There are menus with hundreds of entries, and toolbars. Of course, the designers of Word have tried to create some reasonable taxonomy for grouping these functions, but on some level the location of any one function is arbitrary and unguessable. It's the same thing in any complex application: Eclipse, IntelliJ, Photoshop, Illustrator, and so on, which is part of the reason why there's an entire subdivision of the publishing industry devoted to books explaining how to get things accomplished with these applications. Every single complex app is like a foreign language, with its own unique and arbitrary vocabulary and grammar, which must simply be memorized in order to become productive.
Nowadays, I think people try to write simpler applications than Word, with less functionality. This makes the UX problem more tractable while throwing the baby out with the bathwater. I want to be able to do complex things when I interact with some piece of software, I just also want all this complex functionality to be actually discoverable! Furthermore, it should be as obvious how to assemble functionality that didn't previously exist, using existing functions.
Type systems solve exactly this problem. Here's a sketch of how a type directed version of Word could work.
- I click on a comment. A status bar indicates that I have selected something of type
Comment. Now that I have a handle to this type, I then ask for functions of accepting a
List Comment. The delete comment function pops up, and I select it.
- The UI asks that I fill in the argument to the delete comment function. It knows the function expects a
List Commentand populates an autocomplete box with several entries, including an
all commentschoice. I select that, and hit
Apply. The comments are all deleted.
- If I want, I can insert a filter in between the call to
all commentsand the function to delete those comments. Of course, the UI is type directed--it knows that the input type to the filtering function must accept a
Comment, and prepopulates an autocomplete with common choices--by person, by date, etc.
What I (most likely) don't want is full text search through the documentation. That is a much too sloppy way of querying for the functions I want, and it's not nearly as seamless as the above.
I'm not necessarily advocating any particular presentation of the above interaction, though that is an interesting UX problem to solve. I'm just saying, this is the sort of discoverable, pleasing, logical interaction I want to have with software I use. The goal of software is to provide a kind of programming environment, and we ought to use the powerful tools of programming--namely type systems and type directed editing, to organize and unify that functionality that software allows users to interact with.