Eye Tracking Research & Applications

Symposium 2008

Keynote Address


Shumin Zhai
Research Staff Member
IBM Almaden Research Center
San Jose, CA 95120

Shumin Zhai is a Research Staff Member at the IBM Almaden Research Center. He has published over 100 research papers, received numerous patents, contributed to three IBM Research Division Accomplishments, and led IBM product innovations. His work has been widely reported in the news media, including the New York Times and BBC News. He is on the editorial boards of Human-Computer Interaction, ACM Transactions on Computer-Human Interaction, and other journals. He has been a visiting professor and has lectured at various universities in the US, Europe, and China. He earned his Ph.D. degree at the University of Toronto. In 2006, he was named to ACM's inaugural class of Distinguished Scientists.


On the Ease and Efficiency of Human-Computer Interfaces

Shumin Zhai

IBM Almaden Research Center

Abstract

Ease and efficiency are two critical qualities of human-computer interfaces. In this keynote address, I will first examine various factors that contribute to these two qualities, including recognition and recall, open- and closed-loop control, controlled and automatic processes [Schneider and Shiffrin, 1977; Shiffrin and Schneider, 1977; Schneider and Chein, 2003], mapping directness [Hutchins et al., 1985], unit of operation, and chunking.

Ideally, an interface is both easy and efficient to use. However, the two qualities are often at odds with each other. In a competition between a familiar, hence easy, method and a more efficient but unfamiliar method, many forces work against the new and efficient method, which requires an upfront investment of time in learning. At the individual level, these forces include the discounted present value of future time, loss aversion, the status quo bias, the endowment effect, and other human decision biases [Kahneman and Tversky, 1979; Kahneman et al., 1991]. At the societal level, the path dependence of socio-technical system evolution, also known as Qwertynomics [David, 1985], likewise favors a familiar but suboptimal solution. A common example of the bias towards easy but inefficient methods of interaction is the underuse of keyboard shortcuts in place of pull-down menus, even when gentle visual reminders are triggered by the menu pull-down action [Grossman et al., 2007]. The cost of prolonging easy but not necessarily efficient interfaces is increasingly evident as common graphical user interfaces become more complex. Indeed, efficiency concerns give command line interfaces a chance of revival, if they are integrated properly with GUIs [Norman, 2007].
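The role of the discounted present value of future time can be illustrated with a toy calculation. The sketch below assumes simple exponential discounting, which is only one possible model and is not taken from the cited work; the function name, the learning cost, the weekly saving, and the discount factors are all hypothetical values chosen for illustration.

```python
def net_present_value(upfront_cost, saving_per_period, discount, periods):
    """Discounted value of future time savings minus the upfront learning cost.

    Assumes exponential discounting: a saving t periods away is
    weighted by discount ** t (an illustrative model, not from the source).
    """
    future = sum(saving_per_period * discount ** t
                 for t in range(1, periods + 1))
    return future - upfront_cost


# Hypothetical numbers: 600 s to learn a shortcut set, 30 s saved per week,
# evaluated over 52 weeks.
patient = net_present_value(600, 30, discount=0.99, periods=52)
impatient = net_present_value(600, 30, discount=0.90, periods=52)

print(f"patient user:   {patient:+.0f} s")
print(f"impatient user: {impatient:+.0f} s")
```

Under these assumed numbers the learning investment has a positive net value for the user who discounts the future lightly and a negative one for the user who discounts it heavily, even though the undiscounted savings are identical in both cases.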

It is possible to build interface mechanisms that push users towards efficient alternatives. For example, the frosty interfaces studied in [Cockburn et al., 2007] deliberately obstruct easy visual recognition, forcing the user either to brush off the frost cover in order to use visual guidance or to mentally recall the interface object location or action trajectory. Either way, the active effort of retrieving past information facilitates learning [Schmidt and Bjork, 1992]. However, if the interface is made too hard, the user may become frustrated and give up altogether [Cockburn et al., 2007].

From a user interface research and design point of view, the best way to accommodate both ease and efficiency is to bridge the easy, elemental, recognition-based steps to more proficient, chunked, recall-based operation. Designing such bridges requires creativity and may have to be domain specific, but a key is articulation constancy between the beginner behavior and the "expert" behavior [Kurtenbach and Buxton, 1991; Kurtenbach et al., 1994]. Such a bridge can be a step function, as in Marking Menus for command entry [Kurtenbach and Buxton, 1994; Kurtenbach et al., 1994]. A marking menu has two distinct modes. In the "novice" mode, a pie menu pops up after a fraction-of-a-second delay set by the system. In the "expert" mode, the user has memorized the angular marks from one level of a pie menu to the next and simply gestures ahead without paying the penalty of waiting for the delayed pie menu pop-up. The bridge from beginner behavior to proficiency can also be gradual, as in ShapeWriter, designed for off-the-desktop text and command input [Zhai and Kristensson, 2003; Kristensson and Zhai, 2004; Zhai and Kristensson, 2006]. ShapeWriter always displays a graphical keyboard as an "input map". In the beginning, the user traces a word from one letter key to the next on the keyboard with a stylus or finger, and the system uses pattern recognition to identify the intended word. The tracing process is initially completely visually guided by the keyboard. With use, the whole or parts of the word patterns are memorized, so the gesture strokes become increasingly recall driven. Here the degree of visual guidance and memory recall is a continuum. Whether and how intently to rely on keyboard guidance depends on the user's level of proficiency. In view of the frosty interface study [Cockburn et al., 2007], a forced step function bridge as in marking menus is probably faster at pushing the user towards automaticity but is limited to a smaller number of entries. Applying such bridges to more demanding tasks may frustrate the user too much. A gradual transition approach as in ShapeWriter may be better suited to a larger input vocabulary such as text input, but slower in pushing the user to totally recall-driven behavior.
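The idea of tracing a word across the keyboard and letting pattern recognition pick the intended word can be sketched as a nearest-template match. The code below is a deliberately simplified illustration and not the published ShapeWriter recognizer: the key coordinates, the fixed resampling to 32 points, the mean point-to-point distance, and the names `KEYS`, `resample`, `template`, and `recognize` are all assumptions made for this example.

```python
import math

# Hypothetical key centers on a unit-grid QWERTY layout
# (the row offsets are assumptions, not a real device geometry).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
OFFSETS = [0.0, 0.5, 1.5]
KEYS = {ch: (col + off, float(row))
        for row, (letters, off) in enumerate(zip(ROWS, OFFSETS))
        for col, ch in enumerate(letters)}


def resample(points, n=32):
    """Return n points evenly spaced along the polyline through `points`."""
    seg = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(seg)
    if total == 0:  # single point or a trace that never moves
        return [points[0]] * n
    out, i, walked = [], 0, 0.0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = (target - walked) / seg[i] if seg[i] else 0.0
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def template(word):
    """Ideal trace for a word: the polyline through its key centers."""
    return resample([KEYS[c] for c in word])


def recognize(trace, lexicon):
    """Pick the lexicon word whose template is closest to the trace on average."""
    sampled = resample(trace)

    def cost(word):
        pairs = zip(sampled, template(word))
        return sum(math.dist(p, q) for p, q in pairs) / len(sampled)

    return min(lexicon, key=cost)


# A slightly sloppy trace passing near the keys t, h, e.
sloppy_trace = [(4.1, 0.1), (5.4, 0.9), (2.2, 0.1)]
print(recognize(sloppy_trace, ["the", "tie", "toe", "she"]))
```

Because the match depends only on the overall shape and position of the stroke, the same recognizer serves both a beginner who traces slowly under visual guidance and a practiced user who draws the stroke from memory, which is the continuum described above.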

The goal of this talk is to stimulate discussion on novel interface research and development in consideration of these two interface qualities. From this perspective, I will review some recent eye-tracking-based user interface examples, including MAGIC pointing [Zhai et al., 1999], EyePoint [Kumar et al., 2007], iTourist [Qvarfordt and Zhai, 2005], Dasher [Ward and MacKay, 2002], and EASE input [Wang et al., 2001]. I will also touch on this year's conference theme, Usability and Ubiquity, from an ease and efficiency point of view.

References