Ergonomics and HMI concerns in Mule:

towards intelligent input methods

Didier Verna

The XEmacs Project

<didier@xemacs.org>

April 19, 1999

Abstract

This paper presents some ergonomics and Human-Machine Interaction problems that several input methods (notably the French ones) introduce in Emacs. First, a general overview of the available input methods is given. Then, some ergonomics problems are described. Finally, some ideas are proposed to improve the quality of these input methods.

Introduction

Working with an internationalized text editor means not only that several languages can be displayed, possibly at the same time, but also that those languages can be typed in. In order to ease the process of typing in different languages with a standard keyboard, Mule offers the concept of “input method”, which allows you to enter characters from different alphabets. The process of inputting characters from a traditional keyboard can be particularly complex, notably for large character sets like kanji. Without going that far, we can already identify ergonomics problems with simpler methods like the ones used for inputting French. In this paper, we would like to demonstrate those problems, and propose ideas that could be used to make those input methods more “intelligent”, and thus simpler to use.

Section 1 gives an overview of the different available input methods. Section 2 demonstrates the ergonomics problems that French input methods suffer from. Section 3 proposes some ideas for improving the quality and the ergonomics of these input methods.

Footnote: in this paper, the term “Emacs” refers to any flavor of the software, notably GNU Emacs or XEmacs.

1 Different Kinds of Input Methods

This section provides a short overview of the different kinds of input methods available in Mule for inputting different languages, and gives more details on the Quail French input methods.

1.1 Input methods classiﬁcation

Class 1: key bindings. The first class of input method works by rebinding the keyboard keys to characters from another language. This is possible for small alphabets that can be represented on a normal keyboard. Russian input methods belong to this category. Each character can also be printed on the corresponding key to help find it on the keyboard. As far as French is concerned, note that an input method emulating an AZERTY keyboard on a QWERTY one would belong to this category.

Class 2: key sequences. When there are not enough keys on the keyboard, or when you don’t want to rebind them, some input methods use key sequences to input characters. Latin-1 (notably for French) and Latin-2 input methods belong to this category. The next subsection gives a more detailed overview of the French input methods.

Combination of class 1 and class 2. Some languages such as Thai use a combination of the first and second classes to input characters. In a first step, the keys are bound to vowels, consonants, etc., and a sequence of such characters then produces a composite one.

Class 3: external help. Finally, in more complex cases, some help from an external program might be required. Japanese input methods belong to this category: after using a combination of the preceding cases to input, say, hiragana phonetically, an external program like canna-server proposes possible kanji translations for this word.

1.2 Quail French input methods

Quail provides several French input methods belonging to class 2 as de-

scribed previously. “french-postﬁx” and “french-preﬁx” are the most widely

used. Those input methods typically let you enter a character with an ac-

cent or a cedilla in two keystrokes. In french-preﬁx, you type the symbol

ﬁrst, while in french-postﬁx, you type the letter ﬁrst. Figure 1 shows some

examples of key sequences from those input methods.

    french-prefix            french-postfix
    keys      result         keys      result
    ‘ + a     à              a + ‘     à
    ’ + e     é              e + ’     é
    , + c     ç              c + ,     ç
    < + <     «              < + <     «
    / + c                    c + /
    / + o                    o + /

Figure 1: Examples of Quail French key sequences

Obviously, it is also possible to cancel the key sequence, if you really need the two characters one after the other. Under a french-postfix input method, this is usually achieved by typing the symbol twice. Under a french-prefix input method, the space bar terminates the sequence after the symbol has been typed in. This mechanism is illustrated in figure 2.

    french-prefix                              french-postfix
    keys          result   keys   result       keys    result   keys   result
    ‘ + <space>   ‘        + a    ‘ a          a + ‘   à        + ‘    a ‘
    ’ + <space>   ’        + e    ’ e          e + ’   é        + ’    e ’
    / + <space>   /        + c    / c          c + /            + /    c /
    / + <space>   /        + o    / o          o + /            + /    o /

Figure 2: Cancelling Quail French key sequences
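As a rough illustration of the postfix behavior shown in figure 2, the following Python sketch simulates composition and cancelation by doubling the symbol. It is not Quail's implementation: the rule table is a small hypothetical subset, the function name is ours, and the ASCII characters ` and ' stand in for the grave and acute symbols.

```python
# Hypothetical subset of a french-postfix table: (letter, symbol) -> composed character.
POSTFIX_RULES = {
    ("a", "`"): "à",
    ("e", "'"): "é",
    ("c", ","): "ç",
}


def type_postfix(keys: str) -> str:
    """Simulate typing `keys` under a postfix method with cancel-by-doubling."""
    out = []
    pending = None  # (letter, symbol) of a composition that can still be cancelled
    for key in keys:
        if pending is not None and key == pending[1]:
            # Symbol typed twice: undo the composition, keep letter and symbol.
            out[-1:] = [pending[0], pending[1]]
            pending = None
        elif out and (out[-1], key) in POSTFIX_RULES:
            # Letter followed by its symbol: compose the accented character.
            pending = (out[-1], key)
            out[-1] = POSTFIX_RULES[pending]
        else:
            out.append(key)
            pending = None
    return "".join(out)
```

With this sketch, typing "franc, " yields "franç ", which is the mistake discussed in section 2, while "franc,, " yields the intended "franc, ".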

As you can see, those key sequences are very easy to remember. The symbols used in french-postfix do look like the corresponding accents or cedilla, which makes their use very intuitive. Doubling the symbol in order to cancel the sequence is also rather natural.

2 Cognitive problems in Quail input methods

Despite their apparent simplicity, those input methods actually introduce some cognitive problems that from time to time can make them more difficult to use than it seems at first glance. Figure 3 shows some common mistakes made with those methods. The first one is in a french-prefix context, the others are in french-postfix. Each case shows the sentence that was obtained, and the one that was expected.

    Obtained                            Expected
    Lémpire contre attaque              L’empire contre attaque
    Sois franç ça ne marche pas...      Sois franc, ça ne marche pas...
    Utilisez plutôt ‘setlocalé          Utilisez plutôt ‘setlocale’
    /us local/sr xemacs                 /usr/local/src/xemacs

Figure 3: Common mistakes under Quail French input methods

As you can see, those mistakes are always related to key sequence cancelation. What happens is that the user forgets to cancel the key sequence, either by typing <space> (in french-prefix) or by doubling the symbol (in french-postfix). We think that the reason for these mistakes is that the concept of key sequence cancelation is counterintuitive. As we have already said, mapping symbols to accents is very easy to remember and doesn’t raise any problem. However, cancelling a key sequence means that you knowingly input something wrong, and must stay aware of it, because you are expected to correct it afterwards. For instance, if you want to input a “c” followed by a comma in french-postfix, you must have in mind that you will first input a “c” cedilla and then correct it to what you want by doubling the comma.

Given the way Quail was designed, doubling the symbol to cancel a key sequence is probably the best choice that could be made. What is arguable is not the cancelation method, but the very fact that cancelation is needed. This also holds for the french-prefix method. As a result, we should try to determine how an input method could avoid cancelation. This is the purpose of the next section.

3 Proposed solutions

In order to eliminate the need for a cancelation method, we must accept the fact that the user can type the same two keys in different circumstances with different ideas in mind. For instance, the key sequence <c comma> sometimes means “give me a c cedilla”, and sometimes means “give me a c and a comma”. We should consequently find ways to make the input methods understand the different cases without requiring anything special from the user.

3.1 Static solution: using the context

In the second example of figure 3, the situation happens to be rather simple to correct: in French, a word cannot end with a c cedilla. Consequently, if a space immediately follows the c cedilla, we know for sure that the user actually wanted a c and a comma. This example shows that we could benefit from the “context” of the key sequence, in other words, the characters already present around the insertion point. Quail blindly uses key sequences, without knowing anything about the current context.

Consequently, the first solution we can think of is defining an input method by character-key sequences rather than by key sequences only. Consider the sample specification given in figure 4. This specification means that typing a comma when there is a c just before the insertion point should produce a c cedilla. However, typing a space when there is a c cedilla just before the insertion point should turn it into a c followed by a comma and a space.

    {c} + ,          →  ç
    {ç} + <space>    →  c , <space>

Figure 4: Character-key specification example
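The two rules of figure 4 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the insertion point is always at the end of the buffer, the context is a single character, and the names RULES, press and type_keys are ours, not part of Quail or Mule.

```python
# Character-key rules from figure 4: (context character, key) -> replacement text.
# The replacement substitutes for the context character itself.
RULES = {
    ("c", ","): "ç",    # {c} + ,        ->  ç
    ("ç", " "): "c, ",  # {ç} + <space>  ->  c , <space>
}


def press(buffer: str, key: str) -> str:
    """Return the buffer after pressing `key` with point at the end of `buffer`."""
    context = buffer[-1:]  # empty string when the buffer is empty
    replacement = RULES.get((context, key))
    if replacement is None:
        return buffer + key  # no rule applies: plain self-insertion
    return buffer[:-1] + replacement


def type_keys(keys: str) -> str:
    """Feed a whole key sequence to `press`, starting from an empty buffer."""
    buffer = ""
    for key in keys:
        buffer = press(buffer, key)
    return buffer
```

Typing "Sois franc, c,a ne marche pas" with this sketch produces "Sois franc, ça ne marche pas" without any explicit cancelation: the c cedilla appears briefly after "franc" and is rewritten when the space arrives.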

There is still something not completely satisfactory with this technique: the c cedilla will still be generated in wrong cases, even if only temporarily. Although the user does not have to correct it by hand, it can still be annoying to see it appear. This problem can be partially solved by using more than a single character as the context. For instance, in French, we do not have any word containing the sequence <i r a ç>. Consequently, if we add the rule “{irac} + , → i r a c ,” to the input method specification, we will immediately get the proper characters in that case.
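Extending the context to several characters mostly requires deciding which rule wins when several match. A minimal sketch, assuming that the longest matching context takes precedence (our choice, which the paper does not specify) and using hypothetical names:

```python
# Rules as (context, key, replacement); the replacement substitutes for the context.
RULES = [
    ("c", ",", "ç"),
    ("ç", " ", "c, "),
    ("irac", ",", "irac,"),  # no French word contains "iraç"
]

# Assumption of this sketch: try the longest contexts first.
RULES_BY_LENGTH = sorted(RULES, key=lambda rule: len(rule[0]), reverse=True)


def press(buffer: str, key: str) -> str:
    """Return the buffer after pressing `key` with point at the end of `buffer`."""
    for context, rule_key, replacement in RULES_BY_LENGTH:
        if key == rule_key and buffer.endswith(context):
            return buffer[: len(buffer) - len(context)] + replacement
    return buffer + key


def type_keys(keys: str) -> str:
    """Feed a whole key sequence to `press`, starting from an empty buffer."""
    buffer = ""
    for key in keys:
        buffer = press(buffer, key)
    return buffer
```

Here a proper noun ending in "irac" followed by a comma never shows a temporary c cedilla, while "rec,u" still gives "reçu".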

As we can see, by extending the concept of key sequence to the concept of character-key sequence, which should not be very hard to implement, the current input methods could be considerably improved with respect to the cancelation problems. However, several issues remain problematic:

- There exist cases that cannot be corrected with this method. The next subsection presents some of them.

- Specifying all the possible cases similar to the “irac” example would be an enormous task. We cannot afford it, especially because the user should still have the possibility to customize their key sequences.

3.2 Dynamic solution: relationship with spell checking

One of the problems that the preceding solution cannot solve is ambiguity. This means that at the time a character-key sequence is encountered, it is not necessarily possible to decide which action should be taken. For instance, in French, the sequence “L ’ e n i” is undecidable, because the quote can really be an apostrophe, but it could also be a french-prefix sequence for a word beginning with “L é n i” (there are some). As a consequence, it is only possible to decide what to do after some more characters are typed in. In our example, the next character can be sufficient. For instance, if it is an “f”, the key sequence necessarily represents an “é”, and if it is a “v”, it is necessarily a real apostrophe. Otherwise, the corresponding word does not exist in French.
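The deferred decision described above can be sketched as follows. The sketch keeps both readings of the quote and discards those that no longer match the beginning of any known word. The four-word lexicon is a hypothetical stand-in for a real French dictionary, the function names are ours, and the input is assumed to be lowercase with a single ASCII quote.

```python
# Hypothetical toy lexicon; a real system would query a full French dictionary.
LEXICON = {"lénifier", "lénifiant", "enivrer", "enivrant"}


def is_word_prefix(fragment: str) -> bool:
    """True if some lexicon word starts with `fragment`."""
    return any(word.startswith(fragment) for word in LEXICON)


def readings(keys: str) -> list:
    """Return the plausible readings of `keys`, which contains one quote."""
    head, _, tail = keys.partition("'")
    candidates = []
    # Reading 1: the quote is a real apostrophe, so `tail` starts a new word.
    if is_word_prefix(tail):
        candidates.append(head + "'" + tail)
    # Reading 2: the quote is a french-prefix acute accent applied to an "e".
    if tail.startswith("e") and is_word_prefix(head + "é" + tail[1:]):
        candidates.append(head + "é" + tail[1:])
    return candidates
```

With this lexicon, "l'eni" keeps both readings, "l'enif" leaves only "lénif", and "l'eniv" leaves only the apostrophe reading, which matches the example in the text.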

It is important to notice that when we speak of resolving ambiguity, what we have to do in terms of implementation is actually to look up a dictionary and see whether a given word exists. This process is closely related to the notion of spell checking. This idea of dynamic checking can also be applied to the cases (described in the previous subsection) where the decision could be made immediately. Take again the “i r a c” example described previously. Instead of explicitly specifying the key sequence as proposed in the first solution, we can also look up a dictionary to see whether such a word is possible. This way, we don’t have to specify every possible sequence in the input method: the decision is made by a generic mechanism.
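To make the contrast with explicit rules concrete, here is a minimal sketch in which the comma key consults a word list instead of a hand-written “irac” rule. The lexicon, the function name and the prefix test are assumptions of the sketch; in particular, a real implementation would rely on a spell checker's dictionary.

```python
# Hypothetical toy lexicon standing in for a spell checker's dictionary.
LEXICON = {"français", "reçu", "garçon"}


def press_comma(buffer: str) -> str:
    """Return the buffer after a comma is typed at the end of `buffer`."""
    fragment = buffer.rsplit(" ", 1)[-1]  # the word currently being typed
    if fragment.endswith("c"):
        with_cedilla = fragment[:-1] + "ç"
        # Compose only if some known word can start this way.
        if any(word.startswith(with_cedilla) for word in LEXICON):
            return buffer[:-1] + "ç"
    return buffer + ","
```

No rule mentions “irac” here, yet press_comma("irac") returns "irac," because no word in the lexicon starts with “iraç”, whereas press_comma("il a rec") returns "il a reç".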

4 How far should we go?

If we push the relationship between input method and spell checking even further, we reach the concept of “adaptive” input method and even the concept of word completion. However, going that far in the automation of character input is not necessarily a good thing.

4.1 Adaptive input methods

Consider the sequence (already typed) “C é r”. In French, this sequence cannot be followed by an “e”; only an “é” can occur. Knowing this, we can imagine that if the user actually types an “e”, it could be automatically transformed into an “é”. Here, we have just reached the concept of “adaptive” input method, that is, a method mixing different classes (see section 1). Our method, which is normally of class 2, turns out to be a class 1 method (key bindings) in that case, since the “e” key directly produces an “é”.
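A minimal sketch of such an adaptive rule, again with a hypothetical three-word lexicon and names of our own choosing: the “e” key is turned into “é” only when the plain letter cannot continue any known word and the accented one can.

```python
# Hypothetical toy lexicon (lowercase); a real system would use a full dictionary.
LEXICON = {"cérémonie", "céréale", "cercle"}
ACCENTED = {"e": "é"}  # keys that may be silently replaced by an accented form


def can_continue(fragment: str) -> bool:
    """True if some lexicon word starts with `fragment`."""
    return any(word.startswith(fragment) for word in LEXICON)


def adaptive_press(prefix: str, key: str) -> str:
    """Return the current word after `key` is typed following `prefix`."""
    accented = ACCENTED.get(key)
    if accented and not can_continue(prefix + key) and can_continue(prefix + accented):
        return prefix + accented  # the key behaves like a class 1 binding here
    return prefix + key
```

After “cér” the “e” key yields “é”, but after “c” it yields a plain “e” because “cercle” is possible. The same key therefore behaves differently depending on the context.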

4.2 Word completion

While we are at it, why stop at the level of accents or cedillas? Since the input method can sometimes decide what the next character in a word is, it can probably also know how to complete the whole word in some cases. Here, we have gone from the concept of input method, through the notion of spell checking, and finally arrived at a word completion mechanism. This demonstrates how closely those notions are related to each other.

So, how far should we go? Although only experiments with Mule users could give us a definitive answer, the ideas of adaptive input methods and word completion are probably not good things to implement. In Human-Machine Interaction, we know that automation is good only if the user perceives it as a stable behavior. It is highly probable that if an input method sometimes transforms an “e” directly into an “é” and sometimes does not, the user will be annoyed rather than pleased, because it is beyond their capacity to remember exactly in which cases the first behavior happens and in which cases the second one takes place. The same consideration applies to the notion of word completion.

Conclusion

In this paper, we have briefly described the main classes of input methods, and given a more detailed view of the Quail French ones, which belong to the second class. The concept of key sequence cancelation, although necessary given the way Quail is designed, appears to be counterintuitive and is at the origin of numerous mistakes in the process of inputting characters. While examining possible solutions to eliminate the need for a key sequence cancelation process, we have finally demonstrated that the notion of input method, at least in the case of French, and probably for all Latin languages, is deeply related to spell checking. If one day we can make spell checkers understand that “e ’ ” is actually a misspelled version of “é”, then flyspell will probably be the most efficient input method ever written.

However, we should keep in mind that even spell checking input methods will not solve all the problems. In the example of “/us local/sr xemacs”, a broken pathname, the only way to correct it automatically would be for the machine to know that this is a pathname, and that the way it is currently written is meaningless. For now, however, meaning recognition is still another story...