Thanks to the genericity of our system, we demonstrate that, although it was not originally designed for this purpose, the system can be extended with features such as action simulation and action execution with error tracking, all at very low cost.
Finally, we describe a computer demonstrator known as the TOASt system, which implements the proposed action recognition mechanism, in addition to its simulation, execution and error tracking capabilities.
2 Why Do Action Recognition?
Considering the state of the art in virtual reality systems (pure virtual reality systems [2, 12], augmented reality systems [1, 14], computer-aided tele-operation systems [6], ...), the main problem we face is user interaction: poor interaction can have critical effects on the operations, and can even cause physical pain [4]. As suggested in the introduction, a compromise must be established between the complexity of the interfaces
and their usability. In tele-robotics for instance, two philosophies exist to help the operator
interact with a robot: either virtual reality immersive devices are used, which leads to the
field of tele-operation [5], or more intelligence is given to the robot, which leads to the field
of autonomous robotics. This constitutes a kind of paradox because:
• the purpose of immersive systems is to give more technical abilities to the operator, for instance, being able to control a remote robot very precisely. This, however, is not a natural capability of a human, but rather of a machine.
• the purpose of autonomous robotics is to give more intelligence to the robot, for instance, being able to cope with unexpected situations without the intervention of an operator. This, however, is not a natural capability of a machine, but rather of a human being.
While investigating the notion of assistance to human-machine interaction in virtual environments [13], it appeared to us that an intermediate approach was worth trying: what could we do to help the operator without giving him extraordinary technical powers, and without giving the machine extraordinary reasoning skills? A particular form of action
recognition was then envisioned: consider for example a system in which an operator is
controlling a robotic arm that is able to manipulate remote objects. Consider further that
we intentionally don’t want a tele-presence system (so we keep simple interaction devices
only, for instance a joystick and a 2D visual feedback), and we intentionally don’t want an
autonomous system (so the robot must stay mostly under the control of the operator). The
idea is then to design a system which is able to recognize relatively low-level, yet technically difficult, actions (like object grasping), and which takes control of the execution only when there are no more complex behavioral decisions to make (like obstacle avoidance).
A typical scenario would then be the following: the operator starts moving the robotic
arm in order to grab an object. A joystick is not very well suited to do this, but that does not
matter, since imprecise motion is acceptable. When the robotic arm is close enough to the
object, the assistance system recognizes the action and takes control of the remaining work.
Its execution model is rather simple, but that does not matter since there are no more major
difficulties left.
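The handover logic of this scenario can be sketched in a few lines. The following is a minimal illustration, not the actual TOASt implementation: the function names, the 3D point representation, and the fixed distance threshold are all assumptions made for the example.

```python
import math

# Illustrative sketch of the assisted-grasp scenario: the operator keeps
# control while the arm is far from the object; once the arm is close
# enough, the action is "recognized" and the assistance system takes
# over the remaining motion. The threshold value is a made-up example.
GRASP_DISTANCE = 0.10  # metres

def distance(a, b):
    """Euclidean distance between two 3D points given as (x, y, z) tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def control_step(arm_pos, target_pos, joystick_cmd):
    """Return (controller, command) for the current step.

    While the arm is farther than GRASP_DISTANCE from the target, the
    operator's (possibly imprecise) joystick command passes through
    unchanged. Below the threshold, the assistance system computes the
    remaining displacement itself.
    """
    if distance(arm_pos, target_pos) > GRASP_DISTANCE:
        return ("operator", joystick_cmd)
    # Action recognized: drive the arm the rest of the way to the target.
    step = tuple(t - a for a, t in zip(arm_pos, target_pos))
    return ("assistance", step)

# Far from the object: the operator stays in control.
print(control_step((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 0.0)))
# Close to the object: the assistance system takes over.
print(control_step((0.95, 0.0, 0.0), (1.0, 0.0, 0.0), (0.1, 0.0, 0.0)))
```

The single distance test stands in for the full recognition mechanism; it captures only the division of labour the scenario describes, in which neither the operator nor the machine needs extraordinary abilities.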
This particular form of action recognition is exactly the direction we took in our research (this example is actually implemented in a computer demonstrator known as the TOASt system, an acronym for "Tele-Operation Assistance System"; see the last section),