
Proceedings of the 1999 IEEE International Conference on Robotics & Automation, Detroit, Michigan, May 1999

Development of a Visual Space-Mouse


Tobias Peter Kurpjuhn (kurpjuhn@lpr.e-technik.tu-muenchen.de), Technische Universität München, Munich, Germany
Kevin Nickels (knickels@trinity.edu), Trinity University, San Antonio, TX, USA
Alexa Hauck, Technische Universität München, Munich, Germany
Seth Hutchinson (seth@uiuc.edu), University of Illinois at Urbana-Champaign, Urbana, IL, USA

Abstract

The pervasiveness of computers in everyday life, coupled with recent rapid advances in computer technology, has created both the need and the means for sophisticated Human-Computer Interaction (HCI) technology. Despite all the progress in computer technology and robotic manipulation, the interfaces for controlling manipulators have changed very little in the last decade.

Therefore Human-Computer interfaces for controlling robotic manipulators are of great interest. A flexible and useful robotic manipulator is one capable of movement in three translational degrees of freedom and three rotational degrees of freedom. In addition to research labs, six degree of freedom robots can be found in construction areas or other environments unfavorable for human beings.

This paper proposes an intuitive and convenient visually guided interface for controlling a robot with six degrees of freedom. Two orthogonal cameras are used to track the position and the orientation of the hand of the user. This allows the user to control the robotic arm in a natural way.

1 Introduction

In many areas of our daily life we are faced with rather complex tasks that have to be done in circumstances unfavorable for human beings. For example, heavy weights may have to be lifted, or the environment is hazardous to humans. Therefore the assistance of a machine is needed. On the other hand, some of these tasks also need the presence of a human, because the complexity of the task is beyond the capability of an autonomous robotic system, or the decisions that have to be made demand complicated background knowledge.

This leads to the demand for a comfortable 3D control and manipulation interface. A very special kind of controlling device is the space-mouse that has been developed recently by [1] and [2]. The space-mouse is a controlling device similar to the standard computer mouse, but instead of being moved around on a table, which causes planar reactions in a computer program, it consists of a chassis that holds a movable ball. This ball is attached to the chassis in such a way that it can be moved with six degrees of freedom: three translations and three rotations. This makes it possible for the user to control six degrees of freedom with one hand. Because of this, even complex robotic devices can be controlled in a very intuitive way.

One step further towards a more intuitive and therefore more effective controlling device would be a system that can be instructed by watching and imitating the human user, using the hand of the user as the major controlling element. This would be a very comfortable interface that allows the user to move a robot system in a natural way. This is called the visual space-mouse.

The purpose of this project was to develop a system that is able to control a robotic system by observing the human and directly converting intuitive gestures into movements of the manipulator. The hand serves as the primary controller to affect the motion and position of a robot gripper. For the observation of the user, two cameras are used. A precise calibration is not required for our method; in fact, the only calibration that is required is the approximate knowledge of the directions "up", "down", "left" and "right". If a translation or rotation of the camera moves the controlling hand of the human out of the view of the camera, the system will fail.



The gripper of a PUMA 560 robot arm with six degrees of freedom is used as a manipulator. One camera is placed on the ceiling, providing a vertical view of the controlling hand, and one camera is placed on a tripod on the floor to provide a horizontal view (see Figure 1). Together, the cameras create a 3D workspace in which the user is allowed to move.

Figure 1: Structure of the Visual Space-Mouse.

The structure of the system yields some very powerful advantages. The first group of advantages is determined by the structure itself: the system provides a quantitative and cheap control unit without any aids or moving parts. That means there is no physical wear in the controlling system. This eliminates one potential source of failure, thereby making the system more robust.

The second group of advantages is determined by the basic concept of the system, which provides upgrading possibilities. One possibility would be the use of sensor data combined with an intelligent robot control system. This would lead to a robotic manipulator with teleassistance and all its advantages [3]. Another possibility would be the implementation of hand gestures as a communication language with the system, to produce a high-level control interface [4]. Additionally, the workspace created by the two cameras is determined by the angle of view of the two cameras. Therefore it can be individually adjusted to the needs of the controlling task. A movement-guided initialization sequence tells the system which hand serves as the primary input device. It is possible to keep track of several objects to perform more complex tasks. The scaling factor that translates the hand movement of the user into manipulator motion is fixed, but freely adjustable.

The remainder of the paper is organized as follows. In Section 2 we describe the major components of the system. We begin in Section 2.1 by giving a brief overview of the image processing unit. In Section 2.2 we describe the robot control portion of the system. Section 3 discusses two different approaches realizing the implementation of the visual space-mouse.

2 System Overview

The system of the visual space-mouse can be divided into two main parts: image processing and robot control. The role of image processing is to perform operations on a video signal received by the video cameras. The aim is to extract the desired information out of the video signal. The role of robot control is to transform electronic commands into movements of the manipulator.

2.1 Image Processing

Our image processing unit consists of two greyscale CCD (charge-coupled device) cameras connected to an image processing device, the Datacube MaxVideo 20. This Datacube performs operations on the video output. The operations of the Datacube are controlled by a special image processing language, VEIL [5], which is run on a host computer. In this way the data collected by the camera can be processed in a convenient way. This makes it possible to extract the desired information from the video output. In our case we identify, track and estimate the position of the hand of the user.

A special feature of VEIL is the use of blobs. A blob in VEIL is defined as a connected white region within a darker environment. The use of blobs makes it possible to detect and track special regions in the image.

The image processing unit is supposed to detect and track the hand. To achieve this task within the environmental constraints imposed on the project, an image processing task was set up as follows. The video output of the camera is convolved with a blurring filter and then thresholded. After thresholding, blob detection is applied to the image.

Since the purpose of this project is to generate a prototype validating the benefits of the visual space-mouse, extra constraints were placed on the image processing. In particular, a black background in combination with dark clothing is used. By doing so, the output image of the threshold operation is a black-and-white image of the camera view in which objects such as the hands or the face appear as pure white regions. This image is then passed to the blob routine, which marks every white region as a blob and chooses one of the blobs to be the control blob.
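
As a concrete illustration of this blur-threshold-blob pipeline, the sketch below reproduces the same steps in Python with SciPy standing in for the Datacube/VEIL hardware; the function name, the kernel size and the threshold value are assumptions of the sketch, not parameters of the original system.

    import numpy as np
    from scipy import ndimage

    def find_blobs(frame, kernel_size=5, threshold=128):
        """Blur, threshold and label connected white regions ("blobs")."""
        # Convolve with a uniform blurring filter to suppress pixel noise.
        blurred = ndimage.uniform_filter(frame.astype(float), size=kernel_size)
        # Threshold: the bright hand and face against dark background and clothing.
        binary = blurred > threshold
        # Connected-component labelling marks every white region as a blob.
        labels, n_blobs = ndimage.label(binary)
        blobs = []
        for i in range(1, n_blobs + 1):
            ys, xs = np.nonzero(labels == i)
            blobs.append({"bbox": (xs.min(), ys.min(), xs.max(), ys.max()),
                          "area": int(xs.size)})
        return blobs

One of the blobs returned by such a routine is then selected as the control blob by the motion-guided initialization described below.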

To gain tracking of the desired hand, a motion-guided initialization sequence was added. In the initialization sequence, the user waves the controlling hand in the workspace. The image processing unit records the image differences for several successive images and creates a map of these differences. After applying a blurring filter to suppress the pixel noise that shows up as spikes in this map, the image processing unit chooses the control blob. The blob in the current image that is closest to the region that changes most is chosen to be the control blob.

The values of this blob are stored in a global data structure to make them accessible to the robot control unit. Blobs other than the control blob are ignored in the controlling process. To ensure tracking of the hand after initialization without any sudden changes in the control blob, the bounding box is only allowed to change by up to S_cb pixels each cycle in each direction. In the current implementation, this threshold value is set to 5 pixels. This causes the blob to stick to the hand and not to jump to other objects that are near the hand. One advantage of this is some measure of robustness to occlusion of the hand. If an object (either dark or bright) passes between the camera and the hand, the bounding box for that object will not match the bounding box for the controlling blob. Thus, the controlling blob will remain in the position it was in before the occluding object appeared.
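
For illustration, the initialization and the per-cycle bounding-box limit can be sketched as follows. This is not the VEIL code used in the system; the helper names (pick_control_blob, clamp_bbox, blur) and the blob dictionary layout are assumptions of the sketch, while the 5-pixel limit and the selection rule follow the description above.

    import numpy as np

    def pick_control_blob(frames, blobs, blur):
        """Motion-guided initialization: choose the blob closest to the
        image region that changed most over several successive frames."""
        # Accumulate absolute frame-to-frame differences into a motion map.
        diffs = [np.abs(b.astype(int) - a.astype(int)) for a, b in zip(frames, frames[1:])]
        motion_map = blur(np.sum(diffs, axis=0))  # blurring suppresses pixel-noise spikes
        my, mx = np.unravel_index(np.argmax(motion_map), motion_map.shape)

        def centre(blob):
            x0, y0, x1, y1 = blob["bbox"]
            return (0.5 * (x0 + x1), 0.5 * (y0 + y1))

        # The control blob is the one whose centre is nearest the motion peak.
        return min(blobs, key=lambda b: (centre(b)[0] - mx) ** 2 + (centre(b)[1] - my) ** 2)

    def clamp_bbox(prev_bbox, new_bbox, s_cb=5):
        """Let each bounding-box coordinate move at most s_cb pixels per cycle,
        so the control blob sticks to the hand and ignores passing objects."""
        return tuple(int(p + np.clip(n - p, -s_cb, s_cb))
                     for p, n in zip(prev_bbox, new_bbox))
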

The orientation of the major axis of the object is computed using the centered second moments of the object as follows (see [6]):

    \gamma = \frac{1}{2}\arctan\frac{\tilde{m}_{xy}}{\tilde{m}_{xx}-\tilde{m}_{yy}}    (1)

    \tilde{m}_{ab} = m_{ab} - m_a\,m_b, \qquad a, b \in \{x, y\}

where \tilde{m}_{xx}, \tilde{m}_{yy}, and \tilde{m}_{xy} are the centered second moments about the horizontal, vertical, and 45° axes, respectively. The second equation is used to compute these centered second moments, with m_{ab} and m_a representing the non-centered second and first moments about the appropriate axes, respectively. By using these equations, angles between +π/4 and −π/4 can be measured. When the real object oversteps an angle of ±π/4, the result of Equation (1) will change its sign. The routine that measures the orientation of the hand takes care of this effect by clipping the returned angle to ±π/4 when the orientation of the blob oversteps this border.

As there are two cameras, each providing a different view, these computations have to be done for each image source. The image processing task is set up in such a way that it processes first the horizontal and after that the vertical view. Both views are treated in one cycle.
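
As an illustration of this step, the following sketch computes a blob orientation from pixel coordinates; it uses the conventional principal-axis form of the second-moment relation from [6], written with the usual factor of two on the cross moment, and clips the result to ±π/4 as described above. The array-of-pixel-coordinates interface is an assumption of the sketch.

    import numpy as np

    def blob_orientation(xs, ys, limit=np.pi / 4):
        """Orientation of the blob's major axis from centred second moments,
        clipped to [-pi/4, +pi/4] as in the hand-tracking routine.
        xs, ys: 1-D arrays of pixel coordinates belonging to the blob."""
        mx, my = xs.mean(), ys.mean()              # first moments (centroid)
        m_xx = np.mean((xs - mx) ** 2)             # centred second moments
        m_yy = np.mean((ys - my) ** 2)
        m_xy = np.mean((xs - mx) * (ys - my))
        # Principal-axis relation: tan(2*gamma) = 2*m_xy / (m_xx - m_yy).
        gamma = 0.5 * np.arctan2(2.0 * m_xy, m_xx - m_yy)
        # Beyond +/- pi/4 the estimate flips sign, so clamp it to that range.
        return float(np.clip(gamma, -limit, limit))

Applied to the white pixels of the control blob in each view, this gives the image-plane rotations that the controller reads directly from the images.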
sign. The routine to measure the orientation of the T~L@Tt. T.Uarzab~, = T~a~, . T6 . Ttooz) (2)
hand takes care of this effect by causing the angle re-
turned to be clipped to +7r/4 when the orientation of where Tb.~,, T6 and TtOO1represent the homogeneous
the blob oversteps this border. coordinate transformations from the world frame to
As there are two cameras each providing a different the robot’s base frame, from the robot’s base frame
view, those computations have to be done for each to a frame attached to link 6 of the robot, and from,
image source. The image processing task is set up in and from a frame attached to link 6 of the robot to
such a way that it processes first the horizontal, and the tool frame. The transform T,~arf represents a ho-

Figure 3: Effect of the position equation.

As both sides of the equation are equal, both coordinate frame transformations have the same effect. That means that both sides of the equation start from the same point and reach the same destination point. Equation (2) is solved for the matrix T_6, describing the desired position of the manipulator arm:

    T_6 = T_{base}^{-1} \cdot T_{start} \cdot T_{variable} \cdot T_{tool}^{-1}

To reach a point in space with the manipulator, you have to create the position equation, solve for T_6 in Cartesian space and transform the solution into joint space to obtain the desired values of the joint angles of the manipulator. With these joint angles the manipulator is able to reach the destination point. The trajectory generator in RCCL will then plan a path to the desired joint angles and update it as necessary.

The only inputs for the control unit are the two blob data structures described in Section 2.1. These data structures represent the spatial position and orientation of the object being tracked. The controlling unit looks at the centers of the blob rectangles in the image planes, each of which carries a pixel coordinate system whose origin is taken to be the center of the camera view. These pixel coordinates are translated into global coordinates for the manipulator. This is done by directly mapping the movements of the blobs into movements of the manipulator: blob motion in the image causes the manipulator to move in the corresponding direction.
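
A minimal sketch of this direct mapping is given below; the camera-to-axis assignment and the numeric scale are illustrative assumptions, standing in for the fixed but adjustable scaling factor mentioned earlier.

    def hand_to_translation(d_top, d_side, scale=0.002):
        """Map per-cycle blob-centre displacements (pixels) from the two views
        to a Cartesian translation increment (metres) for the gripper.

        d_top:  (du, dv) from the vertical (ceiling) view.
        d_side: (du, dv) from the horizontal (tripod) view.
        The axis assignment is one plausible choice, not necessarily the paper's.
        """
        du_top, dv_top = d_top
        du_side, dv_side = d_side
        dx = scale * du_top        # left/right, visible in both views
        dy = scale * dv_top        # forward/backward, visible only from above
        dz = -scale * dv_side      # up/down, visible only from the side (image v grows downward)
        return (dx, dy, dz)
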

The orientations of the hand can also be observed (see Section 2.1) and are transformed into manipulator movements. Two orientations, the rotations of the hand about the optical axis of each camera, can be observed directly from the images. The third orientation, the rotation of the hand about the horizontal axis, has to be computed from the image data. The value of the orientation of the (constant-sized) hand could be computed, for example, from knowledge of the size, position, and orientation of the projections of the hand in the two images.

Every time these new values are passed to the control unit, a new transform matrix is created with respect to the movement of the hand. This matrix is included in the coordinate transformation equation used to control the robot. The equation is solved for T_6, the joint values of the manipulator are computed, and the results are passed to the trajectory generator.
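
The per-cycle update can be sketched with plain homogeneous transforms as follows; this is a NumPy illustration rather than the RCCL calls used in the actual controller, and the function names are assumptions of the sketch.

    import numpy as np

    def update_t_variable(t_variable, translation, yaw):
        """Fold the latest hand motion (translation in metres, yaw in radians)
        into the continuously updated variable transform."""
        c, s = np.cos(yaw), np.sin(yaw)
        delta = np.array([[c, -s, 0.0, translation[0]],
                          [s,  c, 0.0, translation[1]],
                          [0.0, 0.0, 1.0, translation[2]],
                          [0.0, 0.0, 0.0, 1.0]])
        return t_variable @ delta

    def solve_t6(t_base, t_start, t_variable, t_tool):
        """Solve T_start * T_variable = T_base * T6 * T_tool for T6, the pose
        the manipulator has to realise before inverse kinematics is applied."""
        return np.linalg.inv(t_base) @ t_start @ t_variable @ np.linalg.inv(t_tool)

Inverse kinematics then turns T_6 into joint angles for the trajectory generator, as described above.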


3 The Visual Space-Mouse

In some cases it is not possible to use two cameras to watch the controlling hand of the user. This can be caused by limitations on free space or on accessible hardware.

In Section 3.1 we discuss the space-mouse proposed previously, but we also suggest an approach to solve the dilemma of limited resources in Section 3.2.

3.1 Two Camera Space-Mouse

In our laboratory, only one image processing hardware device was available. Both camera views had to be processed by switching between two video channels. Combined with the transfer time via the bus system (see Figure 2), this was a very time-consuming procedure. So the biggest problem with the two-camera version was the speed of the image processing unit. The whole network slowed down the performance to 3 fps (frames per second). This forced the user to slow down hand motion in an unnatural way.

The use of a second image processing device would increase the image processing performance to the level seen in the one-camera version described below, allowing the user to move the hand at a natural speed. Nevertheless, it could be shown that the tracking of the hand and the controlling of the manipulator worked quite nicely in all six dimensions.

3.2 Space-Mouse with one Camera

In some cases limitations have to be applied to the structure of the visual space-mouse, as described above. The solution of this obstacle leads to a one-camera version of the visual space-mouse.

By removing the overhead camera, any information about the depth of the controlling hand is lost. Any rotation with the rotation axis parallel to the image plane will just change the height and the width of the object; the sign of the rotation cannot be determined easily. There are three dimensions of an object in a plane that are easily and robustly detectable: height, width and rotation in the image plane. The controlling task of a manipulator with six degrees of freedom is therefore very difficult with just three values. To handle this problem but keep the user interface intuitive and simple, a state machine was implemented in the controller.

The state machine consists of three different levels: two control levels and one transition level. The control levels are used to move the manipulator. The transition level connects the two control levels and affects the gripper of the robot arm.

When the palm of the hand is facing toward or away from the camera, the state machine of the controlling unit is in one of the two control levels. In each control level the manipulator can be moved in a plane, by moving the hand in the up-down direction or the forward-backward direction. The control levels differ in the orientation of the planes the manipulator can be moved in. The plane of control level 2 is orthogonal to the plane of control level 1 (see Figure 4). The planes intersect at the manipulator.

Figure 4: Motion planes of the two control levels.

To change the control levels the hand is turned so that the palm is facing down. In this mode the hand can be moved within the workspace without affecting the position of the manipulator. This mode is called the transition level (see Figure 5).

The transition level gets its name from its position between the two control levels, which are the actual steering levels. The task of the transition level is to connect both control levels and to perform additional actions on the workspace managed by the control levels. Those actions are actuating the gripper and rotating the whole manipulator about its vertical axis.

By the use of the two planes described previously, only a cubic space in front of the arm can be accessed. With the rotation about the z-axis this cube can be rotated, and so the whole area around the manipulator becomes accessible. The rotation is initiated by rotating the hand in the plane of the image. This causes the robot to turn in steps of 10 degrees.

The gripper movement controls the opening and closing of the gripper. This movement is initiated by rotating the hand in the horizontal plane as shown in Figure 5. Placing the gesture for the gripper in the transition level has the advantage that any movement of the hand has no effect on the position of the manipulator, which keeps the gripper fixed during actuation.

Figure 5: Control level gesture (left), transition level gesture (middle) and gripping gesture (right).

To determine when the state machine is supposed to change state, two threshold levels are computed and stored during initialization of the program. The first threshold level is set to 3/4 of the height of the original blob (= height_3_4), the second one is set to 3/4 of the width of the original blob (= width_3_4). The state machine goes into the transition level when the height of the actual blob falls below height_3_4. It goes into the opposite control level when the actual height exceeds height_3_4. In the transition level, the gripper is actuated when the width of the hand is reduced below width_3_4. The structure of the state machine is illustrated in Figure 6.

Figure 6: Structure of the state machine.
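
A compact sketch of this state machine is given below. The state names and the height/width interface are assumptions of the sketch; the 3/4 thresholds, the start state and the gripper trigger follow the description above.

    CONTROL_1, CONTROL_2, TRANSITION = "control 1", "control 2", "transition"

    class SpaceMouseStateMachine:
        def __init__(self, initial_height, initial_width):
            # Thresholds fixed at initialization: 3/4 of the original blob size.
            self.height_3_4 = 0.75 * initial_height
            self.width_3_4 = 0.75 * initial_width
            self.state = CONTROL_1        # the machine always starts in control level 1
            self.last_control = CONTROL_1

        def step(self, height, width):
            """Update the state from the current blob height/width and report
            whether the gripper should be actuated in this cycle."""
            actuate_gripper = False
            if self.state in (CONTROL_1, CONTROL_2):
                if height < self.height_3_4:      # palm turned down: leave the control plane
                    self.last_control = self.state
                    self.state = TRANSITION
            else:  # TRANSITION
                if height > self.height_3_4:      # palm upright again: enter the other plane
                    self.state = CONTROL_2 if self.last_control == CONTROL_1 else CONTROL_1
                elif width < self.width_3_4:      # hand rotated in the horizontal plane
                    actuate_gripper = True
            return self.state, actuate_gripper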

The origin of a control level plane is reset to the current position of the manipulator when the state machine enters that control level. This has the advantage that positions that are out of reach within the first attempt can be reached in a second attempt simply by going into the transition mode, moving the hand to a more convenient position and then returning to the same control level.

The state machine always starts in control level 1. To visualize the state of the state machine of the control unit, the rectangle drawn around the hand on the monitor is shown in a state-dependent color.

3.3 Discussion

An experiment was performed to validate the functions of the one-camera space-mouse. The task was to assemble a house out of three randomly placed wooden pieces. Several persons were chosen to perform this experiment with minimal training, and each was able to successfully finish the task. The experiment showed that the state machine described above was usable. The biggest problem was that the gesture for the gripping movement was found to be unnatural. Most of the candidates not only turned the hand in the horizontal image plane, reducing the width of the hand below width_3_4 as shown in Figure 5, but also turned their wrist. By doing so, they overstepped height_3_4 and inadvertently transitioned into a control level.

One solution for this problem would be to introduce a third control level, as both control levels have been exhausted in terms of robustly detectable intuitive gestures. But this would require a significant change in the state machine, and the control would become less intuitive. Thus, it has not been implemented. Another possibility would have been to change the gripping gesture. This was not possible because of the limited repertoire of gestures bound to the robustly detectable dimensions: position, size, and orientation of the controlling blob.

3.4 Future Work

Both versions of the space-mouse have several areas for improvement. Some of them are as follows:

1. Implementation of a motion model of the hand
2. Segmentation of the hand blob, for higher-resolution control of the robot
3. Including sensor data for achieving teleassistance [3] (e.g. collision avoidance)
4. Implementation of a high-level gesture language
5. Adding a routine for filtering out the background
6. Implementation of a state machine in the two-camera version of the space-mouse

4 Conclusions

The objective of developing a high-level visually guided interface has been realized. As the experiment described in Section 3.3 showed, simple remote tasks can be performed with minimal training or teaching. This is a good demonstration of the intuitive and convenient way in which a 3D interface can be operated. Additionally, several possible extensions to this implementation of the visual space-mouse have been proposed that would make it an even more powerful interface for control and manipulation.

References

[1] G. Hirzinger, B. Gombert, J. Dietrich, and V. Senft, "Die Space Mouse - Eine neue 3D-Mensch-Maschine-Schnittstelle als Spin-Off-Produkt der Raumfahrt," in Jahrbuch für Optik und Feinmechanik (W. D. Prenzel, ed.), Berlin: Schiele und Schön, 1996.

[2] G. Hirzinger, B. Gombert, and M. Herrmann, "Space mouse, the natural man-machine interface for 3D-CAD comes from space," in ECUA '96 Conference for CATIA and CADAM users in Europe, (Göteborg, Sweden), 1996.

[3] P. K. Pook, Teleassistance: Using Deictic Gesture to Control Robot Action. PhD thesis, University of Rochester, 1995.

[4] V. I. Pavlovic, R. Sharma, and T. S. Huang, "Visual interpretation of hand gestures for human-computer interaction: A review," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, pp. 384-401, July 1997.

[5] T. J. Olson, R. J. Lockwood, and J. R. Taylor, "Programming a pipelined image processor," Journal of Computer Vision, Graphics and Image Processing, Sept. 1996.

[6] B. K. P. Horn, Robot Vision. Massachusetts Institute of Technology, 1986.

[7] J. Lloyd and V. Hayward, Multi-RCCL User's Guide. Montréal, Québec, Canada, Apr. 1992.

[8] P. Leven, "A multithreaded implementation of a robot control C library," Master's thesis, University of Illinois at Urbana-Champaign, 1997.

[9] M. Spong and M. Vidyasagar, Robot Dynamics and Control. John Wiley & Sons, 1989.
and Control. John Wiley & Sons, 1989.

