Katsushi Ikeuchi (Editor)

Computer Vision: A Reference Guide

Preface
Computer vision is a field of computer science and engineering whose goal is to build computers that can see the outside world and understand what is happening in it. As David Marr defined it, computer vision is "an information processing task that constructs efficient symbolic descriptions of the world from images." In other words, computer vision aims to create a computational alternative to the human visual system.
Takeo Kanade says, "Computer vision looks easy, but is difficult. But it is fun." Computer vision looks easy because every human uses vision in daily life without any effort; even a newborn baby uses vision to recognize its mother. It is computationally difficult, however, because the outer world consists of three-dimensional objects, while their projections onto the retina or an image plane are only two-dimensional images. This reduction from 3D to 2D occurs in the projection from the outer world to images, so "common sense" must be used to augment the descriptions of the original 3D world recovered from the 2D images. Computer vision is fun because we have to discover this common sense, and the search for it is what attracts the interest of vision researchers.
The origin of computer vision can be traced back to Lawrence Roberts' research, "Machine Perception of Three-Dimensional Solids." This line of research was later extended through MIT's Project MAC. Professor Marvin Minsky at MIT initially believed that computer vision could be solved as a summer project by a graduate student. His original estimate was wrong, and for more than 40 years we have been investigating the many aspects of computer vision.
This 40-year effort has shown that computer vision is one of the fundamental sciences and that the field is rich enough for researchers to devote their entire research lives to it. This period has also revealed that the field contains a wide variety of topics, from low-level optics to high-level recognition problems. This richness and diversity were an important motivation for us to compile a reference book on computer vision.
Lawrence Roberts' research contains all of the essential components of the computer vision paradigm that modern computer vision still follows: a homogeneous coordinate system to define the coordinates, a cross operator for edge detection, and object models represented as combinations of edges. David Marr defined his own paradigm of computer vision: shape-from-X low-level vision, interpolation and fusion of such fragmentary representations, a 2-1/2D viewer-centered representation as the result of this interpolation and fusion, and a 3D object-centered representation. Roughly speaking, this reference guide follows these paradigms and defines its sections accordingly.
Editor's Biography
Yasuyuki Matsushita received his B.S., M.S., and Ph.D. in Electrical Engineering
and Computer Science (EECS) from the University of Tokyo in 1998, 2000, and 2003,
respectively. He joined Microsoft Research Asia in April 2003, where he is a senior
researcher in the Visual Computing Group. His interests are in physics-based com-
puter vision (photometric techniques, such as radiometric calibration, photometric
stereo, shape-from-shading), computational photography, and general 3D reconstruc-
tion methodologies. He is on the editorial boards of the International Journal of Computer Vision (IJCV), IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), and The Visual Computer, and is an associate editor-in-chief of the IPSJ Transactions on Computer Vision and Applications (CVA). He has served or is serving as a program co-chair of the Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2010, the first joint 3DIM/3DPVT conference (3DIMPVT, now called 3DV) 2011, the Asian Conference on Computer Vision
(ACCV) 2012, International Conference on Computer Vision (ICCV) 2017, and a
general co-chair of ACCV 2014. He has been serving as a guest associate professor
at Osaka University (April 2010–), visiting associate professor at National Institute
of Informatics (April 2011–) and Tohoku University (April 2012–), Japan.
Senior Editors
Daniel Cremers Professor for Computer Science and Mathematics, Chair for
Computer Vision and Pattern Recognition, Managing Director, Department of Com-
puter Science, Technische Universität München, Garching, Germany
Section Editors
Luc Van Gool Professor of Computer Vision, Computer Vision Laboratory, ETH,
Zürich, Switzerland
Atsuto Maki Associate Professor, Computer Vision and Active Perception Labora-
tory (CVAP), School of Computer Science and Communication, Kungliga Tekniska
Högskolan (KTH), Stockholm, Sweden
Ram Nevatia Director and Professor, Institute for Robotics and Intelligence
Systems, University of Southern California, Los Angeles, CA, USA
Shmuel Peleg Professor, School of Computer Science and Engineering, The Hebrew
University of Jerusalem, Jerusalem, Israel
Marc Pollefeys Professor and the Head, Computer Vision and Geometry Lab
(CVG) – Institute of Visual Computing, Department of Computer Science, ETH
Zürich, Zürich, Switzerland
Long Quan Professor, The Department of Computer Science and Engineering, The
Hong Kong University of Science and Technology, Kowloon, Hong Kong, China
Todd Zickler William and Ami Kuan Danoff Professor of Electrical Engineering and
Computer Science, School of Engineering and Applied Sciences, Harvard University,
Cambridge, MA, USA
Contributors
Jun-Sik Kim Korea Institute of Science and Technology, Seoul, Republic of Korea
Ron Kimmel Department of Computer Science, Technion – Israel Institute of
Technology, Haifa, Israel
Eric Klassen Ohio State University, Columbus, OH, USA
Reinhard Koch Institut für Informatik, Christian-Albrechts-Universität, Kiel, Germany
Jan J. Koenderink Faculty of EEMSC, Delft University of Technology, Delft,
The Netherlands
The Flemish Academic Centre for Science and the Arts (VLAC), Brussels, Belgium
Laboratory of Experimental Psychology, University of Leuven (K.U. Leuven),
Leuven, Belgium
Pushmeet Kohli Department of Computer Science And Applied Mathematics,
Weizmann Institute of Science, Rehovot, Israel
Ivan Kolesov Schools of Electrical and Computer and Biomedical Engineering,
Georgia Institute of Technology, Atlanta, GA, USA
Sanjeev J. Koppal Harvard University, Cambridge, MA, USA
Kevin Köser Institute for Visual Computing, ETH Zurich, Zürich, Switzerland
Sanjiv Kumar Google Research, New York, NY, USA
Takio Kurita Graduate School of Engineering, Hiroshima University, Higashi-
Hiroshima, Japan
Sebastian Kurtek Ohio State University, Columbus, OH, USA
Annika Lang Seminar für Angewandte Mathematik, ETH Zürich, Zürich,
Switzerland
Michael S. Langer School of Computer Science, McGill University, Montreal,
QC, Canada
Fabian Langguth GCC - Graphics, Capture and Massively Parallel Computing, TU
Darmstadt, Darmstadt, Germany
Longin Jan Latecki Department of Computer and Information Sciences, Temple
University, Philadelphia, PA, USA
Denis Laurendeau Faculty of Science and Engineering, Department of Electrical
and Computer Engineering, Laval University, QC, Canada
Jason Lawrence Department of Computer Science, School of Engineering and
Applied Science University of Virginia, Charlottesville, VA, USA
Svetlana Lazebnik Department of Computer Science, University of Illinois at
Urbana-Champaign, Urbana, IL, USA
Longzhuang Li Department of Computing Science, Texas A&M University at Corpus Christi, Corpus Christi, TX, USA
Sushil Mittal Department of Statistics, Columbia University, New York, NY, USA
Anton van den Hengel School of Computer Science, The University of Adelaide,
Adelaide, SA, Australia
Pramod K. Varshney Department of Electrical Engineering and Computer Science,
Syracuse University, Syracuse, NY, USA
Andrea Vedaldi Oxford University, Oxford, UK
Ashok Veeraraghavan Department of Electrical and Computer Engineering, Rice
University, Houston, TX, USA
David Vernon Informatics Research Centre, University of Skövde, Skövde, Sweden
Ramanarayanan Viswanathan Department of Electrical Engineering, University of
Mississippi, MS, USA
Xiaogang Wang Department of Electronic Engineering, Chinese University of Hong
Kong, Shatin, Hong Kong
Isaac Weiss Center for Automation Research, University of Maryland at College
Park, College Park, MD, USA
Gregory F. Welch Institute for Simulation & Training, The University of Central
Florida, Orlando, FL, USA
Michael Werman The Institute of Computer Science, The Hebrew University of
Jerusalem, Jerusalem, Israel
Tien-Tsin Wong Department of Computer Science and Engineering, The Chinese
University of Hong Kong, Hong Kong SAR, China
Robert J. Woodham Department of Computer Science, University of British
Columbia, Vancouver, BC, Canada
John Wright Visual Computing Group, Microsoft Research Asia, Beijing, China
Ying Nian Wu Department of Statistics, UCLA, Los Angeles, CA, USA
Chenyang Xu Siemens Technology-To-Business Center, Berkeley, CA, USA
David Young School of Informatics, University of Sussex, Falmer, Brighton, UK
Guoshen Yu Electrical and Computer Engineering, University of Minnesota,
Minneapolis, MN, USA
Christopher Zach Computer Vision and Geometry Group, ETH Zürich, Zürich,
Switzerland
Alexander Zelinsky CSIRO, Information Sciences, Canberra, Australia
Zhengyou Zhang Microsoft Research, Redmond, WA, USA
Bo Zheng Computer Vision Laboratory, Institute of Industrial Science, The Univer-
sity of Tokyo, Meguro-ku, Tokyo, Japan
Zhigang Zhu Computer Science Department, The City College of New York,
New York, NY, USA
Todd Zickler School of Engineering and Applied Science, Harvard University,
Cambridge, MA, USA
A

Prototype-Based Methods for Human Movement Modeling
Action Recognition
Activity Recognition
Affordances and Action Recognition

Active Calibration

Rui Shen1, Gaopeng Gou2, Irene Cheng1 and Anup Basu3
1 University of Alberta, Edmonton, AB, Canada
2 Beihang University, Beijing, China
3 Department of Computing Science, University of Alberta, Edmonton, AB, Canada

Synonyms

Active camera calibration; Pan-tilt camera calibration; Pan-tilt-zoom camera calibration; PTZ camera calibration

Related Concepts

Definition

Active calibration is a process that determines the geometric parameters of a camera (or cameras) using the camera's controllable movements.

Background

Camera calibration aims to establish the best possible correspondence between the camera model used and the image acquisition realized with a given camera [12], i.e., to accurately recover a camera's geometric parameters, such as the focal length and the image center/principal point, from captured images. Classical calibration techniques (e.g., [10, 16]) require predefined patterns and static cameras and often involve solving complicated equations. By taking advantage of a camera's controllable movements (e.g., pan, tilt, and roll), active calibration techniques can calibrate the camera automatically.

Theory

[Active Calibration, Fig. 1: the pinhole camera model. The 3D point P = (X, Y, Z) projects to the 2D point p = (x, y) on the image plane; O = (0, 0, 0) is the center of projection, f is the focal length, and Z is the optical axis.]

The pinhole camera model, shown in Fig. 1, is one of the most commonly used camera models. p = (x, y)^T is the 2D projection of the 3D point P = (X, Y, Z)^T on the image plane. Using homogeneous coordinates, \tilde{p} and \tilde{P} have the following relationship:
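In its standard generic form (the notation below is the common textbook one; the printed entry may use slightly different symbols),

s \, \tilde{p} = K \, [R \mid t] \, \tilde{P}, \qquad K = \begin{pmatrix} f_x & 0 & \delta_x \\ 0 & f_y & \delta_y \\ 0 & 0 & 1 \end{pmatrix},

where s is an arbitrary scale factor, K is the intrinsic parameter matrix containing the focal lengths f_x, f_y and the image center (\delta_x, \delta_y), and [R | t] holds the extrinsic rotation and translation relating world and camera coordinates. These are the quantities (notably f_x, f_y, \delta_x, \delta_y) that the active calibration strategies discussed below estimate.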
[…] extrinsic parameters. This method achieves relatively high accuracy and robustness, but one drawback is its high computational cost.

Experimental Results

Some experimental results using Strategies C and D from [4] are presented below.

Tables 1 and 2 summarize the results of computing the image center and the focal lengths using simulated data, along with the ground truths. The simulated data contain an image contour consisting of 50 points. The pan and tilt angles were fixed at 3°. For the experiment on image center calculation, additive Gaussian noise with standard deviation σ of 0 (no noise), 5, and 10 pixels was added to test the robustness of the algorithms. It can be seen that strategy C performs reasonably well in determining the image center even when the noise is as large as 15. The results of strategy D are similar to those produced by strategy C when the roll angle is 180°.

For the experiment on focal length calculation, additive Gaussian noise with standard deviation of 0 (no noise) and 5 pixels was added. Strategy D is a little more accurate, as its equations are obtained directly without using the estimates of f_x and f_y. The focal lengths obtained by strategy D are similar to those produced by strategy C, but when the noise is increased, strategy D produces more reliable results than strategy C.

Strategies C and D were also tested on a real camera in an indoor environment. The estimates for δ_x and δ_y obtained by strategy C (90° roll) were 3 and 29 pixels, while the values obtained by strategy D were 2 and 30 pixels, which demonstrates the stability of the active calibration algorithms in real situations. The estimated values of f_x and f_y were 908 and 1,126, respectively.

Active Calibration, Table 1 Results of image center estimation (δ_x, δ_y in pixels)

Angle (strategy C) | Ground truth δ_x δ_y | σ = 0 δ_x δ_y | σ = 5 δ_x δ_y | σ = 10 δ_x δ_y
20°        | 10 20 | 11 22 | 12 23 | 12 23
40°        | 10 20 | 11 21 | 12 22 | 12 19
60°        | 10 20 | 11 21 | 12 21 | 13 21
80°        | 10 20 | 11 21 | 11 21 | 11 22
100°       | 10 20 | 10 21 | 10 21 | 11 21
120°       | 10 20 | 11 21 | 11 21 | 11 21
140°       | 10 20 | 11 21 | 11 21 | 11 21
160°       | 10 20 | 10 21 | 11 21 | 11 21
180°       | 10 20 | 10 20 | 10 21 | 10 21
Strategy D | 10 20 | 10 20 | 11 20 | 11 20

Active Calibration, Table 2 Results of focal length estimation

Strategy   | Ground truth f_x f_y | σ = 0 f_x f_y | σ = 5 f_x f_y
Strategy C | 400 600 | 403 602 | 396 603
Strategy D | 400 600 | 401 601 | 403 599

References

1. Basu A (1993) Active calibration. In: ICRA'93: proceedings of the 1993 IEEE international conference on robotics and automation, Atlanta, vol 2, pp 764–769
2. Basu A (1993) Active calibration: alternative strategy and analysis. In: CVPR'93: proceedings of the 1993 IEEE computer society conference on computer vision and pattern recognition (CVPR), New York, pp 495–500
3. Basu A (1995) Active calibration of cameras: theory and implementation. IEEE Trans Syst Man Cybern 25(2):256–265
4. Basu A, Ravi K (1997) Active camera calibration using pan, tilt and roll. IEEE Trans Syst Man Cybern B 27(3):559–566
5. Borghese NA, Colombo FM, Alzati A (2006) Computing camera focal length by zooming a single point. Pattern Recognit 39(8):1522–1529
6. Brückner M, Denzler J (2010) Active self-calibration of multi-camera systems. In: Proceedings of the 32nd DAGM conference on pattern recognition, Darmstadt, pp 31–40
7. Chippendale P, Tobia F (2005) Collective calibration of active camera groups. In: AVSS'05: proceedings of the IEEE conference on advanced video and signal based surveillance, Como, pp 456–461
8. Collins RT, Tsin Y (1999) Calibration of an outdoor active camera system. In: CVPR'99: proceedings of the 1999 IEEE computer society conference on computer vision and pattern recognition (CVPR), Ft. Collins, pp 528–534
9. Davis J, Chen X (2003) Calibrating pan-tilt cameras in wide-area surveillance networks. In: ICCV'03: proceedings of the 9th IEEE international conference on computer vision, Nice, pp 144–149
10. Horaud R, Mohr R, Lorecki B (1992) Linear camera calibration. In: ICRA'92: proceedings of the IEEE international conference on robotics and automation, Nice, vol 2, pp 1539–1544
11. Kanatani K (1987) Camera rotation invariance of image characteristics. Comput Vis Graph Image Process 39(3):328–354
12. Klette R, Schlüns K, Koschan A (1998) Computer vision: three-dimensional data from images, 1st edn. Springer, New York/Singapore
13. McLauchlan PF, Murray DW (1996) Active camera calibration for a head-eye platform using the variable state-dimension filter. IEEE Trans Pattern Anal Mach Intell 18(1):15–22
14. Seales WB, Eggert DW (1995) Active-camera calibration using iterative image feature localization. In: CAIP'95: proceedings of the 6th international conference on computer analysis of images and patterns, Prague, pp 723–728
15. Sinha SN, Pollefeys M (2006) Pan-tilt-zoom camera calibration and high-resolution mosaic generation. Comput Vis Image Underst 103(3):170–183
16. Tsai R (1987) A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J Robot Autom 3(4):323–344

Active Camera Calibration
▸ Active Calibration

Active Contours
▸ Numerical Methods in Curve Evolution Theory

Active Sensor (Eye) Movement Control

James J. Clark
Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada

Synonyms

Gaze control

Related Concepts

Evolution of Robotic Heads; Visual Servoing

Definition

Active sensors are those whose generalized viewpoint (such as sensor aperture, position, and orientation) is under computer control. Control is done so as to improve information gathering and processing.

Background

The generalized viewpoint [1] of a sensor is the vector of values of the parameters that are under the control of the observer and which affect the imaging process. Most often, these parameters will be the position and orientation of the image sensor, but they may also include such parameters as the focal length, aperture width, and nodal-point-to-image-plane distance of the camera. The definition of generalized viewpoint can be extended to include illuminant degrees of freedom, such as the illuminant position, wavelength, intensity, spatial distribution (for structured light applications), and angular distribution (e.g., collimation) [2].

Changes in observer viewpoint are used in active vision systems for a number of purposes. Some of the more important uses are:
– Tracking a moving object to keep it in the field of view of the sensor
– Searching for a specific item in the observer's environment
– Inducing scene-dependent optical flow to aid in the extraction of the 3D structure of objects and scenes
– Avoiding "accidental" or nongeneric viewpoints, which can result in sensor-saturating specularities or information-hiding occlusions
– Minimizing sensor noise and maximizing novel information content
– Increasing the dynamic range of the sensor, through adjustment of parameters such as sensor sensitivity, aperture, and focus
– Mapping the observer's environment

Theory

Low-Level Camera Motion Control Systems. Most robotic active vision control systems act mainly to produce either smooth pursuit motions or rapid saccadic motions. Pursuit motions cause the camera to move so as to smoothly track a moving object, maintaining the image of the target object within a small region (usually in the center) of the image frame. Saccadic motions are rapid, usually large, jumps in the position of the camera, which center the camera field of view on different parts of the scene being imaged. This type of motion is used when scanning a scene,
searching for objects or information, but it can also be used to recover from a loss of tracking of an object during visual pursuit.

Much has been learned about the design of pursuit and saccadic motion control systems from the study of primate oculomotor systems. These systems have a rather complicated architecture distributed among many brain areas, the details of which are still subject to vigorous debate [3]. The high-level structure, however, is generally accepted to be that of a feedback system. A very influential model of the human oculomotor control system is that of Robinson [4], and many robotic vision control systems employ aspects of the Robinson model.

The control of an active camera system is both simple and difficult at the same time. Simplicity arises from the relatively unchanging characteristics of the load or "plant" being controlled. For most systems the moment of inertia of the camera changes only minimally over the range of motion, with slight variations arising when zoom lenses are used. The mass of the camera and associated linkages does not change. Inertial effects become more important for control of the "neck" degrees of freedom due to the changing orientation and position of the camera bodies relative to the neck. The specifications on the required velocities and control bandwidth for the neck motions are typically much less stringent than those for the camera motions, so the inertial effects for the neck are usually neglected. The relatively simple nature of the oculomotor plant means that straightforward proportional-derivative (PD) or proportional-integral-derivative (PID) control systems are often sufficient for implementing tracking or pursuit motion. Some systems have employed more complex optimal control systems (e.g., [5]), which provide improved disturbance rejection and trajectory-following accuracy compared to the simpler approaches.

There is a serious difficulty in controlling camera motion systems, however, caused by delays in the control loops. Such delays include the measurement delay due to the time needed to acquire and digitize the camera image and subsequent computations, such as feature extraction and target localization. There is also a delay or latency arising from the time needed to compute the controller output signal [6]. If these delays are not dealt with, a simple PD or PID controller can become unstable, leading to excessive vibration or shaking and loss of target tracking.

There are a number of approaches to dealing with delay. PID or PD systems can be made robust to delays simply by increasing system damping by reducing the proportional feedback gain to a sufficiently low value [7]. This results in a system that responds to changes in target position very slowly, however, and is unacceptable for most applications. For control of saccadic motion, a sample/hold can be used, where the position error is sampled at the time a saccade is triggered and held in a first-order hold (integrator) [8]. In this way, the position error seen by the controller is held constant until the saccadic motion is completed. The controller is insensitive to any changes in the actual target position until the end of this refractory period. This stabilizes the controller but has the drawback that, if the target moves during the refractory period, the position error at the end of the refractory period can be large. In this case another, corrective or secondary, saccadic motion may need to be triggered. For stabilization of pursuit control systems in the presence of delay, an internal positive feedback loop can be employed [4, 8]. This positive feedback compensates for delays in the negative feedback servo loop created by the time taken to acquire an image and compute the target velocity error. The positive feedback loop sends a delayed efference copy of the current commanded camera velocity (which is the output of the pursuit controller) back to the velocity error comparator, where it is added to the measured velocity error. The positive feedback delay is set so that it arrives at the velocity error comparator at the same time as the measurement of the effect of the current control command, effectively canceling out the negative feedback and producing a new target velocity for the controller. Another delay-handling technique is to use predictive control, such as the Smith Predictor, where the camera position and controller states are predicted for a time T in the future, where T is the controller delay, and control signals appropriate for those states are computed and applied immediately [6, 7]. Predictive methods make strong assumptions on changes in the external environment (e.g., that all objects in the scene are static or traversing known smooth trajectories). Such methods can perform poorly when these assumptions are violated.
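As a concrete illustration, the following sketch shows a discrete-time pursuit controller with an efference-copy positive feedback loop of the kind just described. It is only a schematic example (the gain, delay, and measurement interface are illustrative placeholders, not values or code from the cited systems).

from collections import deque

class DelayCompensatedPursuit:
    """Velocity controller whose measured error arrives `delay_steps` cycles late.
    A delayed efference copy of past commands is added back to the measured
    error, so the controller does not keep reacting to the effect of commands
    it has already issued (the positive feedback loop described above)."""

    def __init__(self, gain=0.5, delay_steps=3):
        self.gain = gain                              # proportional velocity gain (illustrative)
        self.history = deque([0.0] * delay_steps,     # commands issued in the last T cycles
                             maxlen=delay_steps)

    def command(self, measured_velocity_error):
        # Efference copy: the command issued `delay_steps` cycles ago, whose
        # effect is what the current (delayed) measurement actually reflects.
        efference_copy = self.history[0]
        # Adding it back to the measured error reconstructs the target velocity,
        # effectively canceling the stale negative feedback.
        target_velocity = measured_velocity_error + efference_copy
        u = self.gain * target_velocity               # new commanded camera velocity
        self.history.append(u)                        # remember it for later compensation
        return u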
The Next-Look Problem and Motion Planning. The control of pursuit and saccadic motions is usually handled by different controllers. While pursuit or tracking behavior can be implemented using frequent small saccade-like motions, this can produce jumpy images which may degrade subsequent processing operations. With multiple controllers, there needs to be a way for the possibly conflicting commands from the controllers to be integrated and arbitrated. The simplest approach uses the output of the pursuit control system by default, with a switch over to the output of the saccade control system whenever the position error is greater than some threshold, and a switch back to pursuit control when the position error drops below another (lower) threshold.
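A minimal sketch of this default-pursuit/saccade arbitration with hysteresis thresholds might look as follows; the threshold values are illustrative placeholders, not taken from any of the systems cited here.

class PursuitSaccadeArbiter:
    """Select between pursuit and saccade commands using two thresholds
    (hysteresis), as described above."""

    def __init__(self, upper=20.0, lower=5.0):    # position error thresholds, in pixels
        self.upper = upper
        self.lower = lower
        self.saccading = False

    def select(self, position_error, pursuit_cmd, saccade_cmd):
        if not self.saccading and abs(position_error) > self.upper:
            self.saccading = True                 # error too large: trigger a saccade
        elif self.saccading and abs(position_error) < self.lower:
            self.saccading = False                # target re-centered: resume pursuit
        return saccade_cmd if self.saccading else pursuit_cmd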
Pursuit or tracking of visual targets is just one type of motor activity. Activities such as visual search may require large shifts of camera position to be executed based on a complex analysis of the visual input. The process of determining the active vision system controller set point is often referred to as sensor planning [1] or the next-look problem [9]. The next-look problem can be interpreted as determining sensor positions which increase or maximize the information content of subsequent measurements. In a visual search task, for example, the next look may be specified to be a location which is expected to maximally discriminate between target and distractor. One principle that has been successfully employed in next-look processes is that of entropy minimization over viewpoints. In an object recognition or visual search task, this approach takes as the next viewpoint that which is maximally informative relative to the most probable hypotheses [10]. A common approach to the next-view problem in robotic systems is to employ an attention mechanism to provide the location of the next view. Based on models of mammalian vision systems, attention mechanisms determine salient regions in visual input, which compete or interact in winner-takes-all fashion to select a single location as the target for the subsequent motion [8].
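The entropy-minimization idea can be sketched in a few lines: given a posterior over target hypotheses and a per-viewpoint observation model, pick the viewpoint whose observations are expected to leave the lowest posterior entropy. Everything below (the hypothesis set, the observation model p_obs, and the candidate viewpoints) is a stand-in for whatever a particular system would supply; it is a schematic of the principle, not the algorithm of [10].

import math

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

def next_best_view(prior, viewpoints, p_obs, observations):
    """prior[h]: current probability of hypothesis h.
    p_obs(o, v, h): probability of observing o from viewpoint v if h is true.
    Returns the viewpoint minimizing the expected posterior entropy."""
    best_v, best_score = None, float("inf")
    for v in viewpoints:
        expected_h = 0.0
        for o in observations:
            # Predictive probability of o from v, and the posterior it would induce.
            joint = {h: prior[h] * p_obs(o, v, h) for h in prior}
            p_o = sum(joint.values())
            if p_o == 0:
                continue
            posterior = [p / p_o for p in joint.values()]
            expected_h += p_o * entropy(posterior)
        if expected_h < best_score:
            best_v, best_score = v, expected_h
    return best_v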
Application

In the late 1980s and early 1990s, commercial camera motion platforms lacked the performance needed by robotics researchers and manufacturers. This led many universities to construct their own platforms and develop control systems for them. These were generally binocular camera systems with pan and tilt degrees of freedom for each camera. Often, to simplify the design, a common tilt action was employed for both cameras, and the pan actions were sometimes linked together to provide vergence and/or version motions only. Examples include the UPenn head [11], generally recognized as the first of its kind, the Harvard head [12], the KTH head [13], the TRISH head from the University of Toronto [14], the Rochester head [15], the SAGEM-GEC-Inria-Oxford head [16], the Surrey head [17], the LIFIA head [18], the LIA/AUC head [19], and the Technion head [5]. These early robotic heads generally used PD servo loops, some with delay compensation mechanisms as described above, and were capable of speeds up to 180 degrees per second. The pan axis maximum rotational velocities were usually higher than the tilt and vergence speeds. The axes were most often driven either by DC motors or by stepper motors.

A more recent example of a research system is the head of the iCub humanoid robot [20]. Unlike the early robotic heads, which were one-off systems limited to use in a single laboratory, this robot was developed by a consortium of European institutions and is used in many different research laboratories. It has independent pan and common tilt for the two cameras, as well as three neck degrees of freedom. The maximum pan speed is 180 degrees per second, and the maximum tilt speed is 160 degrees per second.

Currently, most robotic active vision systems are based on commercially available monocular pan-tilt platforms. The great majority of commercial platforms are designed for surveillance applications and are relatively slow. There are a few systems with specifications that are suitable for robotic active vision. Perhaps the most commonly used of these fast platforms are made by FLIR Motion Control Systems, Inc. (formerly Directed Perception). These are capable of speeds up to 120 degrees per second and can handle loads of up to 90 lbs. Commercial systems generally lack torsional motion and hence are not suitable for precise stereo vision applications.

The fastest current commercial pan/tilt units, as well as the early research platforms, only reach maximum speeds of around 200 degrees per second. This is sufficient to match the speeds of human pursuit eye movements, which top out around 100 degrees per second. However, if these speeds are compared to the maximum speed of 800 degrees per second for human saccadic motions, it can be seen that the performance of robotic active vision motion platforms still has room for improvement.
References

1. Tarabanis K, Tsai RY, Allen PK (1991) Automated sensor planning for robotic vision tasks. In: Proceedings of the 1991 IEEE conference on robotics and automation, Sacramento, pp 76–82
2. Yi S, Haralick RM, Shapiro LG (1990) Automatic sensor and light source positioning for machine vision. In: Proceedings of the computer vision and pattern recognition conference (CVPR), Atlantic City, June 1990, pp 55–59
3. Kato R, Grantyn A, Dalezios Y, Moschovakis AK (2006) The local loop of the saccadic system closes downstream of the superior colliculus. Neuroscience 143(1):319–337
4. Robinson DA (1968) The oculomotor control system: a review. Proc IEEE 56(6):1032–1049
5. Rivlin E, Rotstein H (2000) Control of a camera for active vision: foveal vision, smooth tracking and saccade. Int J Comput Vis 39(2):81–96
6. Brown C (1990) Gaze controls with interactions and delays. IEEE Trans Syst Man Cybern 20(1):518–527
7. Sharkey PM, Murray DW (1996) Delays versus performance of visually guided systems. IEE Proc Control Theory Appl 143(5):436–447
8. Clark JJ, Ferrier NJ (1992) Attentive visual servoing. In: Blake A, Yuille AL (eds) An introduction to active vision. MIT, Cambridge, pp 137–154
9. Swain MJ, Stricker MA (1993) Promising directions in active vision. Int J Comput Vis 11(2):109–126
10. Arbel T, Ferrie FP (1999) Viewpoint selection by navigation through entropy maps. In: Proceedings of the seventh IEEE international conference on computer vision, Kerkyra, pp 248–254
11. Krotkov E, Bajcsy R (1988) Active vision for reliable ranging: cooperating, focus, stereo, and vergence. Int J Comput Vis 11(2):187–203
12. Ferrier NJ, Clark JJ (1993) The Harvard binocular head. Int J Pattern Recognit Artif Intell 7(1):9–31
13. Pahlavan K, Eklundh J-O (1993) Heads, eyes and head-eye systems. Int J Pattern Recognit Artif Intell 7(1):33–49
14. Milios E, Jenkin M, Tsotsos J (1993) Design and performance of TRISH, a binocular robot head with torsional eye movements. Int J Pattern Recognit Artif Intell 7(1):51–68
15. Coombs DJ, Brown CM (1993) Real-time binocular smooth pursuit. Int J Comput Vis 11(2):147–164
16. Murray DW, Du F, McLauchlan PF, Reid ID, Sharkey PM, Brady M (1992) Design of stereo heads. In: Blake A, Yuille A (eds) Active vision. MIT, Cambridge, pp 155–172
17. Pretlove JRG, Parker GA (1993) The Surrey attentive robot vision system. Int J Pattern Recognit Artif Intell 7(1):89–107
18. Crowley JL, Bobet P, Mesrabi M (1993) Layered control of a binocular camera head. Int J Pattern Recognit Artif Intell 7(1):109–122
19. Christensen HI (1993) A low-cost robot camera head. Int J Pattern Recognit Artif Intell 7(1):69–87
20. Beira R, Lopes M, Praga M, Santos-Victor J, Bernardino A, Metta G, Becchi F, Saltaren R (2006) Design of the robot-cub (iCub) head. In: Proceedings of the 2006 IEEE international conference on robotics and automation, Orlando, pp 94–100

Active Stereo Vision

Andrew Hogue1 and Michael R. M. Jenkin2
1 Faculty of Business and Information Technology, University of Ontario Institute of Technology, Oshawa, ON, Canada
2 Department of Computer Science and Engineering, York University, Toronto, ON, Canada

Related Concepts

Camera Calibration

Definition

Active stereo vision utilizes multiple cameras for 3D reconstruction, gaze control, measurement, tracking, and surveillance. Active stereo vision is to be contrasted with passive or dynamic stereo vision in that passive systems treat stereo imagery as a series of independent static images, while active and dynamic systems employ temporal constraints to integrate stereo measurements over time. Active systems utilize feedback from the image streams to manipulate camera parameters, illuminants, or robotic motion controllers in real time.

Background

Stereo vision uses two or more cameras with overlapping fields of view to estimate 3D scene structure from 2D projections. Binocular stereo vision – the most common implementation – uses exactly two cameras, yet one can utilize more than two at the expense of computational speed within the same algorithmic framework.

The "passive" stereo vision problem can be described as a system of at least two cameras attached rigidly to one another with constant intrinsic calibration parameters (assumed), and the stereo pairs are considered to be temporally independent. Thus no assumptions are made, nor propagated, about camera motion within the algorithmic framework. Passive vision systems are limited to the extraction of metric
information from a single set of images taken from different locations in space (or at different times) and treat individual frames in stereo video sequences independently. Dynamic stereo vision systems are characterized by the extraction of metric information from sequences of imagery (i.e., video) and employ temporal constraints or consistency on the sequence (e.g., optical flow constraints). Thus, dynamic stereo systems place assumptions on the camera motion, such as its smoothness (and small motion) between subsequent frames. Active stereo vision systems subsume both passive and dynamic stereo vision systems and are characterized by the use of robotic camera systems (e.g., stereo heads) or specially designed illuminant systems (e.g., structured light) coupled with a feedback system (see Fig. 1) for motor control. Although systems can be designed with more modest goals – object tracking, for example – the common computational goal is the construction of large-scale 3D models of extended environments.

Theory

Fundamentally, active stereo systems (see [1]) must solve three rather complex problems: (1) spatial correspondence, (2) temporal correspondence, and (3) motor/camera/illuminant control. Spatial correspondence is required in order to infer 3D depth information from the information available in camera images captured at one time instant, while temporal correspondence is necessary to integrate visual information over time. The spatial and temporal correspondences can either be treated as problems in isolation or integrated within a common framework. For example, stereo correspondence estimation can be seeded using an ongoing 3D representation using temporal coherence (e.g., [2, 3]) or considered in isolation using standard disparity estimation algorithms (see [4]).
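For a rectified binocular pair, the depth recovered from a spatial correspondence can be stated compactly; the relation below is the standard one and is included only to make the role of disparity estimation concrete (the entry itself does not spell it out):

Z = \frac{f \, B}{d},

where Z is the depth of the scene point, f the focal length, B the baseline between the two cameras, and d = x_l - x_r the horizontal disparity between the point's projections in the left and right images.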
Motor or camera control systems are necessary to move (rotate and translate) the cameras so they look in the appropriate direction (i.e., within a tracking or surveillance application), to change their intrinsic camera parameters (e.g., focal length or zoom), or to tune the image processing algorithm to achieve higher accuracy for a specific purpose. Solving these three problems in an active stereo system enables one to develop "intelligent" algorithms that infer ego-motion [5], autonomously control vehicles throughout the world [6], and/or reconstruct 3D models of the environment [7, 8]. Examples of the output of such a system are shown in Fig. 2, and [9] provides an example of an active system that interleaves the vergence and focus control of the cameras with surface estimation. The system uses an adaptive self-calibration method that integrates the estimation of camera parameters with surface estimation, using prior knowledge of the calibration, the motor control, and the previous estimate of the 3D surface properties. The resulting system is able to automatically fixate on salient visual targets in order to extend the surface estimation volume.

Although vision is a powerful sensing modality, it can fail. This is a critical issue for active stereo vision, where data is integrated over time. The use of complementary sensors – traditionally Inertial Measurement Units (see [10]) – augments the camera hardware system with the capability to estimate the system dynamics using real-world constraints. Accelerometers, gyroscopes, and compasses can provide timely and accurate information either to assist in temporal correspondences and ego-motion estimation or as a replacement when visual information is unreliable or absent (i.e., dead reckoning).

Relation to Robotics and Mapping
A wide range of different active and dynamic stereo systems have been built (e.g., [7, 8, 11, 12]). Active systems are often built on top of mobile systems (e.g., [7]), blurring the distinction between active and dynamic systems. In robotics, active stereo vision has been used for vehicle control in order to create 2D or 3D maps of the environment. Commonly the vision system is complemented by other sensors. For instance, in [13] active stereo vision is combined with sonar sensors to create 2D and 3D models of the environment. Murray and Little [14] use a trinocular stereo system to create occupancy maps of the environment for in-the-loop path planning and robot navigation. Diebel et al. [15] employ active stereo vision for simultaneous estimation of the robot location and 3D map construction, and [7] describes a vision system used for in-the-loop mapping and navigational control for an aquatic robot. Davison, in [6], was one of the first to effectively demonstrate the use of active stereo vision technology as part of the navigation loop. The system used a stereo head to selectively fixate scene features that improve the quality of the estimated map and trajectory.
[Active Stereo Vision, Fig. 1 (panels a–d): uncontrolled movement, object in scene (static); object in scene (dynamic); visual feedback and motor control (pan, tilt, cyclotorsion); controlled movement.]

Active Stereo Vision, Fig. 2 Point cloud datasets obtained by the active stereo system described in [7]
This involved using knowledge of the current map of the environment to point the camera system in the direction where it should find salient features that it had seen before, moving the robot to a location where the features are visible, and then searching visually to find image locations corresponding to these features.

Active Stereo Heads
The development of hardware platforms that mimic human biology has resulted in a variety of different designs and methods for controlling binocular sets of cameras. These result in what are known as "stereo heads" (see the entry on the evolution of stereo heads in this volume). These hardware platforms all have a common set of constraints, i.e., the systems consist of two cameras (binocular) with camera intrinsics/extrinsics that may be controlled. In [16], an active stereo vision system is developed that mimics human biology and uses a bottom-up saliency map model coupled with a selective attention function to select salient image regions. If both the left and right cameras estimate similar amounts of saliency in the same areas, the vergence of the cameras is controlled so that the cameras are focused on this particular landmark.

Autocalibration
A fundamental issue with active stereo vision is the need to establish and maintain calibration parameters online. Intrinsics and extrinsics are necessary to the 3D estimation process, as they define the epipolar constraints which enable efficient disparity estimation algorithms [17, 18]. Each time the camera parameters are modified (e.g., vergence of the cameras, change of focus), the epipolar geometry must be re-estimated. Although kinematic modeling of the motor systems provides good initial estimates of changes in camera pose, this is generally insufficiently accurate to be used by itself to update the camera calibration. Thus, autocalibration becomes an important task within active stereo vision. Approaches to autocalibration are outlined in [17, 19]. In [17], the autocalibration algorithm operates on pairs of stereo images taken in sequence. A projective reconstruction of the motion and structure of the scene is constructed. This is performed for each pair of stereo images individually for the same set of features (thus they must be matched in the stereo pairs as well as tracked temporally). The projective solutions can be upgraded to an affine solution (ambiguous up to a rigid rotation/translation/scale) by noting that these features should match in 3D space as well as in 2D space. A transformation can be linearly estimated that constrains the projective solution to an affine reconstruction. Once the plane at infinity is known, the affine solution may be upgraded to a metric solution. In order to achieve the desired accuracy in the intrinsics, a nonlinear minimization scheme is employed to improve the solution. If one trusts the accuracy of the camera motion control system, the extrinsics can be seeded with this information in a nonlinear optimization scheme that minimizes the reprojection error of the image matching points and their 3D triangulated counterparts. This nonlinear optimization is known as bundle adjustment [20] and is used in a variety of forms in the structure-from-motion literature (see [17, 18]).
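The reprojection error minimized by such a bundle adjustment can be written, in its usual generic form (shown here for concreteness; the notation is not taken from [20]), as

\min_{\{K_i, R_i, t_i\},\, \{X_j\}} \; \sum_i \sum_j \left\| x_{ij} - \pi\big( K_i [R_i \mid t_i] X_j \big) \right\|^2,

where x_{ij} is the measured image location of the triangulated 3D point X_j in view i, \pi(\cdot) denotes perspective division, and the camera intrinsics K_i and extrinsics R_i, t_i (seeded, as noted above, from the motion control system) are refined jointly with the points X_j.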
Relation to Other Types of Stereo Systems
Since active stereo systems are characterized by the use of visual feedback to inform motor control systems (or higher-level vehicular navigational systems), they are related to a wide range of research areas and hardware systems. Mounting a stereo system on a robotic vehicle is common in the robotics literature to inform the navigation system about the presence of obstacles [21] and to provide input to mapping algorithms [22]. Such active systems are directly applicable to autonomous systems, as they provide a high amount of controllable accuracy and dense measurements at relatively low computational cost. One significant example is the use of active stereo in the Mars Rover autonomous vehicles [12].

Estimating 3D information from stereo views is problematic due to the lack of (or ambiguous) texture in man-made environments. This can be alleviated with the use of active illumination [23]. Projecting a known pattern, rather than uniform lighting, into the scene enables the estimation of a denser disparity field using standard stereo disparity estimation algorithms, due to the added texture in textureless regions (see [24]). The illumination may be controlled actively depending on the perceived scene texture, the desired range, or the ambient light intensity of the environment. The illumination may be within the visible light spectrum or in the infrared spectrum, as most camera sensors are sensitive to IR light. This has the added advantage that humans in the environment are not affected by the additional illumination.
Activity Recognition

Background

The classic study on the visual analysis of biological motion using moving light displays (MLD) [1] has inspired tremendous interest among computer vision researchers in the problem of recognizing human motion through visual information. The commonly used devices to capture human movement include human motion capture (MOCAP) with or without markers, multiple video camera systems, and single video camera systems. A MOCAP device […]

Theory

Let O = {o_1, o_2, …, o_n} be a sequence of observations of the movement of a person over a period of time. The observations can be a sequence of joint angles, a sequence of color images or silhouettes, a sequence of depth maps, or a combination of them. The task of activity recognition is to label O with one of the L classes C = {c_1, c_2, …, c_L}. Therefore, solutions to the problem of activity recognition are often based on machine learning and pattern recognition approaches, and an activity recognition system usually involves
extracting features from the observation sequence O, learning a classifier from training samples, and classifying O using the trained classifier. However, the spatial and temporal complexity of human activities has led researchers to cast the problem from different perspectives. Specifically, the existing techniques for activity recognition can be divided into two categories based on whether the dynamics of the activities is modeled implicitly or explicitly.

In the first category [2–9], the problem of activity recognition is cast from a temporal classification problem into a static classification one by representing activities using descriptors. A descriptor is extracted from the observation sequence O; it is intended to capture both the spatial and the temporal information of the activity and, hence, to model the dynamics of the activity implicitly. Activity recognition is achieved by a conventional classifier such as Support Vector Machines (SVM) or K-nearest neighbors (KNN). There are three commonly used approaches to extracting activity descriptors.

The first approach builds motion energy images (MEI) and motion history images (MHI), proposed by Bobick and Davis [2], by stacking a sequence of silhouettes to capture where and how the motion is performed. Activity descriptors are extracted from the MEI and MHI. For instance, seven Hu moments were extracted in [2] to serve as action descriptors, and recognition was based on the Mahalanobis distance between the moment descriptors of the trained activities and those of the input activity.
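A motion history image of the kind described above can be computed with a simple recursive update. The sketch below assumes binary silhouette frames are already available and uses an arbitrary temporal window tau, so it illustrates the idea rather than reproducing the exact formulation of [2].

import numpy as np

def motion_history_image(silhouettes, tau=30):
    """silhouettes: non-empty iterable of 2D binary arrays (1 = foreground), equal size.
    Returns an MHI in which recently moving pixels are bright and older motion
    decays linearly toward zero over tau frames."""
    mhi = None
    for frame in silhouettes:
        frame = frame.astype(bool)
        if mhi is None:
            mhi = np.zeros(frame.shape, dtype=np.float32)
        mhi[frame] = tau                  # pixel moved now: reset to the maximum value
        mhi[~frame] -= 1                  # elsewhere: let the history decay
        np.clip(mhi, 0, tau, out=mhi)
    return mhi / tau                      # normalize to [0, 1]

# The corresponding motion energy image is simply the support of the MHI:
# mei = (mhi > 0)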
The second approach considers a sequence of silhouettes as a spatiotemporal volume, and an activity descriptor is computed from the volume. Typical examples are the work by Yilmaz and Shah [3], which computes differential geometric surface properties (i.e., Gaussian curvature and mean curvature); the work by Gorelick et al. [4], which extracts space-time saliency, action dynamics, and shape structure and orientation; and the work by Mokhber et al. [5], which calculates the 3D moments of the volume.

The third approach describes an activity using a set of spatiotemporal interest points (STIPs). The general concept is first to detect STIPs from the observations O, which is usually a video sequence. Features are then extracted from a local volume around each STIP, and a descriptor can be formed either by simply aggregating the local features together to become a bag of features, or by quantizing the STIPs against a vocabulary (i.e., a bag of visual words) and calculating the histogram of the occurrences of the vocabulary words within the observation sequence O. There are two commonly used STIP extraction techniques. One extends Harris corner detection and automatic scale selection from 2D space to 3D space and time [6], and the other is based on a pair of one-dimensional (1D) Gabor filters applied temporally and spatially [7]. Recently, another STIP detector has been proposed that decomposes an image sequence into spatial components and motion components using nonnegative matrix factorization and detects STIPs in the 2D spatial and 1D motion space using difference-of-Gaussian (DoG) detectors [8]. In terms of the classifier for STIP-based descriptors, besides SVM and KNN, latent topic models such as the probabilistic latent semantic analysis (pLSA) model and latent Dirichlet allocation (LDA) were used in [9]. STIP-based descriptors have a few practical advantages, including being applicable to image sequences in realistic conditions, not requiring foreground/background separation or human tracking, and having the potential to deal with partial occlusions [10]. In many realistic applications, an activity may occupy only a small portion of the entire space-time volume of a video sequence. In such situations, it does not make sense to classify the entire video. Instead, one needs to locate the activity in space and time. This is commonly known as the activity detection or action detection problem. Techniques have been developed for activity detection using interest points [11].
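The bag-of-visual-words pipeline sketched above can be summarized in a few lines of code. The snippet assumes STIP descriptors have already been extracted per video and uses scikit-learn's KMeans and SVC purely as illustrative stand-ins for the vocabulary learning and classification steps; it is a schematic of the approach, not the setup used in the cited papers.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_vocabulary(train_descriptors, k=200):
    """train_descriptors: list of (n_i, d) arrays of STIP descriptors, one per video."""
    stacked = np.vstack(train_descriptors)
    return KMeans(n_clusters=k, n_init=10).fit(stacked)

def bow_histogram(descriptors, vocabulary):
    words = vocabulary.predict(descriptors)                    # nearest visual word per STIP
    hist = np.bincount(words, minlength=vocabulary.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)                         # normalized occurrence histogram

def train_activity_classifier(train_descriptors, labels, k=200):
    vocab = build_vocabulary(train_descriptors, k)
    X = np.array([bow_histogram(d, vocab) for d in train_descriptors])
    return vocab, SVC(kernel="rbf").fit(X, labels)

def classify(descriptors, vocab, clf):
    return clf.predict(bow_histogram(descriptors, vocab).reshape(1, -1))[0]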
In the second category [12–17], the proposed methods usually follow the concept that an activity is a temporal evolution of the spatial configuration of the body parts and, hence, place more emphasis on the dynamics of the activities than the methods in the first category. They usually extract a sequence of feature vectors, each feature vector being extracted from a frame, or a small neighborhood, of the observation sequence O. The two commonly used approaches are temporal templates and graphical models.

The temporal-template-based approach typically represents the dynamics directly through exemplar sequences and adopts dynamic time warping (DTW) to compare an input sequence with the exemplar sequences. For instance, Wang and Suter [18] employed locality preserving projection (LPP) to project a sequence of silhouettes into a low-dimensional space to characterize the spatiotemporal property of the activity and used DTW and the temporal Hausdorff distance for similarity matching.
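Dynamic time warping, used above to compare an input sequence against stored exemplars, can be written as a small dynamic program. This is the textbook formulation with a Euclidean local cost, shown only to make the matching step concrete; it is not code from [18].

import numpy as np

def dtw_distance(seq_a, seq_b):
    """seq_a: (n, d) array, seq_b: (m, d) array of per-frame feature vectors.
    Returns the cumulative cost of the optimal monotonic alignment."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])   # local frame distance
            D[i, j] = cost + min(D[i - 1, j],        # insertion
                                 D[i, j - 1],        # deletion
                                 D[i - 1, j - 1])    # match
    return D[n, m]

def nearest_exemplar(query, exemplars):
    """exemplars: dict mapping class label -> list of exemplar sequences."""
    return min(((dtw_distance(query, e), label)
                for label, seqs in exemplars.items() for e in seqs))[1]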
In the graphical-model-based approach, both generative and discriminative models have been extensively studied for activity recognition. The most prominent generative model is the hidden Markov model (HMM), where sequences of observed features are grouped into similar configurations, i.e., states, and both the probability distribution of the observations at each state and the temporal transition functions between these states are learned from training samples. The first work on action recognition using HMMs is probably by Yamato et al. [12], where a discrete HMM is used to represent sequences over a set of vector-quantized silhouette features of tennis footage. The HMM is a powerful tool for modeling a small number of short-term activities, since a practical HMM is usually a fixed- and low-order Markov chain. Notable early extensions to overcome this drawback of the HMM are the variable-length Markov models (VLMM) and the layered HMM; for details, the reader is referred to [13, 14], respectively. Recently, a more general generative graphical model, referred to as an action graph, was established in [15], where the nodes of the action graph represent salient postures that are used to characterize activities and are shared by different activities, and the weight between two nodes measures the transitional probability between the two postures represented by the nodes. An activity is encoded by one or multiple paths in the action graph. Due to the sharing mechanism, the action graph can be trained, and also easily expanded to new actions, with a small number of training samples. In addition, the action graph does not need special nodes representing the beginning and ending postures of the activities and, hence, allows continuous recognition.

The generative graphical models often rely on an assumption of statistical independence of the observations to compute the joint probability of the states and the observations. This makes it hard to model the long-term contextual dependencies that are important to the recognition of activities over a long period of time. The discriminative models, such as conditional random fields (CRF), offer an effective way to model long-term dependency and compute the conditional probability that maps the observations to the motion class labels. The linear-chain CRF was employed in [16] to recognize ten different human activities using features of combined shape context and pairwise edge features extracted at a variety of scales on the silhouettes and 3D joint angles. The results have shown that CRFs outperform the HMM and are also robust against the variability of the test sequences with respect to the training samples. More recently, Wang and Mori [17] modeled a human action by a flexible constellation of parts conditioned on image observations using hidden conditional random fields (HCRF) and achieved highly accurate frame-based action recognition.

Despite the extensive effort and progress in activity recognition research in the past decade, continuous recognition of activities under realistic conditions, such as with viewpoint invariance and large numbers of activities, remains challenging.

Application

Activity recognition has many potential applications. It is one of the key enabling technologies in security and surveillance for the automatic monitoring of human activities in a public space and of the activities of daily living of elderly people at home. Robust understanding and interpretation of human activities also allows a natural way for humans to interact with machines. A proper modeling of the spatial configuration and dynamics of human motion would enable realistic synthesis of human motion for the gaming and movie industry and help train humanoid robots in a flexible and economic way. In sports, activity recognition technology has also been used in training and in the retrieval of video sequences.

References

1. Johansson G (1973) Visual perception of biological motion and a model for its analysis. Percept Psychophys 14(2):201–211
2. Bobick A, Davis J (2001) The recognition of human movement using temporal templates. IEEE Trans Pattern Anal Mach Intell 23(3):257–267
3. Yilmaz A, Shah M (2008) A differential geometric approach to representing the human actions. Comput Vision Image Underst 109(3):335–351
4. Gorelick L, Blank M, Shechtman E, Irani M, Basri R (2007) Actions as space-time shapes. IEEE Trans Pattern Anal Mach Intell 29(12):2247–2253
5. Mokhber A, Achard C, Milgram M (2008) Recognition of human behavior by space-time silhouette characterization. Pattern Recogn 29(1):81–89
tures of combined shape-context and pair-wise edge Pattern Recogn 29(1):81–89
Another Random Document on
Scribd Without Any Related Topics
BOBBINS.
EUROPE.
We have already seen how an increase in the number of
correspondences between objects from distant countries increases the
weight of their evidence in favor of contact or communication between
the peoples. If it should be found upon comparison that the bobbins on
which thread is to be wound, as well as the spindle-whorls with which it
is made, had been in use during prehistoric times in the two
hemispheres, it would add to the evidence of contact or communication.
The U. S. National Museum possesses a series of these bobbins, as they
are believed to have been, running from large to small, comprising about
one dozen specimens from Italy, one from Corneto and the others from
Bologna, in which places many prehistoric spindle whorls have been
found (figs. 367 and 368). These are of the type Villanova. The end as
well as the side view is represented. The former is one of the largest,
the latter of middle size, with others smaller forming a graduating series.
The latter is engraved on the end by dotted incisions in three parallel
lines arranged in the form of a Greek cross. A similar bobbin from
Bologna bears the sign of the Swastika on its end (fig. 193).[314] It was
found by Count Gozzadini and forms part of his collection in Bologna.
Fig. 367. BOBBIN OR SPOOL FOR WINDING THREAD (?). Type Villanova. Corneto, Italy. U. S. National Museum.
Fig. 368. TERRA-COTTA BOBBIN OR SPOOL FOR WINDING THREAD (?). Type Villanova. Bologna, Italy. Cat. No. 101771, U. S. N. M.
UNITED STATES.
The three following figures represent clay and stone bobbins, all from
the State of Kentucky. Fig. 369 shows a bobbin elaborately decorated,
from a mound near Maysville, Ky. It has a hole drilled longitudinally
through the center. The end shows a cross of the Greek form with this
hole in the center of the cross. Fig. 370 shows a similar object from
Lexington, Ky., sent by the Kentucky University. It is of fine-grained
sandstone, is drilled longitudinally through the center and decorated as
shown. The end view shows a series of concentric circles with rows of
dots in the intervals. Fig. 371 shows a similar object of fine-grained
sandstone from Lewis County, Ky. It is also drilled longitudinally, and is
decorated with rows of zigzag lines as shown. The end view represents
four consecutive pentagons laid one on top of the other, which increase
in size as they go outward, the hole through the bobbin being in the
center of these pentagons, while the outside line is decorated with
spikes or rays extending to the periphery of the bobbin, all of which is
said to represent the sun. The specimen shown in fig. 372, of fine-
grained sandstone, is from Maysville, Ky. The two ends are here
represented because of the peculiarity of the decoration. In the center is
the hole, next to it is a rude form of Greek cross which on one end is
repeated as it goes farther from the center; on the other, the decoration
consists of three concentric circles, one interval of which is divided by
radiating lines at regular intervals, each forming a rectangle. Between
the outer lines and the periphery are four radiating rays which, if
completed all around, might form a sun symbol. Bobbins of clay have
been lately discovered in Florida by Mr. Clarence B. Moore and noted by
Professor Holmes.
Fig. 369. BOBBIN (?) FROM A MOUND NEAR MAYSVILLE, KENTUCKY. Cat. No. 16748, U. S. N. M.
Fig. 370. BOBBIN (?) FROM LEXINGTON, KENTUCKY. Cat. No. 16691, U. S. N. M.
Fig. 371. BOBBIN (?) OF FINE-GRAINED SANDSTONE. Lewis County, Kentucky. Cat. No. 59681, U. S. N. M.
Thus we find some of the same objects which in Europe were made and
used by prehistoric man and which bore the Swastika mark have
migrated to America, also in prehistoric times, where they were put to
the same use and served the same purpose. This is certainly no
inconsiderable testimony in favor of the migration of the sign.
VIII.—Similar Prehistoric Arts, Industries, and Implements in Europe and America as Evidence of the Migration of Culture.
The prehistoric objects described in the foregoing chapter are not the
only ones common to both Europe and America. Related to the spindle-
whorls and bobbins is the art of weaving, and it is perfectly susceptible
of demonstration that this art was practiced in the two hemispheres in
prehistoric times. Woven fabrics have been found in the Swiss lake
dwellings, in Scandinavia, and in nearly all parts of Europe. They
belonged to the Neolithic and Bronze ages.
Fig. 372. VIEW SHOWING BOTH ENDS OF A BOBBIN (?) OF FINE-GRAINED SANDSTONE. Maysville, Kentucky. Cat. No. 16747, U. S. N. M.
Figs. 373 and 374 illustrate textile fabrics in the Bronze Age. Both
specimens are from Denmark, and the National Museum possesses
another specimen (Cat. No. 136615) in all respects similar. While
prehistoric looms may not have been found in Europe to be compared
with the looms of modern savages in America, yet these specimens of
cloth, with the hundreds of others found in the Swiss lake dwellings,
afford the most indubitable proof of the use of the looms in both
countries during prehistoric times.
Complementary to this, textile fabrics have been found in America, from
the Pueblo country of Utah and Colorado, south through Mexico, Central
and South America, and of necessity the looms with which they were
made were there also. It is not meant to be said that the looms of the
two hemispheres have been found, or that they or the textile fabrics are
identical. The prehistoric looms have not been found in Europe, and
those in America may have been affected by contact with the white
man. Nor is it meant to be said that the textile fabrics of the two
hemispheres are alike in thread, stitch, or pattern. But these at best are
only details. The great fact remains that the prehistoric man of the two
hemispheres had the knowledge to spin fiber into thread, to wind it on
bobbins, and to weave it into fabrics; and whatever differences there
may have been in pattern, thread, or cloth, they were finally and
substantially the same art, and so are likely to have been the product of
the same invention.
While it is not the intention to continue this examination among the
prehistoric objects of the two hemispheres in order to show their
similarity and thus prove migration, contact, or communication, yet it
may be well to mention some of them, leaving the argument or proof to
a future occasion.
The polished stone hatchets of the two hemispheres are substantially
the same. There are differences of material, of course, for in each
country the workman was obliged to use such material as was
obtainable. There are differences in form between the polished stone
hatchets of the two hemispheres, but so there are differences between
different localities in the same hemisphere. Some hatchets are long,
others short, some round, others flat, some have a pointed end, others a
square or nearly square or unfinished end; some are large, others small.
But all these differences are to be found equally well pronounced within
each hemisphere.
Scrapers have also been found in both
hemispheres and in all ages. There are the
same differences in material, form, and
appearance as in the polished stone hatchet.
There is one difference to be mentioned of
this utensil—i. e., in America the scraper has
been sometimes made with a stem and with
notches near the base, after the manner of
arrow- and spear-heads, evidently intended
to aid, as in the arrow- and spear-head, in
fastening the tool in its handle. This
peculiarity is not found in Europe, or, if
found, is extremely rare. It is considered
that this may have been caused by the use
of a broken arrow- or spear-head, which
seems not to have been done in Europe. But
this is still only a difference in detail, a
difference slight and insignificant, one which
occurs seldom and apparently growing out
of peculiar and fortuitous conditions.