2.2. Functions and Performance Required in Robot Vision

The performance and functions that are important in practice are discussed below in terms of precision, speed, environmental robustness, and ease of use.

(1) Precision
High assembly precision is required in FA. For example, precision in the sub-millimeter to micrometer order can be required in the assembly of electric or electronic parts. The factors that determine precision in robot vision are the resolution and error characteristics. These depend on both hardware, e.g., the camera or projector, and software, e.g., the measurement and image-processing algorithms. Moreover, because the visual field and working distance have a tradeoff relation with precision, careful system design is necessary. There are cases where this is resolved by the parallel use of multiple sensors.

(2) Speed
Because high speed and real-time operability are required in a robot that replaces human workers, high speed is important in robot vision as well. The algorithm's processing time can become excessively lengthy if the pose estimation involves a complex process. Thus, it is important to produce an algorithm appropriate for its objective. Moreover, achieving constant task time and tact time is important in FA, which involves repetitive tasks.

(3) Robustness Against Environmental Disturbance
It is important for the robot to be able to perform in a stable manner regardless of the environment. Robot vision is affected by environmental light and temperature. Because it is particularly affected by environmental light, algorithm design is important. Furthermore, the effects of noise, vibrations, dust, water droplets, and other factors must also be controlled depending on the application environment. This is frequently addressed by hardware design.

(4) Ease of Use
It is also important to consider the operations of non-expert workers when designing a robot vision system. For example, the adjustment of parameters in image processing is frequently difficult for non-experts to understand. Thus, an important issue in practice is to reduce the number of parameters, make parameter adjustments easy to perform, and automate the adjustment process. Fig. 5 illustrates a human operator teaching movements to an
eliminated by analyzing the occurrence probability of the vector pairs in the model and by selecting those that are unique.

Fig. 9. Local reference frame (LRF).

3.3. Major Local Reference Frames
A local reference frame (LRF) refers to the coordinate system set up at each keypoint, as indicated in Fig. 9. The LRF's most important role is in the definition of features. The various features described in the previous section are expressed numerically based on the LRF, such that LRF stability is directly related to the stability of the features. Because the LRF expresses the geometric relation of two matched keypoints, it can also be used to estimate the object's pose. The LRF is typically a rectangular 3D coordinate system. The first axis (z-axis) is in many cases the normal vector of the local surface surrounding the keypoint under consideration; this can be determined in a relatively stable manner. The second axis (x-axis) is a vector normal to this. The third axis (y-axis) is computed as the vector product of the first and second axes. Thus, the setting of the second axis (x-axis) is the most important practical issue when setting an LRF.

The most basic LRF is obtained by Mian's method [20], where the covariance matrix is computed from the 3D coordinates of the points surrounding the keypoint, and the eigenvectors are used to build the LRF. In this method, the eigenvectors obtained from the covariance matrix of the coordinate data of points within a spherical region with radius r are directly used as the LRF. For example, for a keypoint sampled from among the points lying in a quasi-planar region, the third principal eigenvector is equivalent to the normal vector. Because the eigenvectors form an orthogonal basis, it is natural to employ them as the LRF. Tombari et al. [16] improved this method by computing the eigenvectors from a weighted covariance matrix, using weights that decrease with the distance from the keypoint, and thus drastically improved the reproducibility. In rotational projection statistics (RoPS-LRF) [21], the LRF is defined to absorb the effect of the density differences between the matched point groups.
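To make the covariance-based construction above concrete, the following is a minimal sketch in the spirit of Mian's method [20] and the distance-weighted variant of Tombari et al. [16]. The support radius, the weighting function, and the sign-disambiguation step are illustrative assumptions, not the exact formulations of those papers.

```python
import numpy as np

def covariance_lrf(points, keypoint, radius, weighted=True):
    """Build an LRF at `keypoint` from the eigenvectors of the (optionally
    distance-weighted) covariance matrix of its spherical neighborhood.
    Degenerate neighborhoods (too few points) are ignored for brevity."""
    diffs = points - keypoint
    dists = np.linalg.norm(diffs, axis=1)
    nbrs = diffs[dists < radius]                   # neighbors within the support radius
    d = dists[dists < radius]
    w = (radius - d) if weighted else np.ones_like(d)  # weights decreasing with distance
    cov = (nbrs * w[:, None]).T @ nbrs / w.sum()   # 3x3 (weighted) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    z = eigvecs[:, 0]                              # smallest eigenvalue ~ surface normal
    x = eigvecs[:, 2]                              # largest eigenvalue ~ dominant tangent
    # Disambiguate the signs so that the axes point toward the majority of neighbors
    # (a common practical step, assumed here).
    if np.sum(nbrs @ z) < 0:
        z = -z
    if np.sum(nbrs @ x) < 0:
        x = -x
    y = np.cross(z, x)                             # third axis as the vector product
    return np.stack([x, y, z], axis=1)             # columns: x-, y-, z-axes
```

Given a scene stored as an (N, 3) array `scene`, `covariance_lrf(scene, scene[k], radius=0.02)` would return a 3x3 matrix whose columns are the x-, y-, and z-axes of the frame at the k-th point.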
3.4. Examples of Recent Studies
An example of a study on 3D features is the Shell Histograms and Occupancy from Radial Transform (SHORT) [22] proposed by Takei et al. The use of a support sphere surrounding the keypoint is the same as in other approaches; however, the interior is removed to produce a shell structure, and the 3D points in the shell are counted to compute their space occupancy ratio (Fig. 10(a)).

Because this index has the tendency to be high on planar surfaces and low at shape discontinuities, it can be used to detect keypoints. Furthermore, by setting multiple support spheres (shell regions) with different radii, as in Fig. 10(b), the directions (from the keypoint) of the points in each shell layer can be accumulated as histograms to produce features. Thus, SHORT produces features that indicate shape characteristics without the explicit computation of normal vectors, in both keypoint detection and the computation of its features.
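As a rough illustration of this shell-based representation, the sketch below computes, for one keypoint, a per-shell point-occupancy ratio and a per-shell histogram of the directions from the keypoint to the contained points. The shell radii, the normalizations, and the use of angles relative to the mean neighbor direction are assumptions made for illustration and are not taken from the SHORT paper [22].

```python
import numpy as np

def shell_occupancy_features(points, keypoint, radii, n_bins=8):
    """Compute a SHORT-like description at `keypoint`: one occupancy ratio and
    one direction histogram per concentric shell (radii given in ascending order)."""
    diffs = points - keypoint
    dists = np.linalg.norm(diffs, axis=1)
    mask = dists < radii[-1]                          # keep points in the outermost sphere
    diffs, dists = diffs[mask], dists[mask]
    dirs = diffs / np.maximum(dists[:, None], 1e-9)   # unit directions from the keypoint
    ref = dirs.mean(axis=0)
    ref = ref / (np.linalg.norm(ref) + 1e-9)          # assumed reference direction
    occupancy, histograms = [], []
    inner = 0.0
    for outer in radii:
        in_shell = (dists >= inner) & (dists < outer)
        # Occupancy: fraction of support points falling in this shell (simplified).
        occupancy.append(in_shell.sum() / max(len(dists), 1))
        # Direction histogram: angles between point directions and the reference.
        angles = np.arccos(np.clip(dirs[in_shell] @ ref, -1.0, 1.0))
        hist, _ = np.histogram(angles, bins=n_bins, range=(0.0, np.pi))
        histograms.append(hist / max(in_shell.sum(), 1))
        inner = outer
    return np.array(occupancy), np.concatenate(histograms)
```

In this sketch, the occupancy values can serve as the keypoint-ness measure described above (low at shape discontinuities), while the concatenated direction histograms play the role of the feature vector; neither requires explicit normal estimation.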
Akizuki et al. proposed a stable LRF called the dominant projected normal (DPN)-LRF [23]. The concept is displayed in Fig. 11. To address the density differences between the point groups to be matched, the axis directions are computed by considering the object's original shape present among the measured points, similar to RoPS-LRF [21]. Furthermore, to improve robustness against partial occlusions, the dominant direction is computed by analyzing the distribution of normal directions.

4. Latest Development Examples in Robot Vision

Practical examples of robot vision are introduced below.
in the size of the part that holds the liquid or that of the hand-grasped part. In this study, therefore, the distribution of the occupation ratios of the parts that have been assigned affordance labels, described previously, is defined as the "affordance feature" and considered a common feature for that class. The correlation of the affordance feature is high between objects in the same class but low between objects in different classes. Examples of affordance features for the classes "cup" and "spoon" are displayed in Fig. 17.

The flow of object recognition using affordance features is illustrated in Fig. 18. In this method, a 3D scene reconstructed using an approach such as Kinect Fusion is used as the input. First, each object in the input scene is segmented and an affordance label is assigned to each point. Label classification is performed using Random Forests. Next, the distribution of the occupation ratios of the affordance labels in each segment is computed as its affordance feature, which is then compared to the affordance features (dictionary data) for various classes prepared in advance to estimate the class name of the object in the scene.
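As a minimal sketch of this matching stage, the code below computes an affordance feature as the occupancy ratio of each affordance label within a segment and compares it against per-class dictionary features by correlation. The label set, the dictionary values, and the use of plain correlation as the similarity measure are illustrative assumptions, not the exact formulation of the study.

```python
import numpy as np

def affordance_feature(point_labels, n_labels):
    # Occupancy ratio of each affordance label within one segment.
    counts = np.bincount(point_labels, minlength=n_labels)
    return counts / max(counts.sum(), 1)

def classify_segment(point_labels, dictionary, n_labels):
    # Compare the segment's affordance feature with per-class dictionary
    # features and return the class whose feature correlates best.
    feat = affordance_feature(point_labels, n_labels)
    best_class, best_corr = None, -np.inf
    for class_name, ref_feat in dictionary.items():
        corr = np.corrcoef(feat, ref_feat)[0, 1]
        if corr > best_corr:
            best_class, best_corr = class_name, corr
    return best_class

# Hypothetical usage: point_labels comes from a per-point classifier such as a
# Random Forest, with label indices covering affordances such as "contain" and
# "grasp"; the dictionary values below are made up for illustration.
# dictionary = {"cup": np.array([0.5, 0.4, 0.1, 0.0]),
#               "spoon": np.array([0.0, 0.6, 0.1, 0.3])}
# print(classify_segment(point_labels, dictionary, n_labels=4))
```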
The extraction results of the affordance labels for the four classes cup, hammer, spoon, and spatula are displayed in Fig. 19. Based on the input data, it can be observed that the "contain" and "grasp" labels have been extracted at the proper respective positions for the case of "cup," and similarly for the cases of "hammer," "spoon," and "spatula."

Figure 20 displays an example of recognition results of three types of common objects (commodities) using the proposed method.

Fig. 20. Result of object recognition using the proposed method.

5. Latest Development Example of Robot System with Vision

5.1. Robot Cell for Assembly Including Flexible Objects
The manipulation of flexible objects is an important issue in robots. In this section, we introduce a robot
[10] K. Ikeuchi and S. B. Kang, "Assembly Plan from Observation," AAAI Technical Report FS-93-04, pp. 115-119, 1993.
[11] H. Murase and S. K. Nayar, "3D Object Recognition from Appearance – Parametric Eigenspace Method –," IEICE Trans. Inf. & Syst., Vol.J77-D-II, No.11, pp. 2179-2187, 1994 (in Japanese).
[12] S. Ando, Y. Kusachi, A. Suzuki, and K. Arakawa, "Pose Estimation of 3D Object Using Support Vector Regression," IEICE Trans. Inf. & Syst., Vol.J89-D, No.8, pp. 1840-1847, 2006 (in Japanese).
[13] Y. Shibata and M. Hashimoto, "An Extended Method of the Parametric Eigenspace Method by Automatic Background Elimination," Proc. Korea-Japan Joint Workshop on Frontiers of Computer Vision, pp. 246-249, 2013.
[14] H. Yonezawa, H. Koichi et al., "Long-term operational experience with a robot cell production system controlled by low carbon-footprint Senju (thousand-handed) Kannon Model robots and an approach to improving operating efficiency," Proc. of Automation Science and Engineering, pp. 291-298, 2011.
[15] H. Do, T. Choi et al., "Automation of cell production system for cellular phones using dual-arm robots," J. of Advanced Manufacturing Technology, Vol.83, No.5, pp. 1349-1360, 2016.
[16] F. Tombari, S. Salti, and L. D. Stefano, "Unique Signatures of Histograms for Local Surface Description," European Conf. on Computer Vision, pp. 356-369, 2010.
[17] B. Drost, M. Ulrich, N. Navab, and S. Ilic, "Model Globally, Match Locally: Efficient and Robust 3D Object Recognition," IEEE Computer Vision and Pattern Recognition, pp. 998-1005, 2010.
[18] C. Choi, Y. Taguchi, O. Tuzel, M. Liu, and S. Ramalingam, "Voting-Based Pose Estimation for Robotic Assembly Using a 3D Sensor," IEEE Int. Conf. on Robotics and Automation, pp. 1724-1731, 2012.
[19] S. Akizuki and M. Hashimoto, "High-speed and Reliable Object Recognition Using Distinctive 3-D Vector-Pairs in a Range Image," Int. Symposium on Optomechatronic Technologies (ISOT), pp. 1-6, 2012.
[20] A. Mian, M. Bennamoun, and R. Owens, "On the Repeatability and Quality of Keypoints for Local Feature-based 3D Object Retrieval from Cluttered Scenes," Int. J. of Computer Vision, Vol.89, Issue 2-3, pp. 348-361, 2010.
[21] Y. Guo, F. Sohel, M. Bennamoun, M. Lu, and J. Wan, "Rotational Projection Statistics for 3D Local Surface Description and Object Recognition," Int. J. of Computer Vision, Vol.105, Issue 1, pp. 63-86, 2013.
[22] S. Takei, S. Akizuki, and M. Hashimoto, "SHORT: A Fast 3D Feature Description based on Estimating Occupancy in Spherical Shell Regions," Int. Conf. on Image and Vision Computing New Zealand (IVCNZ), 2015.
[23] S. Akizuki and M. Hashimoto, "DPN-LRF: A Local Reference Frame for Robustly Handling Density Differences and Partial Occlusions," Int. Symposium on Visual Computing (ISVC), LNCS 9474, Part I, pp. 878-887, 2015.
[24] H. Kayaba, S. Kaneko, H. Takauji, M. Toda, K. Kuno, and H. Suganuma, "Robust Matching of Dot Cloud Data Based on Model Shape Evaluation Oriented to 3D Defect Recognition," IEICE Trans. D, Vol.J95-D, No.1, pp. 97-110, 2012.
[25] S. Kaneko, T. Kondo, and A. Miyamoto, "Robust matching of 3D contours using iterative closest point algorithm improved by M-estimation," Pattern Recognition, Vol.36, pp. 2041-2047, 2003.
[26] Y. Domae, H. Okuda, Y. Kitaaki, Y. Kimura, H. Takauji, K. Sumi, and S. Kaneko, "3-D Sensing for Flexible Linear Object Alignment in Robot Cell Production System," J. of Robotics and Mechatronics, Vol.22, No.1, pp. 100-111, 2010.
[27] Y. Domae, S. Kawato et al., "Self-calibration of Hand-eye Coordinate Systems by Five Observations of an Uncalibrated Mark," IEEJ Trans. on Electronics, Information and Systems, Vol.132, No.6, pp. 968-974, 2011 (in Japanese).
[28] R. Haraguchi, Y. Domae et al., "Development of Production Robot System that can Assemble Products with Cable and Connector," J. of Robotics and Mechatronics, Vol.23, No.6, pp. 939-950, 2011.
[29] A. Noda, Y. Domae et al., "Bin-picking System for General Objects," J. of the Robotics Society of Japan, Vol.33, No.5, pp. 387-394, 2015 (in Japanese).
[30] Y. Domae, H. Okuda et al., "Fast graspability evaluation on single depth maps for bin picking with general grippers," Proc. of ICRA, pp. 1997-2004, 2014.
[31] N. Correll, K. E. Bekris et al., "Lessons from the Amazon Picking Challenge," arXiv:1601.05484, 2016.
[32] R. Jonschkowski, C. Eppner et al., "Probabilistic Multi-Class Segmentation for the Amazon Picking Challenge," http://dx.doi.org/10.14279/depositonce-5051, 2016.
[33] I. Lenz, H. Lee et al., "Deep Learning for Detecting Robotic Grasps," Proc. of ICRA, pp. 1957-1964, 2016.
[34] L. Pinto and A. Gupta, "Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours," arXiv:1509.06825, 2015.
[35] S. Levine, P. Pastor et al., "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection," arXiv:1603.02199, 2016.
[36] C. Finn, X. Y. Tan et al., "Deep Spatial Autoencoders for Visuomotor Learning," arXiv:1509.06113, 2016.
[37] T. Komuro, Y. Senjo, K. Sogen, S. Kagami, and M. Ishikawa, "Real-Time Shape Recognition Using a Pixel-Parallel Processor," J. of Robotics and Mechatronics, Vol.17, No.4, pp. 410-419, 2005.
[38] T. Senoo, Y. Yamakawa, Y. Watanabe, H. Oku, and M. Ishikawa, "High-Speed Vision and its Application Systems," J. of Robotics and Mechatronics, Vol.26, No.3, pp. 287-301, 2014.

Name:
Manabu Hashimoto
Affiliation:
School of Engineering, Chukyo University
Address:
101-2 Yagoto-Honmachi, Showa-ku, Nagoya, Aichi 466-8666, Japan
Brief Biographical History:
1985 Graduated from Osaka University
1987 Graduated from Graduate School, Osaka University
1987- Joined Mitsubishi Electric Corporation
2008- Joined Chukyo University
Main Works:
• S. Akizuki, M. Iizuka, and M. Hashimoto, "'Affordance'-focused Features for Generic Object Recognition," ECCV 2nd Int. Workshop on Recovering 6D Object Pose, 2016.
• S. Akizuki and M. Hashimoto, "Stable Position and Pose Estimation of Industrial Parts using Evaluation of Observability of 3D Vector Pairs," J. of Robotics and Mechatronics, 2015.
Membership in Academic Societies:
• The Institute of Electrical and Electronics Engineers (IEEE)
• The Robotics Society of Japan (RSJ)
Name:
Yukiyasu Domae
Affiliation:
Principal Researcher, Advanced Technology
R&D Center, Mitsubishi Electric Corporation
Address:
8-1-1 Tsukaguchi-Honmachi, Amagasaki, Hyogo 661-8661, Japan
Brief Biographical History:
2008- Joined Mitsubishi Electric Corp.
2012 Received Ph.D. in Information Science from Hokkaido University
2015- Joined National Institute of Advanced Industrial Science and Technology (AIST)
Main Works:
• “Fast Graspability Evaluation on Single Depth Maps for Bin Picking
with General Grippers,” Proc. of ICRA, pp. 1997-2004, 2014.
• “Development of Production Robot System that can Assemble Products
with Cable and Connector,” J. of Robotics and Mechatronics (JRM),
Vol.23, No.6, pp. 939-950, 2011.
Membership in Academic Societies:
• The Robotics Society of Japan (RSJ)
• The Japan Society for Precision Engineering (JSPE)
• The Institute of Electronics, Information and Communication Engineers
(IEICE)
• Information Processing Society of Japan (IPSJ)
• The Institute of Electrical and Electronics Engineers (IEEE)
Name:
Shun’ichi Kaneko
Affiliation:
Hokkaido University, Graduate School of Information Science and Technology
Address:
Kita-14, Nishi-9, Kita-ku, Sapporo 060-0814, Japan
Brief Biographical History:
1991-1996 Associate Professor, Tokyo University of Agriculture and
Technology
1996-2004 Full Professor, Hokkaido University
Main Works:
• “Robust image registration by increment sign correlation,” Pattern
Recognition, Pergamon Press, 2002.
Membership in Academic Societies:
• The Japan Society for Precision Engineering (JSPE)