Machine Vision Application Chapter
Moises Rivas-Lopez
Universidad Autónoma de Baja California, Mexico
Oleg Sergiyenko
Universidad Autónoma de Baja California, Mexico
Wendy Flores-Fuentes
Universidad Autónoma de Baja California, Mexico
Copyright © 2019 by IGI Global. All rights reserved. No part of this publication may be reproduced, stored or distributed in
any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.
Product or company names used in this set are for identification purposes only. Inclusion of the names of the products or
companies does not indicate a claim of ownership by IGI Global of the trademark or registered trademark.
Library of Congress Cataloging-in-Publication Data
Names: Rivas-Lopez, Moises, 1960- editor. | Sergiyenko, Oleg, 1969- editor. |
Flores-Fuentes, Wendy, 1978- editor. | Rodriguez-Quinonez, Julio C.,
1985- editor.
Title: Optoelectronics in machine vision-based theories and applications /
Moises Rivas-Lopez, Oleg Sergiyenko, Wendy Flores-Fuentes, and Julio Cesar
Rodriguez-Quinonez, editors.
Description: Hershey, PA : Engineering Science Reference (an imprint of IGI
Global), [2019] | Includes bibliographical references and index.
Identifiers: LCCN 2017055227| ISBN 9781522557517 (hardcover) | ISBN
9781522557524 (ebook)
Subjects: LCSH: Computer vision--Industrial applications. | Optoelectronic
devices. | Industrial electronics.
Classification: LCC TA1634 .O687 2019 | DDC 621.39/93--dc23 LC record available at https://lccn.loc.gov/2017055227
This book is published in the IGI Global book series Advances in Computational Intelligence and Robotics (ACIR) (ISSN:
2327-0411; eISSN: 2327-042X)
All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the
authors, but not necessarily of the publisher.
Chapter 8
Machine Vision Application
on Science and Industry:
Machine Vision Trends
Bassem S. M. Zohdy
Institute of Statistical Studies and Research, Egypt
Mahmood A. Mahmood
Institute of Statistical Studies and Research, Egypt
Hesham A. Hefny
Institute of Statistical Studies and Research, Egypt
ABSTRACT
Machine vision opens great opportunities for domains such as manufacturing, agriculture, aquaculture, and medical research, as well as for research studies and applications aimed at a better understanding of processes and operations. Scientists' efforts have been directed towards a deep understanding of particular material systems, particular classes of fruits, the diagnosis of patients through the classification and analysis of medical images, and the real-time detection and inspection of malfunctioning pieces or processes; various domains have advanced through the use of machine vision techniques and methods.
INTRODUCTION
Machine vision opens great opportunities for domains such as manufacturing, agriculture, aquaculture, and medical research, enabling studies and applications that build a better understanding of their processes and operations. Scientists' efforts have been, and still are, directed towards understanding materials and systems, classifying specific classes of fruits, diagnosing patients through the classification and analysis of medical images, and detecting and inspecting malfunctioning pieces
DOI: 10.4018/978-1-5225-5751-7.ch008
Copyright © 2019, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
or processes in real time; various domains have advanced through the use of machine vision techniques and methods. Researchers in materials science have achieved important results in understanding and analyzing microstructural images, with the aim of producing a common method for extracting meaningful features from micrographs. The main aim of image analysis is to extract the meaningful data contained in images; it comprises many processes, including image registration and image fusion. Image registration takes two images: the first, called the reference image, and the second, called the sensed image, which is aligned to the reference image. The output of this process feeds another process called image fusion, which combines the data contained in two images into a single image that is more informative than either input. These two processes, image registration and image fusion, are helpful in many application domains, such as industry, medicine, aquaculture, and agriculture, and in other areas that use machine vision technologies and applications (DeCost, 2015; El-Gamal, 2016; Saberioon, 2016; Benalia, 2016).
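The registration and fusion pipeline just described can be sketched in a few lines of Python. This is a minimal illustration, not any cited author's method: registration is reduced to a brute-force search for the best integer translation, and fusion to a pixel-wise average, on invented toy images.

```python
import numpy as np

def register_shift(reference, sensed, max_shift=5):
    """Registration reduced to its simplest form: brute-force search for the
    integer (dy, dx) shift that best aligns the sensed image to the reference
    image (minimum sum of squared differences)."""
    best, best_err = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(sensed, dy, axis=0), dx, axis=1)
            err = float(np.sum((reference - shifted) ** 2))
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def fuse(reference, aligned):
    """Fusion reduced to a pixel-wise average of the two aligned images."""
    return (reference + aligned) / 2.0

# Toy data: the "sensed" image is the reference translated by (2, 3).
ref = np.zeros((32, 32)); ref[10:15, 10:15] = 1.0
sensed = np.roll(np.roll(ref, 2, axis=0), 3, axis=1)
dy, dx = register_shift(ref, sensed)
aligned = np.roll(np.roll(sensed, dy, axis=0), dx, axis=1)
fused = fuse(ref, aligned)
```

Real registration methods estimate rotation, scale, and deformation as well; the point here is only the reference/sensed/fused data flow.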
Industrial Studies
Studies reveal many types of images and video streams that can be used by machine vision applications and techniques. In Aldrich (2010), the main concern of the mineral processing industry is to extract features from froth images; these features can be obtained by extracting red, green, and blue (RGB) color features to better characterize the minerals captured in the images.
In Liu (2017) a research study proposes that the Internet of Things (IoT) makes possible an IoT-based intelligent system for mechanical product assembly (IIASMP). The proposed system uses a mechanical product assembly process to evaluate the characteristics of IoT-based manufacturing systems. Among the characteristics of the proposed system are self-regulation and self-organization, meaning that the system can monitor itself by collecting information about its status and analyzing these data in order to enhance or react to the current status. Computer vision components are embedded in the material handling robot, smart controller, and actuator, together with training information consisting of an image database, image processing algorithms, and encoding rules; the system recognizes material types by capturing images of materials and then processing them. The framework proposed by this study consists of six layers. The first is the sensing assembly layer, which employs resource identification technology along with multiple-source sensor data acquisition to obtain the state of the assembly resources; feedback then formulates instructions to adjust the assembly process. The second is the net layer, which can convert protocols and store, route, and transfer the sensing data. The third layer is where the sensor data are fused: the sensing data map the performance of the assembly resources, transforming the data into information so that it can be extracted and combined. The fourth layer is where decisions and applications are made: the information extracted from the previous layer can be audited and analyzed to support intelligent management control. The fifth layer is the system service part, providing resource configuration and data security, and the sixth and last layer is the interface layer, which enables methods of exchanging data.
In DeCost (2015), previously established computer vision methods are used to define quantitative microstructure descriptors and to identify classes of microstructures from a diverse collection recorded in databases. The results are computed in real time, capturing the most important characteristic details of microstructural images without fine-tuning by human experts.
The authors of Di Leo (2016) proposed a vision system for the online quality control of industrial manufacturing. Against reference values, the proposed measurement system detects defects in electromechanical parts, using two cameras to produce images for quality monitoring. The images from each camera follow two different procedures: the first, top-image processing, measures length against design specifications and adopts NI Vision by National Instruments; the second captures images of parts that cannot be seen from the top angle, using a backlight illuminator to create a binary image with a 75% threshold.
A comparison study in Rashidi (2016) assesses various machine learning techniques for detecting three categories of construction materials: concrete, red bricks, and Oriented Strand Boards (OSBs). The study compares Radial Basis Function (RBF) networks, the Multilayer Perceptron (MLP), and the Support Vector Machine (SVM); according to the results, SVM performed better than the other two techniques, accurately detecting the surface textures of all three kinds of materials.
Lapray (2016) proposes HDR-ARtiSt (High Dynamic Range Adaptive Real-time Smart camera), a complete FPGA-based smart camera architecture that obtains a real-time, high-dynamic-range video stream from multiple acquisitions. The study produces uncompressed black-and-white 1,280×1,024-pixel HDR live video at 60 fps, displaying it on an LCD monitor with the help of an embedded DVI controller.
A study by Gade (2014) reviews thermal cameras and their applications. Thermal cameras are classified as passive sensors: they detect the infrared radiation emitted by any body with a temperature above absolute zero. The thermal camera was developed for military purposes, as a surveillance and night-vision tool; nowadays its price has decreased, opening up a broader field of applications. These cameras overcome the illumination problems of grayscale and RGB images. The images produced by thermal cameras are represented as grayscale images with a depth of 8 to 16 bits per pixel; images can be compressed with standard JPEG, and video with H.264 or MPEG.
Agriculture Studies
A review by Di Leo (2016) of computer vision research in agriculture asserts that using computer vision techniques to evaluate the quality of fruits and vegetables reduces the time spent on human inspection, as these techniques provide powerful tools for the external quality assessment and evaluation of fruits and vegetables. A review paper by Tscharke (2016) investigates the application of machine vision systems to recognize and monitor the behavior of animals in a quantitative manner.
In Di Leo (2016) the author reviews the three main types of machine vision systems used in the automated inspection of fruit and vegetable quality. Traditional computer vision systems use RGB color cameras that simulate human eyesight by capturing images through three filters centered on red, green, and blue. Hyperspectral machine vision systems, unlike traditional systems, combine spectroscopic and imaging techniques in a single system to obtain a set of monochromatic images. Multispectral machine vision systems differ from the traditional and hyperspectral systems in the number of monochromatic images in the spectral domain; the main advantage of these systems is that the wavelengths of the monochromatic images captured can be chosen freely by using narrowband filters.
A review (Saberioon, 2016) describes and assesses the most recent technologies of various optical sensors and their suitability for fish farming management, including the measurement and prediction of fish product quality. The major areas of optical sensor application in aquaculture are discussed: pre-harvesting, cultivation, and post-harvesting. Machine Vision Systems (MVSs) and optical sensors are excellent options for real-world applications, building on digital camera development and the increasing speed of computer-based processing.
A survey by Cubero (2016) presents recent research using color and non-standard computer vision systems for the automated inspection of citrus. The existing technologies for acquiring fruit images, and their use in the non-destructive inspection of the internal and external characteristics of these fruits, are explained. Machine vision has proved to be a very useful and practical tool for automation during fruit inspection; indoor and outdoor inspections have shown excellent performance and accuracy.
In Benalia (2016) the authors proposed a system for the real-time, in-line sorting of dried figs through color assessment measured by a Chroma Meter CR-400, a handheld, portable instrument designed to evaluate the color of objects, particularly those with smooth surfaces or minimal color variation, using the CIE illuminant D65 and the 10° standard observer. D65 is a commonly used standard illuminant defined by the International Commission on Illumination (CIE); it belongs to the D series of illuminants, which aim to portray standard open-air illumination conditions in different parts of the world.
A Canon EOS 550D digital camera was used for fig image acquisition, capturing images of 2592×1728 pixels at a resolution of 0.06 mm/pixel. Eight fluorescent tubes (BIOLUX 18 W/965, 6500 K, OSRAM, Germany) provided the lighting, placed on the four sides of a square inspection chamber in a 0°/45° configuration.
The results from the Chroma Meter and the image analysis made possible a complete classification between high-quality and deteriorated figs through the evaluation of their color attributes.
Beltrán Ortega (2016) presents a study evaluating the quality of virgin olive oil during its in-line manufacturing process and storage, listing the sensing technologies used in the olive oil industry for tasks that allow effective process supervision and control of virgin olive oil production. Electronic noses show good classification abilities, discriminating between olive oils in terms of quality and origin and even quantifying different olive pastes; valuable information is extracted online at different stages. Polyphenols are measured using electronic tongues, which can mimic the human sense of taste. Dielectric spectroscopy can also mimic the human sense of taste and is mainly used to detect the water content of olive oil.
Medical Studies
The research by Norouzi (2014) classifies the segmentation methods used in medical imaging into four categories: region-based methods, classifier methods, clustering methods, and hybrid methods. Region-based methods are considered the basis of most image segmentation methods; the most popular are thresholding and region growing. The second category, classification, uses training data as sample images to extract features for the later detection and classification of medical images; algorithms used in this category include k-nearest neighbors and maximum likelihood. The third category, clustering, resembles classification but does not use training data, instead using statistical techniques to extract features from the data; it employs k-means, fuzzy C-means, and expectation-maximization algorithms. The last category, hybrid methods, combines boundary and regional information in order to segment medical images.
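The clustering category can be sketched with a plain k-means over pixel intensities. This is a minimal Python illustration on an invented toy image, not a method from any cited study; real medical segmentation would cluster richer features than raw intensity.

```python
import numpy as np

def kmeans_segment(image, k=2, iters=20):
    """Segment an image by clustering pixel intensities with plain k-means.
    No training data is needed, matching the clustering category above."""
    pixels = image.reshape(-1, 1).astype(float)
    # Deterministic initialization: centers spread over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k).reshape(k, 1)
    labels = np.zeros(len(pixels), dtype=int)
    for _ in range(iters):
        dists = np.abs(pixels - centers.T)   # distance of each pixel to each center
        labels = dists.argmin(axis=1)
        for j in range(k):                   # move each center to its cluster mean
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean()
    return labels.reshape(image.shape)

# Toy "scan": a bright 4x4 region on a dark background.
img = np.zeros((8, 8)); img[2:6, 2:6] = 200.0
seg = kmeans_segment(img, k=2)
```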
As Gomes (2011) describes, computer-integrated surgery employs two-dimensional (2D) or three-dimensional (3D) medical imaging for preoperative planning, in order to collect information about the patient. These images are combined with models of human anatomy to produce a computer model that is employed in surgical planning, using intraoperative sensing to register patient data.
Robotic surgical systems rely on 3D imaging, as the vision system uses a high-resolution 3D endoscope and image processing equipment. The Praxiteles robotic system, developed by Praxim Medivision, presented a bone-mounted guide-positioning device to align a cutting guide in image-free total knee arthroplasty (TKA), used by the surgeon to perform the planar cuts manually with the guide.
In Havaei (2016) the author proposed a deep neural network method (DNN) to segment brain tumor
images using Magnetic resonance (MR) images. The study proposes a convolutional neural network
(CNN) adapting Deep Neural Network (DNN). As the results revealed that the proposed architecture is
30 times faster than currently published studies. The study uses Pylearn 2 library that is a deep learn-
ing open source library, the library supports GPUs usage that increases the speed of deep learning
algorithms, as convolutional neural network (CNN) learns the data features, the proposed method uses
minimal pre-processing, the preprocessing follows 3 steps, first removing 1% high and low intensities,
then applying N4ITK bias correction, then subtracting the channel’s mean and dividing by the channel’s
standard deviation in order to normalize the data, connected components simple method is employed
for post-processing.
The study (Escalera, 2016) proposed a technique for detecting multiple sclerosis in the brain using the stationary wavelet transform (SWT). The study also uses three machine learning classifiers, namely decision trees, k-nearest neighbors, and support vector machines. The technique uses a two-level stationary wavelet entropy (SWE): first, the SWT is applied to a given image; second, the randomness of each sub-band is measured by the Shannon entropy A which, in its standard form, is
A = - Σ_i P(X_i) log2 P(X_i)
where A represents the entropy, X_i represents the ith element of a given sub-band, and P(X_i) its probability. Note that for an m-level decomposition there are in total (3m + 1) sub-bands, and thus a vector of (3m + 1) elements is formed.
The methodology then applied the three classifiers mentioned above, and the results showed that k-nearest neighbors outperformed the other algorithms.
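The entropy step can be sketched as follows. This is a minimal Python illustration, assuming histogram-estimated probabilities (the cited study may estimate P differently), with invented toy sub-bands rather than real SWT coefficients.

```python
import numpy as np

def shannon_entropy(subband, bins=16):
    """A = -sum_i P(x_i) log2 P(x_i), with the probabilities P estimated
    from a histogram of the sub-band coefficients."""
    hist, _ = np.histogram(subband, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]                       # by convention, 0 * log2(0) = 0
    return float(-np.sum(p * np.log2(p)))

def swe_feature_vector(subbands, bins=16):
    """One entropy per sub-band: for an m-level decomposition this yields
    the (3m + 1)-element feature vector described above."""
    return [shannon_entropy(s, bins) for s in subbands]

# A constant sub-band carries no information; a uniform spread is maximally random.
flat = np.zeros(64)
spread = np.arange(64, dtype=float)
```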
An experimental study in Tajbakhsh (2016) considered four medical imaging applications involving classification, detection, and segmentation across different imaging modalities; the experiments showed that pre-trained CNNs with fine-tuning outperformed CNNs trained from scratch.
Ophthalmic imaging provides a way to diagnose and objectively assess the progression of a number of pathologies, including neovascular age-related macular degeneration and diabetic retinopathy. In De Fauw (2017) a study relies on digital fundus photographs and digital Optical Coherence Tomography (OCT) images, using DeepMind technology (Beattie, 2016).
Feature extraction in the mineral processing industries, as reported in the review by Aldrich (2010), falls into three categories. First are physical features, used to detect bubble size and shape through edge detection, which detects the gradients of pixel intensities between bubbles in froth images; the valley edge detection and valley edge tracing methods are used to segment froth images. In valley edge detection, the concern is to detect the valley edges between bubbles: images are filtered to remove noise, and image pixels are then assessed to promote possible edge candidates. The next step is a cleanup based on valley edge tracing, to ensure the removal of gaps between valley edges.
Another image segmentation technique used in the mineral processing industry employs a two-stage procedure: identifying the local minima in pixel intensities, then calculating the bubble diameter in order to thin the borders; Aldrich (2010) reports that this approach is more accurate for larger bubble sizes. Another physical feature extraction algorithm in the mineral processing industries is the watershed algorithm, a formation approach that relies on simulating water rising from a set of markers; it identifies the regional minima and maxima by locating trends in pixel intensities along different scan lines. Another approach is froth color, which extracts the color features of the loaded minerals: red, green, and blue (RGB), as well as hue, saturation, and intensity or value (HSI or HSV).
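Global color features of the kind described can be sketched as follows. This is a minimal Python illustration on an invented uniform patch, not the froth-analysis pipeline of any cited study; only the mean RGB and the HSV of that mean color are computed.

```python
import colorsys
import numpy as np

def froth_color_features(rgb_image):
    """Mean R, G, B over the image (channels in [0, 1]), plus the H, S, V
    of that mean color -- simple global color features."""
    means = rgb_image.reshape(-1, 3).mean(axis=0)   # mean R, G, B
    h, s, v = colorsys.rgb_to_hsv(*means)           # stdlib RGB -> HSV
    return {"mean_rgb": tuple(means), "hsv": (h, s, v)}

# Toy froth image: a uniform mid-gray patch (R = G = B = 0.5).
img = np.full((4, 4, 3), 0.5)
feats = froth_color_features(img)
```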
Other approaches used in the industrial domain are statistical approaches for extracting features from images, such as fast Fourier transforms (FFT), wavelet transforms, fractal descriptors, co-occurrence matrices and their variants, texture spectrum analysis, and latent-variable methods, which include principal component analysis, Hebbian learning, multilayer perceptrons, and cellular neural networks.
It is worth mentioning that other methods used specifically in the mineral processing industries are dynamic features, which use descriptors designed specifically to capture the behavior of the froth. These include mobility techniques, which refer to the speed and direction of movement and comprise bubble tracking, block matching, cluster matching, and pixel tracing. Besides the mobility techniques, there are stability techniques, which reveal the appearance and disappearance of bubbles in the froth.
Lapray (2016) used a multiple exposure control (MEC) algorithm for low- and high-exposure images, close to Kang's exposure control and Alston's multiple-exposure systems; in the proposed study, the exposure system alternates between two values that are continuously adapted to reflect changes in the scene.
Authors in Beltrán Ortega (2016) reviewed four types of medical image segmentation, using region-based, classification, clustering, and hybrid algorithms. Region-based methods are categorized into two algorithms. The first is thresholding: the image is formed with different gray levels, and each pixel is labeled
g(x, y) = foreground if f(x, y) ≥ T; background if f(x, y) < T    (2)
where f(x, y) is the pixel intensity at position (x, y) and T is the threshold value. An inappropriate threshold value leads to poor segmentation results.
For an image that does not have a constant background and shows diversity across the object, the local thresholding method is used: the image is divided into sub-images, and a threshold value is subsequently calculated for each part. The image is divided by vertical and horizontal lines so that each part includes a region of both the background and the object; the thresholding results for the parts are then merged, and finally an interpolation is made to obtain appropriate results.
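Both variants can be sketched in Python. This is a minimal illustration of equation (2) and of block-wise local thresholding on an invented image; for simplicity the local variant uses each block's mean as its threshold and skips the final interpolation step.

```python
import numpy as np

def global_threshold(image, T):
    """Equation (2): foreground (1) where f(x, y) >= T, else background (0)."""
    return (image >= T).astype(np.uint8)

def local_threshold(image, blocks=2):
    """Split the image into blocks x blocks sub-images, threshold each with
    its own mean intensity, then merge the results (no interpolation here)."""
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for bi in range(blocks):
        for bj in range(blocks):
            ys = slice(bi * h // blocks, (bi + 1) * h // blocks)
            xs = slice(bj * w // blocks, (bj + 1) * w // blocks)
            sub = image[ys, xs]
            out[ys, xs] = (sub >= sub.mean()).astype(np.uint8)
    return out

# Bright object on a dark background.
img = np.zeros((8, 8)); img[2:6, 2:6] = 255.0
mask = global_threshold(img, T=128)
```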
In the discipline of machine learning, the SVM is classified as a supervised learning model with associated learning algorithms capable of recognizing patterns in the analyzed data. The basic task of an SVM is to predict, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. An SVM model is a representation of the samples as points in feature space, mapped so that the samples of the separate categories are divided by a hyperplane that is as far as possible from the marginal samples of each category. Figure 1 shows the distribution of samples for an example problem in a 2D space, together with the resulting hyperplane (a line in 2D space) and the margins found by the SVM. SVMs can efficiently perform non-linear classification using different kernel functions, which map inputs into higher-dimensional feature spaces; by following this process the problem is reformulated as a linear one, so the ordinary SVM can perform linear classification in the new feature space (Gutschoven, 2000).
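The hyperplane idea can be illustrated with a deliberately minimal linear SVM, trained here by stochastic sub-gradient descent on the regularized hinge loss. The data, learning rate, and regularization constant are invented for the example; this is a sketch of the principle, not the implementation used in any of the cited studies.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Minimal linear SVM: stochastic sub-gradient descent on the
    regularized hinge loss.  Labels y must be -1 or +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1.0:        # sample inside the margin
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                              # only the regularizer acts
                w -= lr * lam * w
    return w, b

def predict(w, b, X):
    """Sign of the decision function: which side of the hyperplane."""
    return np.where(X @ w + b >= 0.0, 1, -1)

# Two linearly separable clusters in 2D, as in the Figure 1 illustration.
X = np.array([[1.0, 1.0], [1.5, 1.2], [1.2, 0.8],
              [4.0, 4.0], [4.5, 4.2], [3.8, 4.4]])
y = np.array([-1, -1, -1, 1, 1, 1])
w, b = train_linear_svm(X, y)
```

The non-linear case replaces the dot product with a kernel function, as the text describes.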
In the study Di Leo (2016) the authors review the various computer vision systems used in the agriculture research domain; the systems can be categorized into three main types: traditional, hyperspectral, and multispectral computer vision systems.
An SVM was trained to classify microstructures into one of seven groups with greater than 80% accuracy over 5-fold cross-validation (DeCost, 2015); the SVM can classify microstructures into groups automatically and with high accuracy, even using relatively small training sets. In addition, the feature histogram can provide the basis for a visual search engine that finds the best matches for a query image in a database of microstructures. This automatic and objective computer vision system offers a new approach to archiving, analyzing, and utilizing microstructural data.
The methodology proposed by Tseng (2015) for e-quality control using support vector machines starts with embedding different test samples, followed by a remote inspection process and data acquisition to prepare the training data; the model is then selected, and a sensitivity analysis is conducted with different C values, comparing results across different kernels to identify the optimal values. In training SVMs, the study selects a kernel and sets a value for the margin parameter C. To develop the optimal classifier, the proposed methodology needs to determine the optimal kernel parameter and the optimal value of C. A k-fold (k = 10) cross-validation approach is adopted for estimating the training parameter; the minimum value of C considered was 0.1 and the maximum 500, and the value of C with the highest training accuracy was identified as the optimal value. The performance of the classifiers is evaluated using different kernel functions, in terms of testing accuracy, training accuracy, number of support vectors, and validation accuracy. Four kernel functions are considered: (1) the linear kernel, (2) the polynomial kernel, (3) the Radial Basis Function (RBF) kernel, and (4) the sigmoid kernel; polynomial and RBF kernels are by far the most commonly used in the research world.
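The model-selection loop described (a C value swept from 0.1 to 500, the four kernels, k-fold cross-validation with k = 10) can be sketched with scikit-learn's grid search. The synthetic data stand in for the study's inspection samples, and a coarse grid keeps the example fast; none of the specific values below come from Tseng (2015).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Stand-in data for the inspection samples (the real study uses features
# acquired through remote inspection).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# C swept from 0.1 to 500, with the four kernels named in the text.
param_grid = {
    "C": [0.1, 1, 10, 100, 500],
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
}
search = GridSearchCV(SVC(), param_grid, cv=10)   # k-fold with k = 10
search.fit(X, y)
best_C = search.best_params_["C"]
best_kernel = search.best_params_["kernel"]
```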
In Tscharke (2016) a machine vision system is used in a behavior recognition process to determine the welfare of livestock. This process has four main parts, depicted in Figure 2: first is initialization, which leads to a calibration process requiring data represented as model constraints, software variables, and hardware camera variables; the second element is tracking, through segmentation that separates objects from backgrounds; this leads to the third element, pose estimation through a predictive model; and finally the results, or recognition.
Another machine learning technique is the MLP (Jazebi, 2013), derived from artificial neural networks to solve more sophisticated nonlinear problems. Each MLP comprises a number of basic neurons organized in three layers: the input layer, the hidden layer, and the output layer, as depicted in Figure 1. An MLP is mainly a feedforward network that uses the error back-propagation concept for training.
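The three-layer feedforward pass can be sketched as follows. The weights below are hand-picked (not trained) so that the network computes XOR, a classic nonlinear problem that a single-layer network cannot solve; back-propagation would normally learn such weights from the output error.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a three-layer MLP: input -> hidden -> output,
    with sigmoid activations in both layers."""
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    hidden = sigmoid(W1 @ x + b1)        # hidden-layer activations
    return sigmoid(W2 @ hidden + b2)     # output-layer activation

# Hypothetical weights under which the network computes XOR.
W1 = np.array([[20.0, 20.0], [-20.0, -20.0]])
b1 = np.array([-10.0, 30.0])
W2 = np.array([[20.0, 20.0]])
b2 = np.array([-30.0])
out = mlp_forward(np.array([1.0, 0.0]), W1, b1, W2, b2)
```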
The RBF network (Man, 2013) is a neural network with one hidden layer, which may use several forms of radial function as the activation function. The most common is the Gaussian function, defined by
f_j(x) = exp( - ||x - μ_j||^2 / (2 σ_j^2) )    (3)
where σ_j is the radius parameter, μ_j is the vector determining the center of basis function f_j, and x is the d-dimensional input vector. In an RBF network, as in Figure 1, a neuron of the hidden layer is activated whenever the input vector is close enough to its center vector μ_j. There are several techniques and heuristics for optimizing the basis function parameters and determining the number of hidden neurons needed to achieve optimal classification. The second layer of the RBF network, the output layer, comprises one neuron per class; each output is a linear function of the outputs of the hidden-layer neurons and is equivalent to an OR operator. The final classification is given by the output neuron with the greatest output. An RBF neural network with one output neuron is depicted in Figure 1.
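Equation (3) and the winner-take-all output layer can be sketched as follows; the centers, radii, and identity output weights below are invented for the example, whereas a real RBF network would learn them.

```python
import numpy as np

def rbf_activations(x, centers, sigmas):
    """Equation (3): Gaussian activation of each hidden neuron,
    f_j(x) = exp(-||x - mu_j||^2 / (2 sigma_j^2))."""
    d2 = np.sum((centers - x) ** 2, axis=1)   # squared distance to each center
    return np.exp(-d2 / (2.0 * sigmas ** 2))

def rbf_classify(x, centers, sigmas, W):
    """Output layer: one linear combination per class; the predicted class
    is the output neuron with the greatest output."""
    return int(np.argmax(W @ rbf_activations(x, centers, sigmas)))

# Two hidden neurons, one centered on each class, with identity output weights.
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
sigmas = np.array([1.0, 1.0])
W = np.eye(2)
label = rbf_classify(np.array([4.8, 5.1]), centers, sigmas, W)
```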
The authors in Havaei (2016) applied deep neural networks; the CNN in particular has recently featured in many studies in the medical domain. A CNN consists of a succession of layers that perform operations on the input data. Convolutional layers (symbol C_k^s) convolve the input images with a predefined number k of kernels of a certain size s, and are usually followed by activation units that rescale the results of the convolution in a nonlinear manner. Pooling layers (symbol P_size^stride) reduce the dimensionality of the responses produced by the convolutional layers through downsampling, using strategies such as average pooling or max pooling. Finally, fully connected layers (F_#neurons) extract compact, high-level features from the data. The kernels of the convolutional layers, as well as the weights of the neural connections of the fully connected layers, are optimized during training through back-propagation. The user specifies the network architecture by defining the number of layers, their kinds, and the types of activation units; other relevant parameters are the number and size of the kernels employed during convolution, the number of neurons in the fully connected part, and the downsampling ratio applied by the pooling layers.
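The three building blocks above, convolution, nonlinear activation, and pooling, can be sketched in plain Python on a toy image. This is a single-kernel, single-channel illustration of the layer mechanics, not a trainable CNN, and the kernel values are invented for the example.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D correlation (no padding), as in a convolutional layer C_k^s
    with a single kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Max pooling with stride equal to the window size (a P_size^stride layer)."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

relu = lambda x: np.maximum(x, 0)    # a typical nonlinear activation unit

img = np.arange(36, dtype=float).reshape(6, 6)      # toy gradient image
kernel = np.array([[-1.0, 1.0], [-1.0, 1.0]])       # horizontal-gradient kernel
features = max_pool(relu(conv2d(img, kernel)))      # conv -> activation -> pool
```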
Machine vision techniques employ technologies for robots such as three-dimensional perception, considered one of the technologies that help a robot recognize its surroundings and navigate in order to accomplish tasks in a fully understood environment. Controlling and operating robots requires the ability to visualize the surroundings in a human way; for that reason, the robot needs to acquire 3D information.
The mathematical models in Salvi (2002) give the corresponding point in an image for a given point in the field of view by determining a set of parameters that describes the mapping between 2D image coordinates and 3D points in the world coordinate system; this procedure is called camera calibration. The pinhole camera model presented in Faugeras (1993) is used to model the projection of world coordinates into images.
As depicted in Figure 4, the image of a 3D point P is generated by the optical ray passing through the optical center and intersecting the image plane, producing p' in the image plane, located at a distance equal to the focal length f behind the optical center (Usamentiaga, 2014). To describe the projection of 3D points onto the 2D image plane mathematically, the world coordinate system W is first transformed into the camera coordinate system C. This transformation is done using equation (4): the camera coordinates of a point, P_c = (x_c, y_c, z_c)^T, are calculated from the world coordinates P_w = (x_w, y_w, z_w)^T using the rigid transformation H_{w→c}, which comprises three translations (T_x, T_y, T_z) and three rotations (α, β, γ). Using equation (5), the projection from the camera coordinates C into the image coordinate system is calculated:
[P_c, 1]^T = H_{w→c} [P_w, 1]^T    (4)
[u, v]^T = (f / z_c) [x_c, y_c]^T    (5)
Figure 5a shows the projection of a point in the real world onto an image. The original 3D point cannot be directly recovered from the image point, because the relationship is not one-to-one but one-to-many, so the inverse problem is weakly posed: different 3D points can project to the same pixel. The inverse problem is dealt with by forming the straight line containing all the points that project to the same pixel of the image, as depicted in Figure 5b.
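The forward projection of equations (4) and (5) can be sketched in a few lines; the camera pose and focal length below are invented for the example, and lens distortion and the pixel-coordinate offset are ignored.

```python
import numpy as np

def rigid_transform(t, R):
    """Build the 4x4 homogeneous matrix H_{w->c} from a rotation R
    and a translation t, as used in equation (4)."""
    H = np.eye(4)
    H[:3, :3] = R
    H[:3, 3] = t
    return H

def project(Pw, H, f):
    """Equations (4) and (5): world point -> camera frame -> image plane."""
    Pc = (H @ np.append(Pw, 1.0))[:3]            # equation (4)
    xc, yc, zc = Pc
    return np.array([f * xc / zc, f * yc / zc])  # equation (5)

# A camera at the world origin looking down +Z (identity rotation, no translation).
H = rigid_transform(t=np.zeros(3), R=np.eye(3))
uv = project(np.array([1.0, 2.0, 4.0]), H, f=2.0)
```

Note that `project` maps many 3D points to the same (u, v): every point on the ray through the optical center, which is exactly the weakly posed inverse problem described above.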
Passive techniques such as stereo-vision require only ambient light; they deal with this issue by locating the same point in multiple images and calculating the intersection of the projection lines. Other techniques project an infrared pattern and estimate the depth information from the return time. Active vision techniques such as laser triangulation, structured light, and light coding use their own illumination, whereas passive vision depends on the features of the object itself: the more distinctive features the object has, the higher the accuracy when passive vision is used. Unlike active vision techniques, passive techniques may require multiple cameras for 3D reconstruction, depending on the application.
Figure 6. From 2D to 3D, (a) Homologous points, (b) Intersection of project lines
Stereo-vision and photogrammetry are 3D reconstruction techniques that use conventional 2D imaging to measure the real dimensions of an object from images (Eisenbeiss, 2005). They calculate the intersection of the projection lines, as depicted in Figure 6, by obtaining the same point in other images, preferably in three other images, to improve the accuracy; homologous points must be selected to obtain an accurate 3D position.
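The intersection step can be illustrated as follows. Since noisy projection rays rarely intersect exactly, a common choice (assumed here, not prescribed by the chapter) is the midpoint of the shortest segment between the two rays.

```python
import numpy as np

def triangulate(o1, d1, o2, d2):
    """Midpoint of the shortest segment between two projection rays
    p = o + t*d; with noisy data the rays rarely intersect exactly."""
    d1 = np.asarray(d1, float) / np.linalg.norm(d1)
    d2 = np.asarray(d2, float) / np.linalg.norm(d2)
    o1, o2 = np.asarray(o1, float), np.asarray(o2, float)
    b = o2 - o1
    c = d1 @ d2
    denom = 1.0 - c * c                      # |d1|=|d2|=1 after normalisation
    t1 = ((d1 @ b) - c * (d2 @ b)) / denom   # closest-approach parameters
    t2 = (c * (d1 @ b) - (d2 @ b)) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
```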
Reliable detection can be ensured by using laser points or other physical marks with high contrast; Figure 7 shows such physical marks being captured and processed. To calculate the 3D position, several factors must be considered, namely the calibration parameters and the spatial positions of the cameras, since the marks can be paired using epipolar geometry and the intersections of the projection lines (Luhmann, 2006).
Markerless stereo-vision (Canny, 1987) uses a feature-tracking algorithm to find, extract, and match characteristics of objects between similar images, dispensing with physical marks, as depicted in Figure 8.
Time of flight (ToF) (Kolb, 2009), an active vision technique, projects an infrared pattern to obtain the 3D data depicted in Figure 9. A ToF camera uses light pulses as its illumination: the source is switched on for a very short time, the resulting light pulse is projected onto the objects, and the camera captures the reflected light on its sensor plane. The delay of the returning light is caused by the distance and can be calculated with equation (6).
tD = 2 · D / c (6)
where tD is the delay, D is the distance to the object, and c is the speed of light.
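Equation (6) inverts directly to give the distance from the measured delay:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_distance(t_delay):
    """Invert equation (6): D = c * tD / 2 for a round-trip delay tD."""
    return C * t_delay / 2.0
```

For instance, a delay corresponding to a 5 m object yields a distance of 5 m back.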
Structured light techniques fall into two main categories: time-multiplexing techniques and one-shot techniques (Chen, 2000). In time-multiplexing techniques there is no limit on the number of patterns, so a larger number of correspondences and a higher resolution can be obtained, but the camera, the projector, and the object all have to remain fixed. In one-shot techniques a moving camera can be used, as each line or point can be identified by its local neighborhood. Unlike a laser, the projected light is not harmful to humans. The camera records a striped light pattern transformed from its original form by the object's surface, as shown in Figure 10, and depth is obtained by calculating the intersections between lines and planes (Pages, 2006; Salvi, 1998).
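The line-plane intersection that recovers depth can be sketched as follows, assuming a camera ray p = o + t·d and a projected stripe plane given by a point and a normal (an illustrative formulation, not the specific one used in the cited works):

```python
import numpy as np

def ray_plane_intersect(origin, direction, plane_point, plane_normal):
    """Depth recovery in structured light: intersect a camera ray with
    the plane of a projected light stripe."""
    origin = np.asarray(origin, float)
    direction = np.asarray(direction, float)
    plane_point = np.asarray(plane_point, float)
    plane_normal = np.asarray(plane_normal, float)
    # Solve (origin + t*direction - plane_point) . normal = 0 for t
    t = ((plane_point - origin) @ plane_normal) / (direction @ plane_normal)
    return origin + t * direction
```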
Stereo-vision and photogrammetry (Rusu, 2011) are found in the automotive industry for car body deformation measurements and the adjustment of tools and rigs; in the aerospace industry for the measurement and adjustment of mounting rigs and the alignment between parts; in wind energy systems for deformation measurements and production control; and in construction for water dams, tanks, and plant facilities. Stereo-vision and photogrammetry offer accurate and highly precise measurements.
There are three main approaches to photogrammetric solutions (Hefele, 2001). The first, called forward intersection, relies on two or more static cameras observing moving targets. In the second approach, called resection, one or more cameras are placed to observe a fixed target. The third approach, called bundle adjustment, combines the first and second approaches.
In the laser triangulation technique (Mahmud, 2011), a laser emitter, the camera, and the measured point form a triangle, as shown in Figure 11. The shape and size of the triangle are known: the distance between the camera and the laser emitter is known, the angle at the laser emitter corner is known, and the angle at the camera corner is determined by the location of the laser dot in the image.
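With two known base angles and the known baseline, the distance follows from the law of sines. The following sketch assumes the angles are measured at the emitter and camera ends of the baseline; the function name and parameterization are illustrative.

```python
import numpy as np

def laser_triangulate(baseline, laser_angle, camera_angle):
    """Perpendicular distance from the camera-emitter baseline to the
    laser dot, given the two base angles of the triangle (law of sines)."""
    dot_angle = np.pi - laser_angle - camera_angle     # angles sum to pi
    camera_to_dot = baseline * np.sin(laser_angle) / np.sin(dot_angle)
    return camera_to_dot * np.sin(camera_angle)
```

For example, a 2 m baseline with both base angles at 45 degrees places the dot 1 m from the baseline.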
Structured light supplies visual features that do not depend on the site. In Claes (2007) a solution is presented, shown in Figure 12, where a single camera is attached to the end effector and a static projector illuminates the workpiece. The reported accuracy is 3 mm, which is appropriate for concrete implementation. Pages (2006) proposed coded structured light, in which the coded light pattern provides a robust visual feature. Even in the presence of occlusions, positioning the robot with respect to a planar object yields well-accepted results.
Light coding techniques (Patra, 2012; Susperregi, 2013) have many industrial and robotic applications and are also used for people tracking in video games, a significant innovation offered at a very low cost. They have some shortcomings, however, as they produce a noisy point cloud. One solution is to combine an HD camera with light coding sensors to obtain a high-resolution point cloud; experimental results show that this approach works in both indoor and outdoor sites with a significant increase in the resolution of the point cloud. To improve the detection of people by mobile robots, thermal sensors can be placed on mobile platforms and combined with supervised learning classifiers, since humans can be distinguished from other objects through their thermal characteristics; the experimental results show a decreased false positive rate. Wang (2013) shows the use of sensor data and virtual 3D robot models for the collision detection depicted in Figure 13: the shop floor environment is displayed as 3D models connected to motion sensors to simulate the real environment, while unstructured foreign objects, including mobile operators, are added by light coding sensors.
Applications of machine vision systems can be exploited in many different domains. In Aldrich (2010), a comprehension of the processes occurring in froth flotation systems has long been held key to understanding their overall behavior; since humans cannot diagnose and predict the states related to the more advanced structures in the froth, the developed machine vision techniques enable such industries to extract image features. Figure 14 gives an indication of the application of froth imaging systems in the mineral processing industries.
Figure 14 shows that the majority of applications (approximately 48.2%) have been reported in the
base metals (BM) industry, which mostly includes copper, lead, and zinc, with a few papers related to
nickel, magnesium, and tin. Application in the coal industries (30.4%), particularly in China, is second,
followed by application (12.5%) in the platinum group metal (PGM) industry, mostly in South Africa.
The balance of the applications reported in the literature (8.9%) is associated with oxides, such as P2O5,
SiO2, and CaO.
In Gomes (2011) the authors review the field of surgical robotics and how computer vision techniques such as image processing mimic human abilities in different operations. The key player among such systems is the da Vinci robotic system; accuracy has been achieved through the use of image processing and image registration in the robotic system.
In Lapray (2016) the implementation of the HDR-2 video system was first prototyped on the HDR-ARtiSt platform with the limitation P = 2, using only two frames to generate each HDR frame. The authors decided to continuously update the exposure times from frame to frame to minimize the number of saturated pixels by instantaneously handling any change in the light conditions. The estimation of the best exposure times is computed from the 64-level histogram q provided automatically by the sensor in the data-stream header of each frame. For each low-exposure frame (IL) and each high-exposure frame (IH), the ratios QL and QH of pixels in the four lower levels and the four higher levels of the histogram are evaluated:
QL = ∑(h=1 to 4) q(h)/N,  QH = ∑(h=61 to 64) q(h)/N (7)
where q(h) is the number of pixels in histogram category h and N is the total number of pixels.
A series of decisions is then performed to evaluate the low-exposure time (ΔtL,t+1) and the high-exposure time (ΔtH,t+1) for the next iteration (t + 1) of the acquisition process,
where x is the integration time of one sensor row. QL,req and QH,req are, respectively, the required numbers of pixels for the low levels and the high levels of the histogram. To converge to the best exposure times as fast as possible, two different thresholds are used for each exposure time: thrLm and thrLp are thresholds for the low-exposure time, whereas thrHm and thrHp are for the high-exposure time. In this design, the values of QL,req and QH,req are fixed at 10% of the sensor's pixels; the values of thrLm and thrHm are fixed at 1%, whereas thrLp and thrHp are fixed at 8%.
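The histogram ratios of equation (7) and the per-frame decision rule can be sketched as follows. The exact update equations are not reproduced in the text, so the dead-band logic below is an illustrative assumption, not the published rule.

```python
import numpy as np

def exposure_ratios(hist):
    """Q_L and Q_H from equation (7): fractions of pixels in the four
    lowest and four highest bins of the sensor's 64-level histogram."""
    hist = np.asarray(hist, dtype=float)
    n = hist.sum()
    return hist[:4].sum() / n, hist[-4:].sum() / n

def update_exposure(dt, q, q_req, thr_m, thr_p, step):
    """Assumed form of the decision: raise the exposure time when too few
    pixels reach the watched levels, lower it when too many do, and hold
    it inside the dead band [q_req - thr_m, q_req + thr_p]."""
    if q < q_req - thr_m:
        return dt + step
    if q > q_req + thr_p:
        return dt - step
    return dt
```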
Detecting the temperature of a scene with thermal cameras can have many advantages (Gade, 2014). Detecting objects by their temperature opens a wide range of research and applications, since temperature may give clues to health or to the type of object or material. In animal husbandry and agriculture, disease in animals can easily be detected by thermal cameras. The stress level of animals before slaughter is important to meat quality; it is correlated with the blood and body temperature of the animal, so it is important to monitor and react to a rising temperature. Another application is the detection of heat loss in buildings, and many applications have been reviewed and conducted for humans, such as detecting facial expressions and medical analysis using thermal cameras.
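In its simplest form, temperature-based detection reduces to thresholding a calibrated thermal image. The band limits below are illustrative values (roughly human body temperature), not ones given by Gade (2014).

```python
import numpy as np

def warm_object_mask(thermal_image, min_temp, max_temp):
    """Segment pixels whose temperature falls inside a band of interest
    (e.g. human body temperature) in a calibrated thermal image."""
    img = np.asarray(thermal_image)
    return (img >= min_temp) & (img <= max_temp)
```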
Detecting and evaluating quality in the industrial, medical, agriculture, and aquaculture domains, as reviewed by Tseng (2015), has many applications aided by computer vision systems: automated diagnosis, automatic target defect identification, intelligent fault diagnosis, intelligent quality management, on-line dimensional measurement, optimization, quality control, and quality monitoring, as in e-quality control.
In Rashidi (2016) an approach is proposed to detect building materials in images. The approach has two steps: feature extraction followed by a supervised learning algorithm, specifically classification. First, the technique extracts the feature set; then a classifier takes the combined feature set and classifies the objects. The detection is thus performed using supervised machine learning algorithms, especially RBF, SVM, and MLP, and the model with the best results and performance is chosen.
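The two-step pipeline (feature extraction, then supervised classification) can be illustrated with a deliberately simple stand-in classifier. This nearest-centroid sketch is not one of the RBF, SVM, or MLP models compared in the study; the material labels are hypothetical.

```python
import numpy as np

class NearestCentroid:
    """Minimal stand-in for a supervised material classifier: fit one
    centroid per class from the combined feature vectors, then label a
    sample by its closest centroid."""
    def fit(self, X, y):
        self.labels_ = sorted(set(y))
        self.centroids_ = {
            c: np.mean([x for x, t in zip(X, y) if t == c], axis=0)
            for c in self.labels_
        }
        return self

    def predict(self, x):
        x = np.asarray(x, float)
        return min(self.labels_,
                   key=lambda c: np.linalg.norm(x - self.centroids_[c]))
```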
REFERENCES
Aldrich, C., Marais, C., Shean, B. J., & Cilliers, J. J. (2010). Online monitoring and control of froth
flotation systems with machine vision: A review. International Journal of Mineral Processing, 96(1),
1–13. doi:10.1016/j.minpro.2010.04.005
Beattie, C., Leibo, J. Z., Teplyashin, D., Ward, T., Wainwright, M., Küttler, H., . . . Schrittwieser, J.
(2016). Deepmind lab. arXiv preprint arXiv:1612.03801
Beltrán Ortega, J., Gila, M., Diego, M., Aguilera Puerto, D., Gámez García, J., & Gómez Ortega, J.
(2016). Novel technologies for monitoring the in‐line quality of virgin olive oil during manufacturing
and storage. Journal of the Science of Food and Agriculture, 96(14), 4644–4662. doi:10.1002/jsfa.7733
PMID:27012363
Benalia, S., Cubero, S., Prats-Montalbán, J. M., Bernardi, B., Zimbalatti, G., & Blasco, J. (2016). Com-
puter vision for automatic quality inspection of dried figs (Ficus carica L.) in real-time. Computers and
Electronics in Agriculture, 120, 17–25. doi:10.1016/j.compag.2015.11.002
Canny, J. (1987). A computational approach to edge detection. In Readings in Computer Vision (pp.
184-203). Academic Press. doi:10.1016/B978-0-08-051581-6.50024-6
Chen, F., Brown, G. M., & Song, M. (2000). Overview of 3-D shape measurement using optical methods.
Optical Engineering (Redondo Beach, Calif.), 39(1), 10–23. doi:10.1117/1.602438
Claes, K., & Bruyninckx, H. (2007, August). Robot positioning using structured light patterns suitable
for self calibration and 3D tracking. Proceedings of the 2007 International Conference on Advanced
Robotics.
Cubero, S., Lee, W. S., Aleixos, N., Albert, F., & Blasco, J. (2016). Automated systems based on ma-
chine vision for inspecting citrus fruits from the field to postharvest—a review. Food and Bioprocess
Technology, 9(10), 1623–1639. doi:10.1007/s11947-016-1767-1
De Fauw, J., Keane, P., Tomasev, N., Visentin, D., van den Driessche, G., Johnson, M., ... Peto, T. (2017).
Automated analysis of retinal imaging using machine learning techniques for computer vision. F1000
Research, 5. PMID:27830057
DeCost, B. L., & Holm, E. A. (2015). A computer vision approach for automated analysis and classifi-
cation of microstructural image data. Computational Materials Science, 110, 126–133. doi:10.1016/j.
commatsci.2015.08.011
Di Leo, G., Liguori, C., Pietrosanto, A., & Sommella, P. (2016). A vision system for the online quality
monitoring of industrial manufacturing. Optics and Lasers in Engineering, 89, 162–168. doi:10.1016/j.
optlaseng.2016.05.007
Eisenbeiss, H., Lambers, K., Sauerbier, M., & Li, Z. (2005). Photogrammetric documentation of an
archaeological site (Palpa, Peru) using an autonomous model helicopter. In CIPA 2005 (pp. 238-243).
Academic Press.
El-Gamal, F. E. Z. A., Elmogy, M., & Atwan, A. (2016). Current trends in medical image registration
and fusion. Egyptian Informatics Journal, 1(17), 99–124. doi:10.1016/j.eij.2015.09.002
Escalera, S., Athitsos, V., & Guyon, I. (2016). Challenges in multimodal gesture recognition. Journal
of Machine Learning Research, 17(72), 1–54.
Faugeras, O. (1993). Three-dimensional computer vision: a geometric viewpoint. MIT Press.
Gade, R., & Moeslund, T. B. (2014). Thermal cameras and applications: A survey. Machine Vision and
Applications, 25(1), 245–262. doi:10.1007/s00138-013-0570-5
Gomes, P. (2011). Surgical robotics: Reviewing the past, analysing the present, imagining the future.
Robotics and Computer-integrated Manufacturing, 27(2), 261–266. doi:10.1016/j.rcim.2010.06.009
Gutschoven, B., & Verlinde, P. (2000, July). Multi-modal identity verification using support vector
machines (SVM). In Information Fusion, 2000. FUSION 2000. Proceedings of the Third International
Conference on (Vol. 2, pp. THB3-3). IEEE.
Havaei, M., Davy, A., Warde-Farley, D., Biard, A., Courville, A., Bengio, Y., ... Larochelle, H. (2016).
Brain tumor segmentation with deep neural networks. Medical Image Analysis, 35, 18–31. doi:10.1016/j.
media.2016.05.004 PMID:27310171
Hefele, J., & Brenner, C. (2001, February). Robot pose correction using photogrammetric tracking. In
Machine Vision and Three-Dimensional Imaging Systems for Inspection and Metrology (Vol. 4189, pp.
170–179). International Society for Optics and Photonics. doi:10.1117/12.417194
Jazebi, F., & Rashidi, A. (2013). An automated procedure for selecting project managers in construction
firms. Journal of Civil Engineering and Management, 19(1), 97–106. doi:10.3846/13923730.2012.738707
Kolb, A., Barth, E., Koch, R., & Larsen, R. (2009, March). Time-of-Flight Sensors in Computer Graph-
ics. In Eurographics (STARs) (pp. 119-134). Academic Press.
Lapray, P. J., Heyrman, B., & Ginhac, D. (2016). HDR-ARtiSt: An adaptive real-time smart camera for
high dynamic range imaging. Journal of Real-Time Image Processing, 12(4), 747–762. doi:10.1007/s11554-013-0393-7
Liu, M., Ma, J., Lin, L., Ge, M., Wang, Q., & Liu, C. (2017). Intelligent assembly system for mechanical
products and key technology based on internet of things. Journal of Intelligent Manufacturing, 28(2),
271–299. doi:10.1007/s10845-014-0976-6
Luhmann, T., Robson, S., Kyle, S. A., & Harley, I. A. (2006). Close range photogrammetry: principles,
techniques and applications. Whittles.
Mahmud, M., Joannic, D., Roy, M., Isheil, A., & Fontaine, J. F. (2011). 3D part inspection path planning
of a laser scanner with control on the uncertainty. Computer Aided Design, 43(4), 345–355. doi:10.1016/j.
cad.2010.12.014
Man, Z., Lee, K., Wang, D., Cao, Z., & Khoo, S. (2013). An optimal weight learning machine for hand-
written digit image recognition. Signal Processing, 93(6), 1624–1638. doi:10.1016/j.sigpro.2012.07.016
Norouzi, A., Rahim, M. S. M., Altameem, A., Saba, T., Rad, A. E., Rehman, A., & Uddin, M. (2014).
Medical image segmentation methods, algorithms, and applications. IETE Technical Review, 31(3),
199–213. doi:10.1080/02564602.2014.906861
Pages, J., Collewet, C., Chaumette, F., & Salvi, J. (2006, June). A camera-projector system for robot po-
sitioning by visual servoing. In Computer Vision and Pattern Recognition Workshop, 2006. CVPRW’06.
Conference on (pp. 2-2). IEEE. 10.1109/CVPRW.2006.9
Patra, S., Bhowmick, B., Banerjee, S., & Kalra, P. (2012). High Resolution Point Cloud Generation from
Kinect and HD Cameras using Graph Cut. VISAPP, 12(2), 311–316.
Rashidi, A., Sigari, M. H., Maghiar, M., & Citrin, D. (2016). An analogy between various machine-learning
techniques for detecting construction materials in digital images. KSCE Journal of Civil Engineering,
20(4), 1178–1188. doi:10.1007/s12205-015-0726-0
Rusu, R. B., & Cousins, S. (2011, May). 3d is here: Point cloud library (pcl). In Robotics and automa-
tion (ICRA), 2011 IEEE International Conference on (pp. 1-4). IEEE.
Saberioon, M., Gholizadeh, A., Cisar, P., Pautsina, A., & Urban, J. (2016). Application of machine vision
systems in aquaculture with emphasis on fish: State‐of‐the‐art and key issues. Reviews in Aquaculture.
Salvi, J. (1998). An approach to coded structured light to obtain three dimensional information. Uni-
versitat de Girona.
Salvi, J., Armangué, X., & Batlle, J. (2002). A comparative review of camera calibrating methods with
accuracy evaluation. Pattern Recognition, 35(7), 1617–1635. doi:10.1016/S0031-3203(01)00126-1
Susperregi, L., Sierra, B., Castrillón, M., Lorenzo, J., Martínez-Otzeta, J. M., & Lazkano, E. (2013).
On the use of a low-cost thermal sensor to improve kinect people detection in a mobile robot. Sensors
(Basel), 13(11), 14687–14713. doi:10.3390/s131114687 PMID:24172285
Tajbakhsh, N., Shin, J. Y., Gurudu, S. R., Hurst, R. T., Kendall, C. B., Gotway, M. B., & Liang, J. (2016).
Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transac-
tions on Medical Imaging, 35(5), 1299–1312. doi:10.1109/TMI.2016.2535302 PMID:26978662
Tscharke, M., & Banhazi, T. M. (2016). A brief review of the application of machine vision in livestock
behaviour analysis. Journal of Agricultural Informatics, 7(1), 23-42.
Tseng, T. L. B., Aleti, K. R., Hu, Z., & Kwon, Y. J. (2015). E-quality control: A support vector machines
approach. Journal of Computational Design and Engineering, 3(2), 91–101. doi:10.1016/j.jcde.2015.06.010
Usamentiaga, R., Molleda, J., & Garcia, D. F. (2014). Structured-light sensor using two laser stripes for
3D reconstruction without vibrations. Sensors (Basel), 14(11), 20041–20063. doi:10.3390/s141120041
PMID:25347586
Wang, L., Schmidt, B., & Nee, A. Y. (2013). Vision-guided active collision avoidance for human-robot
collaborations. Manufacturing Letters, 1(1), 5–8. doi:10.1016/j.mfglet.2013.08.001