Several Data Analysis and Processing of Electronic Nose Data Preprocessing Subsystem
Abstract—An electronic nose is a system that imitates biological olfactory organs. It uses the different response sequences of a gas sensor array to identify gases. The electronic nose system consists of a gas sensor array subsystem, a data analysis subsystem, a pattern recognition subsystem and a result discrimination subsystem. The development of the electronic nose has two significant directions: miniaturization and systematization. This paper introduces the functions and components of the electronic nose system, focusing on several data analysis methods used in it, mainly k-nearest neighbor, linear discriminant analysis, principal component analysis and cluster analysis. The processing characteristics of these methods are analyzed and explained.

Keywords—electronic nose; Data Preprocessing; Subsystem; Data Analysis

I. INTRODUCTION
An electronic nose is a system that mimics biological olfactory organs. It is an electronic system that uses the different response sequences of a gas sensor array to identify gases. Its main mechanism is that each sensor in the gas sensor array has a different sensitivity, and therefore a different response, to each measured gas; in this sense each sensor is similar to an olfactory cell.

The electronic nose system can be divided into three important parts: gas sensor array, data preprocessing and pattern recognition, as shown in Figure 1. In terms of function, the sensor array is similar to the olfactory cells, data preprocessing is equivalent to the olfactory bulb, and pattern recognition is equivalent to the olfactory center.

[Figure 1: block diagram; the surviving labels are "Gas to be measured" and "Discriminant results"]

II. PATTERN RECOGNITION
Treating large-scale networks as a class of open complex systems, we extend the concept of traditional network survivability to them. For a large-scale network system S, when one or more subsystems Si suffer internal or external disturbances such as an external attack or a system failure, S maintains continuous service through adaptation, configuration, restoration and evolution, and keeps the entire network far from a failure state. This behavioral characteristic is called survivability.

Pattern recognition describes and distinguishes things by analyzing the information that represents them. The pattern recognition system of an electronic nose analyzes and predicts the composition of a gas mixture algorithmically, after the output signal of the sensor array has been preprocessed. According to the kind of recognition result, current pattern recognition analysis can be divided into two types: qualitative recognition and quantitative recognition.

Qualitative recognition only requires the correct determination of the composition of the detected gas mixture, while quantitative recognition requires the correct determination of both the composition and the corresponding concentrations of the mixture.
Authorized licensed use limited to: Zhejiang University. Downloaded on April 20,2023 at 03:03:30 UTC from IEEE Xplore. Restrictions apply.
According to different recognition principles, pattern recognition can be divided into statistical numerical analysis and artificial neural networks. Artificial neural network pattern recognition algorithms can realize both qualitative and quantitative recognition. This paper mainly introduces several statistical qualitative recognition algorithms, including k-nearest neighbor [1], linear discrimination [2], principal component analysis [3] and clustering analysis.

A. K-Nearest Neighbor
Survivability is a fundamental characteristic of a large-scale network, and it does not disappear due to system evolution or changes in the external environment. In a large-scale network, information is exchanged between failed subsystems and other subsystems, which may cause cascading failures. An increase in the number of failed subsystems may cause the whole large-scale network to fail.

K-Nearest Neighbor (k-NN) is a relatively simple classification algorithm. Its principle is to store a certain number of labeled sample instances in advance. When a new instance to be tested arrives, the system queries the K stored sample instances closest to it and assigns the instance to the category to which most of those neighbors belong.

As shown in Figure 2, the blue squares and red triangles are the two types of stored sample instances, while the green circle is the instance to be tested. When k = 3, red triangles account for two of the three neighbors of the green circle inside the solid circle, so the green circle is classified as a red triangle according to the k-NN algorithm. However, when k = 5, blue squares account for three of the five neighbors inside the dotted circle, so the green circle is classified into the blue square category.

Different values of k can therefore produce different results. Such a result is not acceptable, and it is usually corrected by giving different weights to neighbors at different distances. It can be seen from the above that the k-NN algorithm is suitable for classifying class domains with large sample sizes, and that it has obvious advantages when the class domains of the samples cross or overlap heavily.

B. Linear Discriminant Analysis
Linear discriminant analysis (LDA) is a statistical analysis method. Its basic idea is to project high-dimensional sample instances into the optimal discriminant space through a certain transformation, so as to extract classification information and reduce the spatial dimension, while ensuring that, after projection, different sample categories have the largest between-class distance and the smallest within-class distance.

Fig 3 The example 1 of LDA

Fig 3 and Fig 4 show LDA examples with instance samples of two categories. In Fig 3, the projections of the two categories onto the blue line overlap each other, and no separation is achieved. In Fig 4, the projections of the two categories onto the blue line are well separated, and compactness is maintained within each category.
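To make the k-NN voting rule of Section A concrete, the following is a minimal Python sketch rather than the implementation used in this paper. The coordinates are hypothetical and were chosen only to mimic the situation of Figure 2; the function names `knn_majority` and `knn_weighted`, and the inverse-distance weight 1/d, are assumptions made for illustration (the text only says that neighbors at different distances receive different weights).

```python
import math
from collections import Counter, defaultdict

def nearest(train, query, k):
    """Return the k stored (point, label) pairs closest to query (Euclidean)."""
    return sorted(train, key=lambda item: math.dist(item[0], query))[:k]

def knn_majority(train, query, k):
    """Plain k-NN: each of the k neighbors casts one equal vote."""
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_weighted(train, query, k, eps=1e-9):
    """Distance-weighted k-NN: a neighbor at distance d votes with weight 1/d."""
    scores = defaultdict(float)
    for point, label in nearest(train, query, k):
        scores[label] += 1.0 / (math.dist(point, query) + eps)
    return max(scores, key=scores.get)

# Hypothetical stored samples (not data from the paper).
train = [
    ((1.0, 0.0), "red"),
    ((0.0, 1.2), "red"),
    ((-1.5, 0.0), "blue"),
    ((0.0, -2.0), "blue"),
    ((2.5, 0.0), "blue"),
    ((4.0, 4.0), "red"),
]
query = (0.0, 0.0)  # the instance to be tested

for k in (3, 5):
    print(k, knn_majority(train, query, k), knn_weighted(train, query, k))
```

With these made-up points the majority vote returns red for k = 3 and blue for k = 5, while the distance-weighted vote returns red in both cases, which is the kind of correction the weighting is meant to provide.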
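The projection idea of linear discriminant analysis can be sketched for the simplest case of two categories in two dimensions. The code below is an illustrative sketch under stated assumptions, not the procedure used in this paper: it uses the classical two-class Fisher direction, w proportional to Sw^-1 (m_a - m_b), where Sw is the summed within-class scatter matrix and m_a, m_b are the class means. The sample points and all function names are hypothetical.

```python
def mean(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, m):
    """Entries (sxx, sxy, syy) of the scatter matrix of one class about m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy

def fisher_direction(class_a, class_b):
    """Two-class Fisher direction in 2-D: w = Sw^-1 (m_a - m_b)."""
    ma, mb = mean(class_a), mean(class_b)
    a, b = scatter(class_a, ma), scatter(class_b, mb)
    sxx, sxy, syy = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    det = sxx * syy - sxy * sxy  # assumed non-zero (Sw invertible)
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

def project(points, w):
    """Scalar projection of each point onto direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]

# Hypothetical samples: two elongated, parallel classes.
class_a = [(1.0, 2.0), (2.0, 3.2), (3.0, 3.9), (4.0, 5.1)]
class_b = [(2.0, 0.9), (3.0, 2.1), (4.0, 2.8), (5.0, 4.0)]

w = fisher_direction(class_a, class_b)
print("projection on x-axis:", project(class_a, (1.0, 0.0)), project(class_b, (1.0, 0.0)))
print("projection on w:     ", project(class_a, w), project(class_b, w))
```

For these points the projections onto the x-axis overlap, as in the poorly separated case of Fig 3, whereas the projections onto the Fisher direction fall into two disjoint ranges, as in Fig 4.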
C. Principal component analysis
Principal component analysis (PCA) is a multivariate statistical method. When the sample set to be tested has multiple variables, a small number of important variables (the principal components) are derived as the main reference by mathematical methods such as linear transformation.

The main idea is to reduce the dimension. First, the index data are standardized by the corresponding software algorithm; then the correlation between the indicators is determined; and then the expressions of the corresponding components are determined according to the number of principal components. In judging the correlation of the indicators, we need to obtain the principal components, that is, to find the component with the largest difference (the largest variance). The component with the largest variance is taken as the first principal component, and the component with the second-largest variance as the second principal component. The remaining principal components are selected in turn in the same way.

D. Cluster analysis
Cluster analysis is also an analysis method based on multivariate statistics. Following the principle that "birds of a feather flock together", the sample instances are grouped spontaneously according to their own characteristics. The advantage of this algorithm is that there is no need to define the classes of the sample set in advance, and when a sample to be tested cannot be merged into any class that has already been formed, it can create a new class of its own.
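The grouping behaviour described for cluster analysis, where no classes are fixed in advance and a sample that fits no existing class starts a new one, can be illustrated with a simple threshold-based scheme (often called leader clustering). This is only one possible realization and is not claimed to be the method used in this paper; the distance threshold, the sample values and the function name `leader_clustering` are all assumptions made for the example.

```python
import math

def leader_clustering(samples, threshold):
    """Assign each sample to the nearest existing cluster centre, or start a
    new cluster when no centre lies within `threshold` of the sample."""
    centres = []  # running mean of each cluster
    members = []  # samples assigned to each cluster
    labels = []   # cluster index of every sample, in input order
    for s in samples:
        best, best_d = None, threshold
        for i, c in enumerate(centres):
            d = math.dist(s, c)
            if d <= best_d:
                best, best_d = i, d
        if best is None:
            # The sample cannot be merged into any existing class: open a new one.
            centres.append(tuple(s))
            members.append([s])
            labels.append(len(centres) - 1)
        else:
            members[best].append(s)
            n = len(members[best])
            centres[best] = tuple(sum(col) / n for col in zip(*members[best]))
            labels.append(best)
    return labels, centres

# Hypothetical two-feature samples (not data from the paper).
samples = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8), (0.1, 0.3), (9.0, 0.0)]
labels, centres = leader_clustering(samples, threshold=1.0)
print(labels)
print(centres)
```

Here the last sample lies far from both existing groups and therefore opens a third cluster. The result of this scheme depends on the order of the samples and on the chosen threshold, so in practice both would need to be tuned to the sensor data.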
Fig 6 The example 1 of PCA

REFERENCES
[1] R.J. Ellison, D.A. Fisher, and R.C. Linger, Survivable Network System: An Emerging Discipline. Technical Report, CMU/SEI-97-TR-013, Carnegie Mellon University, 1997.
[2] R.J. Ellison, D.A. Fisher, and R.C. Linger, “Survivability: Protecting Your Critical Systems,” IEEE Internet Computing, vol. 3, issue 6, pp. 55-63, 1999.
[3] D. Dan, Y.Q. Zhang, “Research on Definition of Network Survivability,” Journal of Computer Research and Development, vol. 43(Suppl.), pp. 525-529, 2006.
[4] K.Q. Zhao, Set pair analysis and its preliminary application. Hangzhou, Zhejiang Science and Technology Press, 2000.
[5] K.Q. Zhao, A.L. Xuan, “Set Pair Theory - A New Theory Method of Non-Define and Its Applications,” System Engineering, vol. 14, issue 1, pp. 18-23, 1996.
[6] Y.L. Jiang, C.F. Xu, “Advances in Set Pair Analysis Theory and its Applications,” Computer Science, vol. 33, issue 1, pp. 205-209, 2006.