
Several Data Analysis and Processing of Electronic Nose Data Preprocessing Subsystem

The document discusses several data analysis and processing techniques for electronic nose data preprocessing subsystems. It introduces electronic nose systems and their components, focusing on common data analysis methods for preprocessing electronic nose data, including k-nearest neighbor, linear discriminant analysis, principal component analysis, and cluster analysis. It analyzes the processing characteristics of these methods and explains how pattern recognition in electronic nose systems works to analyze and preliminarily judge the composition of gas mixtures through algorithms after preprocessing sensor array output signals.


IAEAC 2021 (ISSN 2689-6621)

Several Data Analysis and Processing of Electronic Nose Data Preprocessing Subsystem
2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC) | 978-1-7281-8028-1/20/$31.00 ©2021 IEEE | DOI: 10.1109/IAEAC50856.2021.9390785

Huan Zhang1, Hongli Tai1


1. Department of Information Engineering, Sichuan Staff University of Science Technology, Chengdu, China
[email protected], [email protected]
Corresponding Author: Hongli Tai Email: [email protected]

Abstract—An electronic nose is a system that imitates biological olfactory organs. It uses the different response sequences of a gas sensor array to identify gases. The electronic nose system consists of a gas sensor array subsystem, a data analysis subsystem, a pattern recognition subsystem, and a result discrimination subsystem. The development of the electronic nose has two significant directions: miniaturization and systematization. This paper introduces the functions and components of the electronic nose system, focusing on several data analysis methods used in it: k-nearest neighbor, linear discriminant analysis, principal component analysis, and cluster analysis. The processing characteristics of these methods are analyzed and explained.

Keywords—electronic nose; data preprocessing; subsystem; data analysis

I. INTRODUCTION

An electronic nose is a system that mimics biological olfactory organs. It is an electronic system that uses the different response sequences of a gas sensor array to identify gases. Its main mechanism is that each sensor in the array has a different sensitivity, and therefore a different response, to each measured gas; in this sense each sensor is similar to an olfactory cell.

The electronic nose system can be divided into three important parts: the gas sensor array, data preprocessing, and pattern recognition, as shown in Figure 1. In terms of function, the sensor array is similar to the olfactory cells, data preprocessing is equivalent to the olfactory vesicle, and pattern recognition is equivalent to the olfactory center.

II. PATTERN RECOGNITION

As a class of open complex system, we extend the concept of traditional network survivability to large-scale networks. For a large-scale network system S, when one or more subsystems Si suffer internal or external disturbances such as external attack or system failure, S maintains continuous service through adaptation, configuration, restoration, and evolution, and keeps the entire network far from a failure state. This behavioral characteristic is called survivability.

Pattern recognition describes and distinguishes things by analyzing the information that represents them. The pattern recognition system of the electronic nose preprocesses the output signal of the sensor array and then uses an algorithm to analyze and make a preliminary judgment on the composition of the gas mixture. According to the kind of result required, current pattern recognition analysis can be divided into two types: qualitative recognition and quantitative recognition.

Qualitative recognition only requires correctly determining the composition of the detected gas mixture, while quantitative recognition requires correctly determining both the composition and the corresponding concentrations of the mixture.

Fig. 1. Block diagram of the electronic nose system: gas to be measured → sensor array (olfactory cells) → data analysis (olfactory vesicle) → pattern recognition (olfactory center) → discriminant results.

Authorized licensed use limited to: Zhejiang University. Downloaded on April 20,2023 at 03:03:30 UTC from IEEE Xplore. Restrictions apply.
According to the recognition principle, pattern recognition can be divided into statistical numerical analysis and artificial neural networks. Artificial neural network algorithms can realize both qualitative and quantitative recognition. This paper mainly introduces several statistical qualitative recognition algorithms: k-nearest neighbor [1], linear discriminant analysis [2], principal component analysis [3], and cluster analysis.

A. K-Nearest Neighbor

Survivability is a fundamental characteristic of a large-scale network and does not disappear due to system evolution or changes in the external environment. In a large-scale network, information is exchanged between failed subsystems and other subsystems, which may cause cascading failures. An increase in the number of failed subsystems may cause the whole large-scale network to fail.

K-nearest neighbor (k-NN) is a relatively simple classification algorithm. A certain number of labeled sample instances are stored in advance. When a new instance to be tested arrives, the system finds the k stored instances closest to it and assigns the new instance to the category to which most of those neighbors belong.

As shown in Figure 2, the blue squares and red triangles are the two types of stored sample instances, and the green circle is the instance to be tested. When k = 3, red triangles account for two of the three neighbors of the green circle inside the solid circle, so the green circle is classified as a red triangle. However, when k = 5, the dotted circle contains three blue squares, so the green circle is classified as a blue square.

Fig. 2. Example of k-nearest neighbor.

From the above analysis, we can see the following points: (1) k is generally chosen to be odd, to prevent tied votes; (2) different values of k may produce completely opposite results, so the selection of k is very important. When a change in k alone produces such a large difference in the result, the result is hard to accept; this can usually be corrected by giving different weights to neighbors at different distances. It can be seen from the above that the k-NN algorithm is suitable for classifying class domains with large sample sizes, and that it has obvious advantages when the class domains of the samples cross or overlap heavily.

B. Linear Discriminant Analysis

Linear discriminant analysis (LDA) is a statistical analysis method. Its basic idea is to project high-dimensional sample instances into the optimal discriminant space through a certain transformation, thereby extracting classification information and reducing the spatial dimension, while ensuring that after projection the distance between different classes is as large as possible and the distance within each class is as small as possible.

Fig. 3. Example 1 of LDA.

Fig. 4. Example 2 of LDA.

Fig. 3 and Fig. 4 show LDA applied to instance samples of two categories. In Fig. 3, the projections of the two categories onto the blue line overlap, and separation is not achieved. In Fig. 4, the projections of the two categories onto the blue line are well separated, and samples within the same category remain compact.

Although the LDA algorithm classifies samples by making each class compact and separating the classes from one another, it runs into the "curse of dimensionality" when the number of sample categories is large.
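The k-NN vote of subsection A and the projection criterion of subsection B can be illustrated with a minimal Python sketch. The sample coordinates, the function names (`knn_predict`, `fisher_direction`, `fisher_score`), and the 1/distance weighting are assumptions chosen for illustration and are not taken from the paper. The LDA part uses the standard two-class Fisher formulation in two dimensions and assumes the within-class scatter matrix is invertible.

```python
import math
from collections import defaultdict

def knn_predict(train, query, k, weighted=False):
    """Vote among the k stored samples nearest to `query`.

    train is a list of ((x, y), label) pairs. With weighted=True each
    neighbor votes with weight 1/distance instead of 1 (an assumed scheme).
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = math.dist(point, query)
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)

def _mean(values):
    return sum(values) / len(values)

def fisher_direction(class_a, class_b):
    """Two-class Fisher LDA in 2-D: unit vector along Sw^-1 (mean_b - mean_a)."""
    sxx = sxy = syy = 0.0
    means = []
    for pts in (class_a, class_b):
        mx, my = _mean([p[0] for p in pts]), _mean([p[1] for p in pts])
        means.append((mx, my))
        for x, y in pts:  # accumulate the within-class scatter matrix Sw
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dx = means[1][0] - means[0][0]
    dy = means[1][1] - means[0][1]
    det = sxx * syy - sxy * sxy  # assumes Sw is invertible
    wx = (syy * dx - sxy * dy) / det
    wy = (sxx * dy - sxy * dx) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)

def fisher_score(w, class_a, class_b):
    """Squared gap between projected class means over within-class scatter."""
    pa = [w[0] * x + w[1] * y for x, y in class_a]
    pb = [w[0] * x + w[1] * y for x, y in class_b]
    ma, mb = _mean(pa), _mean(pb)
    within = sum((p - ma) ** 2 for p in pa) + sum((p - mb) ** 2 for p in pb)
    return (mb - ma) ** 2 / within

# Hypothetical layout in the spirit of Fig. 2: the query sits at the origin.
train = [((1.0, 0.0), "red"), ((0.0, 1.2), "red"),
         ((-1.1, 0.0), "blue"), ((0.0, -2.5), "blue"), ((2.6, 0.0), "blue")]
query = (0.0, 0.0)
print(knn_predict(train, query, k=3))                 # red: 2 of 3 neighbors
print(knn_predict(train, query, k=5))                 # blue: 3 of 5 neighbors
print(knn_predict(train, query, k=5, weighted=True))  # red: near reds outweigh far blues

# Hypothetical two-class data: elongated along one diagonal, separated across it.
class_a = [(1.0, 2.0), (2.0, 3.1), (3.0, 3.9), (4.0, 5.0)]
class_b = [(2.0, 1.0), (3.0, 2.1), (4.0, 2.9), (5.0, 4.0)]
w = fisher_direction(class_a, class_b)
print(fisher_score((1.0, 0.0), class_a, class_b))  # x-axis projection: classes overlap
print(fisher_score(w, class_a, class_b))           # Fisher direction: far larger score
```

With this data the unweighted vote flips between k = 3 and k = 5, as in the Fig. 2 discussion, while the distance-weighted vote with k = 5 returns to the nearer class. For the LDA data, projecting onto the x axis mixes the two classes (the Fig. 3 situation), whereas the Fisher direction separates them while keeping each class compact (the Fig. 4 situation).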

C. Principal Component Analysis

Principal component analysis (PCA) is a multivariate statistical method. When the sample set to be tested has multiple variables, a small number of important composite variables (the principal components) are selected as the main reference by mathematical methods such as linear transformation.

The main idea is to reduce the dimension. First, the index data are standardized; then the correlation between the indicators is determined; finally, the expressions of the components are determined according to the number of principal components retained. When judging the correlation of the indicators, the principal components are obtained by looking for the largest difference (the largest variance): the first principal component is the one with the largest variance, the second principal component is the one with the second-largest variance, and the remaining principal components are selected in turn in the same way.

Fig. 5. Example 1 of PCA.

As shown in Figure 5 and Figure 6, the five samples are projected onto two different straight lines. The projection in Figure 5 is better than that in Figure 6. In signal processing, it is generally assumed that the signal has a large variance and the noise has a small variance, so the larger the signal-to-noise ratio, the better. The projection variance of the sample points in Fig. 5 is clearly larger than that in Fig. 6.

D. Cluster Analysis

Cluster analysis (CA) is also an analysis method based on multivariate statistics. Following the principle that "birds of a feather flock together", the sample cases are grouped spontaneously according to their own characteristics. The advantage of this approach is that no classified sample set needs to be prepared in advance, and when a sample to be tested cannot be merged into any existing cluster, it can form a new class of its own.

Fig. 7. Example of CA.

Many CA algorithms have been developed, mostly based on distance, such as k-means and CLARA. As shown in Figure 7, the red triangles, green inverted triangles, blue inverted triangles, and blue squares are four different types of substances. Under the CA algorithm they spontaneously aggregate, and classification is essentially achieved.

III. CONCLUSIONS

In this paper, we clarify the working mechanism of the electronic nose system and describe its three important parts: the sensor array, data preprocessing, and pattern recognition. We then focus on four common algorithms of the electronic nose preprocessing subsystem. Finally, we analyze the effectiveness of these four methods through case modeling. Future research will aim to improve the analysis and identification ability of the preprocessing subsystem.
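As a supplementary illustration of subsections C and D, the following minimal Python sketch compares projection variances in the way the Fig. 5 and Fig. 6 discussion does, and runs a basic k-means clustering. The data points, the function names, and the choice of the first k points as initial centroids are assumptions made for illustration, not the authors' procedure. The PCA part uses the closed-form principal axis of a 2x2 covariance matrix and omits the standardization step for brevity.

```python
import math

def center(points):
    """Subtract the mean from every 2-D point."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    return [(x - mx, y - my) for x, y in points]

def projection_variance(points, direction):
    """Sample variance of the centered points projected onto a unit vector."""
    proj = [x * direction[0] + y * direction[1] for x, y in center(points)]
    return sum(t * t for t in proj) / (len(points) - 1)

def first_principal_direction(points):
    """Direction of largest variance for 2-D data (closed form, 2x2 covariance)."""
    c = center(points)
    sxx = sum(x * x for x, _ in c)
    syy = sum(y * y for _, y in c)
    sxy = sum(x * y for x, y in c)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def kmeans(points, k, iters=10):
    """Lloyd's k-means; the first k points serve as initial centroids (assumed)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # keep the old centroid if a cluster empties
                centroids[i] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    return labels, centroids

# Five hypothetical samples lying close to a diagonal line.
samples = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
pc1 = first_principal_direction(samples)
pc2 = (-pc1[1], pc1[0])  # unit vector orthogonal to pc1
var_pc1 = projection_variance(samples, pc1)
var_pc2 = projection_variance(samples, pc2)
print(var_pc1, var_pc2)  # the first line keeps far more variance than the second

# Two hypothetical groups of points; no labels are supplied in advance.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.2), (8.0, 8.0), (8.5, 9.0), (9.0, 8.2)]
labels, centroids = kmeans(points, k=2)
print(labels)  # the first three points share one label, the last three the other
```

Projecting onto the first principal direction corresponds to the better projection line described for Fig. 5, while the orthogonal direction plays the role of the poorer line in Fig. 6. The k-means run groups the points purely by distance, without a pre-labeled sample set. Because k is fixed in advance, this sketch does not cover the case, mentioned in subsection D, of creating a new class for a sample that fits no existing cluster.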

Fig. 6. Example 2 of PCA.

REFERENCES
[1] R.J. Ellison, D.A. Fisher, and R.C. Linger, Survivable Network Systems: An Emerging Discipline. Technical Report CMU/SEI-97-TR-013, Carnegie Mellon University, 1997.
[2] R.J. Ellison, D.A. Fisher, and R.C. Linger, “Survivability: Protecting Your Critical Systems,” IEEE Internet Computing, vol. 3, issue 6, pp. 55-63, 1999.
[3] D. Dan, Y.Q. Zhang, “Research on Definition of Network Survivability,” Journal of Computer Research and Development, vol. 43(Suppl.), pp. 525-529, 2006.
[4] K.Q. Zhao, Set Pair Analysis and Its Preliminary Application. Hangzhou: Zhejiang Science and Technology Press, 2000.

[5] K.Q. Zhao, A.L. Xuan, “Set Pair Theory - A New Theory Method of Non-Define and Its Applications,” System Engineering, vol. 14, issue 1, pp. 18-23, 1996.
[6] Y.L. Jiang, C.F. Xu, “Advances in Set Pair Analysis Theory and its Applications,” Computer Science, vol. 33, issue 1, pp. 205-209, 2006.
