
2023 20th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) | 979-8-3503-1898-2/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICCWAMTIP60502.2023.10387150

RESEARCH ON IMAGE RECOGNITION ALGORITHM BASED ON SPIKING NEURAL NETWORKS

CHEN SU1, JIANPING LI1, YAQI ZHANG2
1 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
2 Chongqing Industry Polytechnic College, Chongqing 401120, China

E-MAIL: [email protected], [email protected], [email protected]

Abstract:
With the continuous development of artificial intelligence and computer vision technology, Artificial Neural Networks (ANNs) have achieved remarkable results in image recognition tasks. However, as deep learning technology develops, the number of layers in neural networks keeps increasing and the training power consumption of the models is huge. The Spiking Neural Network (SNN) is a kind of neural network that simulates the working mechanism of the biological nervous system; it is characterized by information transfer between neurons in the form of pulses, and compared with traditional artificial neural networks it has higher computational efficiency, stronger spatio-temporal information processing capability, and better biological interpretability. This paper analyzes the technical advantages of spiking neural networks for image recognition, carries out simulation experiments on spiking neural networks in terms of data preprocessing, spiking coding, and gradient substitution with reference to a classical image recognition model, and puts forward measures for an image recognition algorithm based on spiking neural networks according to the experimental results.

Keywords:
Spiking Neural Networks; Image Recognition; artificial intelligence algorithm; deep learning

1. Introduction

In recent years, deep learning represented by traditional Artificial Neural Networks (ANNs) has been proven to be faster and more efficient in tasks such as image classification and target recognition. However, artificial neural networks have their own limitations: the models consume a lot of power and suffer latency problems when used with sensors. Spiking Neural Networks (SNNs), as the third generation of neural networks, build on brain science research to construct networks with temporal dynamics, using pulse sequences as input signals and building spiking neurons based on biological synaptic structures. The neurons are activated only when receiving and transmitting spiking signals, and perform sparse operations during information transfer, realizing event-driven processing with ultra-low power consumption. SNNs are highly biomimetic and have unique advantages in terms of response speed and power consumption.

2. The biological characteristics of spiking neurons

Spiking neurons are a special type of neuron that differs from traditional neurons in terms of pulse discharge, neuronal excitation and inhibition states, pulse encoding, and temporal characteristics. To make spiking neural networks more biologically interpretable, many scholars have conducted long-term research on spiking neuron models. This article introduces several representative spiking neuron models in SNNs, briefly describes their working principles, and compares and analyzes their mechanisms and applicability.

2.1. Spiking neuron model

In 1952, Hodgkin and Huxley proposed the first spiking neuron model, the Hodgkin-Huxley (H-H) model, which established the foundation for spiking neural networks. Many scholars then proposed improved spiking neuron models based on the H-H model, such as the

Integrate-and-Fire (IF) model and the Leaky Integrate-and-Fire (LIF) model proposed by Lapicque and Brown, the Spike Response Model (SRM) proposed by Gerstner and Kistler, and so on. Common formulas for spiking neuronal dynamics are shown in Table 1.
Table 1 Spiking neuron models represented by dynamics formulas

Hodgkin-Huxley:  I_m = C_m dV_m/dt + ḡ_K n⁴ (V_m − V_K) + ḡ_Na m³ h (V_m − V_Na) + ḡ_l (V_m − V_l)

LIF:  τ_m dV(t)/dt = −(V − V_rest) + R_m I

SRM:  u(t) = η(t − t̂_i) + Σ_j w_ij Σ_f ε_ij(t − t_j^(f))

Izhikevich:  dV/dt = 0.04V² + 5V + 140 − u + I,  du/dt = a(bV − u)

The LIF model and the SRM model are the most commonly used neuron models for SNNs. The LIF model adopted in this paper is relatively simple and efficient to simulate, making it suitable for large-scale network simulation experiments; it not only has the characteristics of a biological neuron but also has great advantages in terms of complexity and computational ability. The basic principle of the model is as follows: when the input to the neuron does not exceed the threshold value V_th, the membrane potential falls back to the resting value V_reset. When the membrane potential reaches the threshold V_th after integrating the received spikes, the neuron is excited and stimulates the post-synaptic neuron; in addition to resetting the membrane potential to V = V_reset, the threshold is increased by ΔV_t, making the neuron's response to the next spike weaker.

τ_m dV/dt = V_rest − V + R_m I    (1)

where τ_m denotes the membrane time constant, V_rest the resting potential, and R_m and I the impedance of the cell membrane and the input current respectively, as shown in Fig. 1.

Fig. 1 LIF neuronal membrane potential change process

2.2. Spiking encoding method

The basic data structure of image data in neural networks is generally a three-dimensional tensor or matrix, with the three dimensions being (height, width, channel). Spiking neural networks process spiking signals, so image information needs to be converted into a series of spiking sequences; the input data must therefore be preprocessed. According to the biological characteristics of spiking neural networks, common encoding methods for data preprocessing include frequency encoding, time encoding, and Poisson encoding.

2.2.1 Frequency-based encoding

Frequency coding is the traditional information encoding method of spiking neural networks. It focuses on the spiking firing rate: the number of spikes n_sp(T) emitted by a spiking neuron, averaged over its corresponding recording time T, is expressed as:

V = n_sp(T) / T    (2)

Generally speaking, the spiking frequency is kept between 0 and 1. By adjusting the spiking frequency, information of different magnitudes can be transmitted. The encoding process of the spiking frequency is shown in Fig. 2.
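As a concrete illustration, the threshold-and-reset LIF dynamics of Eq. (1) can be sketched in a few lines of NumPy. This is a minimal Euler-integration sketch, not the paper's implementation: the function name and parameter values (dt, tau_m, R_m, the thresholds) are illustrative assumptions, and a simple hard reset stands in for the adaptive-threshold variant described above. Counting the emitted spikes over the window also yields the firing rate V = n_sp(T)/T of Eq. (2).

```python
import numpy as np

def simulate_lif(I, dt=1.0, tau_m=10.0, R_m=5.0,
                 V_rest=0.0, V_th=1.0, V_reset=0.0):
    """Euler-integrate tau_m * dV/dt = (V_rest - V) + R_m * I (Eq. 1);
    emit a spike and hard-reset to V_reset whenever V crosses V_th."""
    V = V_rest
    trace, spikes = [], []
    for I_t in I:
        # one Euler step of the membrane equation
        V += dt / tau_m * ((V_rest - V) + R_m * I_t)
        if V >= V_th:
            spikes.append(1)
            V = V_reset      # hard reset after firing
        else:
            spikes.append(0)
        trace.append(V)
    return np.array(trace), np.array(spikes)

# constant suprathreshold input: the neuron integrates, fires, resets, repeats
V_trace, spike_train = simulate_lif(np.full(30, 0.3))
```

With a constant input the spike count over the 30-step window divided by the window length gives exactly the frequency-code value of Eq. (2).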

Fig. 2 Spiking frequency encoding

2.2.2 Time-based encoding

Unlike frequency coding, spiking temporal coding focuses more on differences in the temporal structure of spike emission; common temporal codes include first-spike coding and sequential coding. According to biological studies, most spiking information occurs within 20 ms∼50 ms of stimulus generation. Based on the assumption that a neuron emits only one spike, focusing on the first spike and integrating the remaining spikes into a weighted mean, the principle of first-spike coding is expressed as:

P(x) = (t₂ − t₁)/(max − min) · f + (t₁·max − t₂·min)/(max − min)    (3)

where x ∈ [min, max] is the input signal and the encoded spiking sequence is P(x) ∈ [min, max]. In the time coding method, a signal is added to the spiking sequence at each time step to avoid the insensitivity of the average firing frequency to timing information. The time coding method is shown in Fig. 3.

Fig. 3 Time encoding

2.2.3 Poisson coding

In spiking neural networks, Poisson encoding is a commonly used encoding method. The input data is converted by a Poisson encoder into a spiking sequence, and the number of spikes emitted in this sequence follows a Poisson distribution. The mean λ of the Poisson distribution represents the average number of spikes N emitted within time T, i.e., N/T, the average spike emission rate. The probability mass function of the Poisson distribution used in Poisson coding is:

P(X = k) = (λᵏ / k!) e^(−λ),  k = 0, 1, ⋯    (4)

where k is the number of excitation spikes and λ is the average number of spikes emitted within the time window T; the probability of not generating a spike within time t is e^(−λt). Poisson coding estimates specific precise values through the Poisson distribution of a long time series, and the spiking sequence after Poisson coding satisfies independent increments, stationary increments, and orderliness.

3. Spiking neural network learning method

Based on the LIF neuron model combined with the classic LeNet model, a spiking neural network structure is trained on the MNIST dataset. Different numbers of neurons are added for comparison, achieving image recognition and classification with an accuracy of up to 98%. Compared with similar classical artificial neural networks, the training power consumption is reduced while the accuracy is comparable, and the faster training is more biologically reasonable and interpretable. The training process is shown in Fig. 4.

Fig. 4 Process structure diagram of spiking neural network image classification
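The Poisson encoding of Sec. 2.2.3 can be sketched as follows. This is a hedged illustration (the function and parameter names are mine, not the paper's): each normalized pixel intensity is used as the per-step firing probability, the common Bernoulli approximation of a Poisson process whose rate λ is proportional to the intensity.

```python
import numpy as np

def poisson_encode(image, T=4, seed=None):
    """Turn an image with grayscale values in [0, 255] into a binary spike
    train of shape (T, H, W): at each of the T time steps a pixel fires with
    probability equal to its normalized intensity, so the spike count over T
    approximately follows a Poisson distribution with rate lambda
    proportional to the intensity."""
    rng = np.random.default_rng(seed)
    p = np.asarray(image, dtype=float) / 255.0    # per-step firing probability
    return (rng.random((T,) + p.shape) < p).astype(np.uint8)

img = np.array([[0, 128],
                [255, 64]])
spikes = poisson_encode(img, T=4, seed=0)
rates = spikes.mean(axis=0)    # empirical firing rate n_sp(T)/T, cf. Eq. (2)
```

A fully black pixel (0) never fires and a fully white pixel (255) fires at every step, so as T grows the empirical firing rate recovers the relative intensity.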

When using spiking neural network models for image recognition and classification, the overall network layers are similar to those of artificial neural networks. When neurons process input data, specific spiking mechanisms are used, and the overall data exhibits the characteristics of sparse matrices. Therefore, improvements are needed in data preprocessing, the temporal processing of input data, the normalization of convolutional and pooling data, the spiking encoding method, and key methods such as gradient substitution during the learning process of the network model.

3.1. Data preprocessing stage

Due to the excitation characteristics of the spiking mechanism, image data needs to undergo temporal processing before being transmitted to neurons. Frequency encoding can be adopted, with a time step of T = 4 and an attenuation coefficient of τ = 0.75. The ratio of the number of spikes emitted by a neuron to the time step T is the spike emission frequency V, which establishes the correspondence with the temporal sequence:

V = n_sp(T) / T    (5)

3.2. Spiking neural network structure model

Referring to LeNet-5 in CNNs, convolutional layers, pooling layers, and fully connected layers are very common in current artificial neural networks. A network with fewer and simpler layers is combined with the relevant mechanisms of spiking neural networks in the experiments. By comparing the learning decay rate and recognition rate, the feasibility of spiking neural networks for image recognition can be analyzed. The overall model architecture is shown in Fig. 5.

Fig. 5 Process structure diagram of spiking neural network image classification

3.3. Data normalization processing

From the operating mechanism of spiking neural networks, experiments show that the feature values of temporal spikes form sparse matrices after convolution, and neurons are often very sensitive to the frequency and timing of input spikes. Without normalization, gradients easily explode or vanish, and the network model overfits during training. To reduce the impact of features with different scales, improve the generalization ability of the network, and accelerate learning, a normalization layer is added after the convolutional layer to adjust the weight parameters more smoothly and stably. The specific calculation formula is:

y = (x − E[x]) / √(Var[x] + ϵ) · γ + β    (6)

where E[x] is the mean μ and Var[x] is the variance σ², both obtained through statistics during forward propagation.

3.4. Gradient substitution function

In the training of spiking neural networks (SNNs), traditional gradient descent methods are difficult to apply directly because the spiking signal is non-differentiable. The gradient substitution method is a commonly used solution: a differentiable function approximates the derivative of the original function so that optimization can proceed. Common substitution functions include linear functions, quadratic functions, Gaussian functions, etc. According to related research, there is no significant difference in performance between them, but their sensitivity to hyperparameters varies greatly. Commonly used gradient substitution functions are:

h₁(u) = (1/a₁) · sign(|u − V_th| < a₁/2),
h₂(u) = (√a₂/2 − (a₂/4)·|u − V_th|) · sign(2/√a₂ − |u − V_th|),
h₃(u) = (1/a₃) · e^((V_th−u)/a₃) / (1 + e^((V_th−u)/a₃))²,
h₄(u) = (1/√(2πa₄)) · e^(−(u−V_th)²/(2a₄))    (7)

We choose h₁(u) = (1/a₁) · sign(|u − V_th| < a₁/2) as the gradient replacement function.
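To make the choice of h₁ concrete, the sketch below pairs the non-differentiable firing step used in the forward pass with the rectangular window h₁ substituted for its derivative in the backward pass. It is plain NumPy rather than the paper's framework code; the names and the setting a₁ = 1.0 (consistent with a = 1.0 in Table 2) are illustrative assumptions.

```python
import numpy as np

V_TH = 1.0   # firing threshold V_th
A1 = 1.0     # width parameter a1 of the rectangular surrogate

def spike_forward(u):
    """Forward pass: the non-differentiable firing function, 1 if u >= V_th."""
    return (u >= V_TH).astype(float)

def spike_surrogate_grad(u):
    """Backward pass: h1(u) = (1/a1) * 1{|u - V_th| < a1/2}, a rectangular
    window used in place of the true derivative, which is zero almost
    everywhere and undefined at the threshold."""
    return (np.abs(u - V_TH) < A1 / 2).astype(float) / A1

u = np.array([0.2, 0.9, 1.0, 1.4, 2.0])   # example membrane potentials
out = spike_forward(u)                     # [0., 0., 1., 1., 1.]
grad = spike_surrogate_grad(u)             # nonzero only near the threshold
```

During backpropagation, `spike_surrogate_grad` would multiply the incoming gradient wherever `spike_forward` was applied, letting error signals flow through neurons whose membrane potential is close to V_th.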

4. Experimental results

The MNIST dataset is an open-source dataset of handwritten digit images. It contains 70,000 handwritten digit images, including 60,000 training samples and 10,000 test samples. Each image is 28×28 pixels, where each pixel represents a grayscale value between 0 and 255. It is commonly used for testing image classification and recognition tasks. We validated the image classification recognition rate of the spiking neural network on this dataset; the relevant parameters are shown in Table 2.

Table 2 Main parameters of the model
Parameter Name    Parameter Value
T                 4
a                 1.0
τ                 0.75
V_th              1 mV
momentum_SGD      0.9
Epoch             10
N_B               64

In learning and training, the data is first normalized according to (X_i − μ)/σ, so that the resulting data satisfy a normal distribution. The reason for this preprocessing is that the sparse matrix data produced by spiking processing suffers gradient explosion or vanishing after multiple layers of calculation. Combined with the symmetric shape of the subsequent gradient replacement function, normalizing the data to a normal distribution also improves the model's performance.

During learning and training, N_B = 64 samples are used as the batch for each iteration, so the number of iterations required to complete an epoch is n = m/N_B. The loss is printed every 200 mini-batches, and with epochs = 10 the parameters are gradually optimized and updated over 10 passes.

To observe the changes in the classification performance of the spiking neural network during training more intuitively, confusion matrices are used to display the classification results at different stages. Panel (a) is the confusion matrix after training the first epoch, and panels (b), (c), and (d) are the confusion matrices after training 2, 9, and 10 epochs, respectively. The horizontal axis represents the predicted value, the vertical axis the true value, and the numbers are sample counts. For example, the first number 6704 in panel (b) refers to the samples with a true value of 0 that are predicted as 0. Compared with the confusion matrix in panel (a), panel (d) has larger numbers of correctly predicted samples on the diagonal, indicating that the model training is effective.

Fig. 6 Confusion matrices of models trained on MNIST for different numbers of epochs: (a) Epoch=1, Loss=2.2946; (b) Epoch=2, Loss=0.1734; (c) Epoch=9, Loss=0.0381; (d) Epoch=10, Loss=0.0398
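The batching arithmetic above is easy to check. With the MNIST training-set size m = 60,000 and the batch size N_B = 64 from Table 2 (the variable names below are mine, for illustration):

```python
import math

m, N_B, epochs = 60_000, 64, 10          # training samples, batch size, epochs (Table 2)

iters_per_epoch = math.ceil(m / N_B)     # n = m / N_B, rounded up for the last partial batch
total_iters = iters_per_epoch * epochs

print(iters_per_epoch)   # 938
print(total_iters)       # 9380

# with the loss printed every 200 mini-batches, each epoch produces
loss_reports_per_epoch = iters_per_epoch // 200   # 4 reports
```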
After 10 rounds of training on the dataset, the accuracy reached 98.51%. The results show that, using spiking neural networks for image recognition training, with a certain degree of data preprocessing, gradient replacement, and related parameter design, the classification accuracy obtained is close to that of artificial neural networks.

5. Conclusions

The spiking neural network is a third-generation neural network with stronger biological interpretability. Its unique advantages, such as temporal encoding, the spiking mechanism, and its low-power mode, are not available in current artificial neural networks. Studying spiking neural networks can save computational resources and help us better understand biological systems. This article improves a classical model through specific simulation experiments, preprocesses data according to the characteristics of the spiking mechanism during learning and training, and verifies through simulation that spiking neural networks can approach the accuracy of artificial neural networks on some small-scale datasets. Research and application of spiking neural networks are still at an early stage of development; their training effectiveness on larger datasets and deeper models holds great potential for further exploration.

Acknowledgments

This work was supported by the National Natural Science Foundation of China, the National High Technology Research and Development Program of China, the New Century Excellent Talent Support Project of the Chinese Ministry of Education, the project of the Science and Technology Department of Sichuan Province (Grant No. 2021YFG0322), the projects of the Science and Technology Department of Chongqing Municipality (Grant Nos. CSTC2022JXJL00017 and CSTC2022JXJL0214), the Science and Technology Research Programs of the Chongqing Municipal Education Commission (Grant Nos. KJZD-K202214401 and KJZD-K202114401), Chengdu Chengdian Network Technology Co., Ltd., Chongqing Qinchengxing Technology Co., Ltd., Chengdu Haitian Digital Technology Co., Ltd., Chengdu Civil-military Integration Project Management Co., Ltd., and Sichuan Yin Ten Gu Technology Co., Ltd.