
sensors

Article
Machine Learning-Based Human Posture Identification
from Point Cloud Data Acquisitioned by FMCW
Millimetre-Wave Radar
Guangcheng Zhang 1, Shenchen Li 1, Kai Zhang 1 and Yueh-Jaw Lin 2,*

1 School of Mechanical Engineering, University of Shanghai for Science and Technology,


Shanghai 200093, China; [email protected] (G.Z.); [email protected] (S.L.);
[email protected] (K.Z.)
2 College of Engineering and Engineering Technology, Northern Illinois University, DeKalb, IL 60115, USA
* Correspondence: [email protected]

Abstract: Human posture recognition technology is widely used in the fields of healthcare, human-
computer interaction, and sports. The use of a Frequency-Modulated Continuous Wave (FMCW)
millimetre-wave (MMW) radar sensor in measuring human posture characteristics data is of great
significance because of its robust and strong recognition capabilities. This paper demonstrates how
human posture characteristics data are measured, classified, and identified using FMCW techniques.
First of all, the characteristics data of human posture is measured with the MMW radar sensors.
Secondly, the point cloud data for human posture is generated, considering both the dynamic and
static features of the reflected signal from the human body, which not only greatly reduces the
environmental noise but also strengthens the reflection of the detected target. Lastly, six different
machine learning models are applied for posture classification based on the generated point cloud
data. To comparatively evaluate the proper model for the point cloud data classification procedure, in
addition to using the traditional indices, the Kappa index was introduced to eliminate the effect due
to the uncontrollable imbalance of the sampling data. These results support our conclusion that
among the six machine learning algorithms implemented in this paper, the multi-layer perceptron
(MLP) method is regarded as the most promising classifier.

Keywords: human posture; FMCW millimetre-wave radar; machine learning; comprehensive evaluation

Citation: Zhang, G.; Li, S.; Zhang, K.; Lin, Y.-J. Machine Learning-Based Human Posture Identification from Point Cloud Data Acquisitioned by FMCW Millimetre-Wave Radar. Sensors 2023, 23, 7208. https://doi.org/10.3390/s23167208
Academic Editors: Adam M. Kawalec, Marta Walenczykowska and Ksawery Krenc

Received: 18 July 2023; Revised: 8 August 2023; Accepted: 9 August 2023; Published: 16 August 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction

Human postures can visually convey information about the human body, which finds
applications in various fields such as safety production, human vital signs monitoring,
and information interaction. As society embraces informatization, accurately detecting
and classifying human body postures can yield effective responses in recognition systems.
For instance, in coal mines' underground operations, where working conditions can be
extremely dangerous, identifying human body targets more effectively can reduce
accidents [1]. Moreover, aging and accidental falls heavily impact the physical function of
the elderly, leading to severe injuries. Real-time posture recognition of the elderly enables
timely assistance and prevents falls [2]. In the domain of human-computer interaction,
human body postures act as information carriers, serving as valuable data for recognition
systems [3].

With technology advancements, various methods for detecting human body postures
are emerging. Vision-based systems utilize cameras to capture human postures, extract
features from contours, and employ recognition algorithms for posture recognition [4–6].
However, concerns over privacy limit the acceptance of cameras at home or work [7], and
vision-based systems may suffer performance limitations during hostile weather conditions.
Alternatively, wearable devices are used for posture detection, but they can be inconvenient
and costly [8]. Radar-based systems, on the other hand, offer a non-intrusive solution
that addresses privacy concerns and remains robust under different lighting conditions.
This system utilizes radio waves to determine the position of a target by processing the
echo signal.
Due to its ability to obtain the position information of targets and static objects, along with
observing their slight vibrations and speeds, FMCW radio technology is widely used in
many scenarios. For example, in the field of measurement, Wang et al. [9] proposed a
calibration method using millimetre wave radar and camera fusion. Compared with the
traditional calibration method, the calibration error is reduced significantly. Other applica-
tions include automotive radar [10], drone detection [11], snow research [12], contactless
measurement [13,14], vital sign detection [15], remote sensing [16], gait recognition [17], etc.
With the increasing demand for healthcare technology, surveillance, and human-machine
interfaces, the use of FMCW mmWave radar to recognize human postures has become an
important topic. He et al. [18] proposed using FMCW radar for human target recognition
in non-direct sight scenes. The experimental results demonstrated accurate identification
of real humans and human-mimicking man-made objects, even in blocked scenes. Zhang
et al. [19] developed a multi-angle entropy feature and an improved ELM method for iden-
tifying human activity. The experiment achieved over 86% accuracy for outdoor scenes and
98% for indoor micro-movements. Aman Shrestha et al. [20] introduced a method based
on recurrent long and short-term memory (LSTM) and bi-directional LSTM network archi-
tecture for continuous human activity monitoring and classification, achieving an average
accuracy of over 90% when combined with Doppler domain data from FMCW radar. Liang
et al. [21] designed a fall detection system based on FMCW radar, using Bi-LSTM for clas-
sification. The system achieved a remarkable 99% classification accuracy. Zhou et al. [22]
presented a method for human sleep posture recognition based on MMW radar. The radar
echo signal was processed to obtain multi-channel 2D radar features, and neural networks
were employed for learning and classification. The results effectively distinguished differ-
ent sleeping postures. Overall, the use of FMCW mmWave radar for recognizing human
postures continues to be an important area of research and development.
Currently, 3D target recognition is a significant area of research. Point cloud data, which
includes 3D coordinates (x, y, z), density, reflection intensity, and other features, offers more in-
formation than images. Huang et al. [23] used point clouds for 3D face model reconstruction to
aid identification. Wang et al. [24] established a 3D mining area model using point cloud data
for environmental analysis. Poux et al. [25] utilized point clouds for indoor 3D modeling and
object classification. Point clouds are also widely used for generating and classifying human
postures. Zhao et al. [26] proposed a human tracking and identification system (mID) based
on FMCW millimeter-wave radar, achieving an overall recognition accuracy of 89% among
12 individuals and 73% intruder detection accuracy. Meng et al. [27] developed mmGaitNet,
a deep learning-driven millimeter-wave gait recognition method with 90% accuracy for a
single-person scenario. Aiujaim et al. [28] used FMCW radar to recognize multiple human
activities, classifying motion with an 80% accuracy based on point cloud data. While single
neural network models offer automated feature extraction, machine learning is more suitable
for this study due to the high data volume and computational complexity [29–31]. Diraco
et al. [32] achieved a 97% classification accuracy using the SVM algorithm on 3D human
posture point clouds. Werghi et al. [33] employed a Bayesian classification model based on
wavelet transform coefficients, achieving 98% accuracy. However, the above studies only
use a single machine learning model and do not put forward a comprehensive method for
evaluating the classification effect of machine learning.
This paper aims to evaluate the performance of different machine learning models on
human posture point cloud data based on FMCW mmWave radar. This goal is achieved
in two stages. In the first stage, the characteristics data of human posture are measured
and collected using the radar sensor and then the human body posture point cloud data is
generated considering both the dynamic and static features of the reflected signal for the
human body. Based on the previous research method [34], six hundred sets of point cloud
posture data were obtained from one hundred sets of point cloud data for each of the six
postures (hands up, horse stance, lunge, lying down, standing, and sitting). In the second
stage, the point cloud dataset is used for six machine learning classification models, namely
K-nearest neighbor (KNN), Gaussian process (GP), SVM, multi-layer perceptron (MLP),
naive Bayes (NB), and gradient boosting (GB). Finally, a comprehensive performance
evaluation for the different machine learning models is conducted.
The contributions of this paper are summarized as follows:
(1) This paper presents the application of FMCW millimetre-wave radar in multiple
human body posture characteristics data measurements. The experiment shows that
it can reflect the posture characteristics of the human body effectively.
(2) To delete the non-interesting reflection points and realize the grouping of objects
from the generated point cloud data, the clustering technique (DBSCAN algorithm) is
introduced to traverse all the points in the space based on the density characteristics
of the point cloud distribution.
(3) To achieve feature importance ranking, Gini index-based random forest algorithm
is utilized to obtain the normalized contribution of the feature, and further sort the
feature according to the size of the contribution.
(4) To avoid the side effects from the uneven number of samples and compare the clas-
sification performance of different machine learning models, the Kappa Index is
included along with other traditional evaluation criteria to evaluate the classification
performance based on the proposed signal processing methods.
The rest of this paper is organized as follows. Data collection and processing are described
in Section 2. Section 3 presents the proposed classification research methodology.
Section 4 presents the results and analysis, and Section 5 concludes the paper.

2. Data Collection and Processing


In this paper, taking advantage of the simple system structure, simple signal process-
ing, and low cost, the linear frequency modulated continuous wave is selected as the signal
of millimetre-wave radar for generating chirp signals. During collecting and processing
the data, the signals are transmitted by the transmitting antenna and reflected after encoun-
tering the target. The receiving antenna receives the reflected frequency modulation pulse
and then mixes with the local transmitting signal for amplification and filtering processing.
Finally, sampling and analog-to-digital conversion are carried out to obtain the original
matrix data for signal processing.

2.1. Data Collection


A peaceful and clutter-free workspace is necessary for the experiment’s accuracy. An
office has been chosen as the experimental scene, and no other items are in the office except
for desks and experimental devices. The interior dimensions and layout of items have been
illustrated in Figure 1. The size of the room is 3.1 m × 4.9 m, and the radar is located at the
center of the left side of the office, approximately 1 m above the floor. On the right side
of the office, there are three desks evenly spaced, each measuring 1.2 m × 0.6 m × 1 m.
The experimenter is 175 cm tall, weighs 75 kg, and stands 1.5 m away from the radar.
The subject faces the radar in six different postures, namely hands up, horse stance,
lunge, lying down, standing, and sitting, as shown in Figure 2.
To collect the raw radar data and perform subsequent processing, this paper uses the
TI IWR6843ISK-ODS millimetre-wave sensor as the experimental device. The operating
frequency is 60 GHz and there are four receiving antennas and three transmitting antennas,
with 120◦ azimuth and 120◦ elevation angle coverage. The specific radar parameter config-
uration is listed in Table 1. For each posture, 20 s were captured with 100 frames of data
per posture. The DCA1000 evaluation module is used to provide real-time data capture
and streaming for radar sensors. The computer reads and processes the raw data captured
by the evaluation module.

Figure 1. Experimental environment and layout.

Table 1. Radar specific configuration.

Parameter Description
Start frequency 60 GHz
Bandwidth 3.92 GHz
Sampling frequency 2200 ksps
Frequency slope 98 MHz/μs
Frame rate 5 fps
ADC Samples 64
Number of Chirps per frame 200

Figure 2. Six postures for the data collection: (a) hands up, (b) lunge, (c) horse stance, (d) lying down, (e) standing and (f) sitting.

2.2. Data Processing

First, the Range fast Fourier transform (FFT) is employed on the raw radar data to
obtain the target range information. In order to remove static clutter in the signal, a moving
target indication (MTI) algorithm is applied. Second, Range Doppler Images (RDIs) are
introduced to reduce multipath reflection noise in the MTI results. The direct Range Angle
Images (RAIs) are obtained from the results of the Range FFT with the help of the minimum
variance distortionless response (MVDR) angle estimation algorithm combined with the
RAIs. After MTI and MVDR, the more detailed features of the direct RAIs are located and
extracted, and finally, the combined RAIs are used to generate point clouds. The specific
processing steps are illustrated in Figure 3.

Figure 3. Data processing flow.

(1) Introduction to the data processing methods used: In processing the raw radar data,
FFT (Range FFT and Doppler FFT), MVDR angle measurement, and MTI are used. The FFT
method is often used in radar signal processing and will not be elaborated here.
The MVDR is a commonly used digital beamforming algorithm, and its essence is
spatial filtering. It employs a beam with a certain shape to selectively pass the target signal,
while the interference signal is suppressed to a certain extent. There are two types of
beamforming, analog and digital, among which digital beamforming is the main method
of spatial filtering. This paper assumes that the receive antenna is an N-element array and
that the received signal at a single antenna is Sr(t); the signal received by the array can be
expressed as:
$$x(t) = S_r(t) \cdot a(\theta) \quad (1)$$

where $a(\theta) = \left[1,\ e^{j\frac{2\pi d \sin(\theta)}{\lambda}},\ \ldots,\ e^{j(N-1)\frac{2\pi d \sin(\theta)}{\lambda}}\right]^{T}$.

The output power at different angles is calculated as follows:

$$P_{mvdr}(\theta) = \frac{1}{a^{H}(\theta)\, R^{-1}\, a(\theta)} \quad (2)$$

where $R = x(t)\,x(t)^{H}$; the angle values of the targets can be obtained from the peaks of the output power.
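As an illustrative companion to Equations (1) and (2), the following numpy sketch scans a grid of angles and evaluates the MVDR output power; the snapshot layout, angle grid, and diagonal loading term are assumptions made for a self-contained example, not details taken from the paper.

```python
import numpy as np

def mvdr_spectrum(x, d_over_lambda=0.5, n_angles=181):
    """Sketch of the MVDR output power of Equation (2).

    x : complex array of shape (N, n_snapshots); one row per receive
        antenna, columns are snapshots of the received signal.
    Returns an angle grid (degrees) and the power P_mvdr at each angle.
    """
    N = x.shape[0]
    # Sample covariance matrix R = x x^H; small diagonal loading keeps
    # the inversion stable when few snapshots are available (assumption).
    R = x @ x.conj().T / x.shape[1]
    R_inv = np.linalg.inv(R + 1e-6 * np.trace(R).real / N * np.eye(N))
    angles = np.linspace(-90.0, 90.0, n_angles)
    power = np.empty(n_angles)
    for k, theta in enumerate(np.deg2rad(angles)):
        # Steering vector a(theta) = [1, e^{j 2 pi d sin(theta)/lambda}, ...]^T
        a = np.exp(1j * 2 * np.pi * d_over_lambda * np.sin(theta) * np.arange(N))
        power[k] = 1.0 / np.real(a.conj() @ R_inv @ a)
    return angles, power
```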


Moving target indication (MTI) is a technology for extracting moving targets from
radar reflected signals. Its premise is that the reflection value of a stationary object
is stable, while the reflection value of a moving object changes with the change
of the object’s distance from the sensor position. After the Range FFT, chirps in the frame
obtain frequencies corresponding to their respective distances. The distance of stationary
targets remains constant within a frame and the distance of moving targets varies within a
frame. Therefore, when considering the Range FFT results of all chirp signals in a frame,
the chirp vector at each distance corresponds to the centering process. This means that the
chirp vector at each distance is subtracted from the mean of the chirp vector. The method
of processing the chirp signal is as follows:
 
$$DI = \sum_{i=1}^{n}\left( FFT(i\_chirp) - \frac{1}{n}\sum_{i=1}^{n} FFT(i\_chirp) \right) \quad (3)$$

where FFT (i_chirp) represents the Range FFT result for the i-th chirp, and n represents that
there are n chirp signals in each frame of data.
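A minimal numpy sketch of this centering step, assuming the Range FFT results of one frame are arranged as an (n chirps × n range bins) matrix:

```python
import numpy as np

def mti_filter(range_fft):
    """Sketch of the MTI centering of Equation (3).

    range_fft : complex array of shape (n_chirps, n_range_bins), the Range
        FFT of every chirp in one frame.
    Subtracting the per-range-bin mean over all chirps removes returns whose
    distance is constant within the frame, i.e., the static clutter, while
    moving targets survive because their chirp vectors vary across the frame.
    """
    return range_fft - range_fft.mean(axis=0, keepdims=True)
```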

(2) Acquisition of Combined RAI: The traditional radar signal processing is to perform
Range FFT on the AD sampled data for each chirp to gain the distance information of the
target. After the Range FFT, Doppler FFT is applied to the chirp signal at each location, and
the speed information of the target is obtained.
However, it is difficult to include multiple target information simultaneously. This
paper presents an acquisition method for the human range-angle image, which includes
the human's distance, angle, and reflection intensity. MTI is used to eliminate static clutter,
and the Doppler information in the RDI is fused with the RAI in order to remove the
multipath reflection noise. Since the values in the MTI-based RAI reflect the intensity of
movement rather than the reflection intensity of static human postures, the original human
reflection is recovered by applying the MVDR algorithm to the data after the Range FFT.
The resulting combined RAI includes the reflection intensity of the human posture while
removing multipath noise and static clutter. The combined RAIs are shown in Figure 4.

Figure 4. Combined RAI: (a) hands up, (b) lunge, (c) horse stance, (d) lying down, (e) standing, and (f) sitting.

(3) Generate Point Clouds: After obtaining the combined RAIs of the six postures, a
constant false alarm rate (CFAR) algorithm is used to generate the human target point
cloud based on the RAIs of the two angular planes, in order to express the posture features
more clearly and intuitively. CFAR keeps the false alarm rate of the radar detection system
at a certain value [35]; it is a detection algorithm that guarantees the detection performance
of the radar and is used here for point cloud detection; a minimal sketch follows.
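The cell layout that this algorithm relies on is described in the next paragraph; as a companion, here is a minimal cell-averaging 2D-CFAR sketch. The mean-of-training-cells threshold follows the text, while the guard/training window sizes and the scale factor are illustrative assumptions.

```python
import numpy as np

def ca_cfar_2d(rai, guard=2, train=4, scale=3.0):
    """Sketch of 2D-CFAR peak detection on a combined RAI (range x angle).

    For every cell under test, the mean of the surrounding training cells
    (excluding the guard cells and the cell itself) forms the detection
    threshold; 'scale' is an assumed threshold factor.
    Returns a boolean mask of detected target points.
    """
    half = guard + train
    detections = np.zeros_like(rai, dtype=bool)
    for r in range(half, rai.shape[0] - half):
        for a in range(half, rai.shape[1] - half):
            window = rai[r - half:r + half + 1, a - half:a + half + 1].astype(float)
            # Blank out the guard region around the cell under test so that
            # target energy does not leak into the threshold estimate.
            window[train:train + 2 * guard + 1, train:train + 2 * guard + 1] = np.nan
            detections[r, a] = rai[r, a] > scale * np.nanmean(window)
    return detections
```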
In order to apply this algorithm to the combined RAI, this paper introduces the
2D-CFAR peak detection algorithm. The algorithm divides data cells into three types
during detection: training cells, guard cells, and cells under test, as shown in Figure 5.
A certain range of guard cells is set near the cell under test to prevent energy leakage
that may lead to a high threshold and affect judgment. Outside the guard cells are the
training cells, and the mean value of the training cells is used as the detection threshold.
The value of the cell under test is compared with the threshold to determine whether
there is a target point in the cell under test. Through this algorithm, the human target
points can be separated from the combined RAI.

Figure 5. 2D-CFAR peak detection.
Each RAI contains the reflected power values of the target at different distances and
angles (horizontal angle or pitch angle). To obtain the spatial 3D point cloud of human
posture, it is necessary to fuse the reflected power values of human posture on the two
angular planes at different distances. Namely, the range, azimuth, and elevation angle of
the target point need to be determined. Assume that the peak list of the RAI obtained from
the azimuth angle direction is represented by the set $H_1^* = \{P(range,\ azimuth\ angle,\ power)\}$,
including the range, azimuth angle, and human reflected power. $H_2^* = \{P(range,\ elevation\ angle,\ power)\}$
represents the peak list of the RAI in the elevation angle direction, including the range,
elevation angle, and human reflected power. Correlate the points of the two planes with
the distance value and the power value to obtain the point set of the target
three-dimensional space point cloud; the generation method is shown in Equation (4),
where ⊕ represents a fusion of the data of the two planes [34].

$$H_1^* \oplus H_2^* \rightarrow \{P(range,\ azimuth\ angle,\ elevation\ angle,\ power)\} \quad (4)$$
The six posture point cloud images obtained by this method are shown in Figure 6.
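As a sketch of the fusion in Equation (4), peaks from the two angular planes can be associated through their shared range value; the paper does not spell out the matching rule on the power value, so the power averaging below is an assumption.

```python
def fuse_peak_lists(azimuth_peaks, elevation_peaks):
    """Sketch of Equation (4): H1* (+) H2* -> {P(range, azimuth, elevation, power)}.

    azimuth_peaks   : iterable of (range_bin, azimuth_angle, power) from one RAI
    elevation_peaks : iterable of (range_bin, elevation_angle, power) from the other
    """
    points = []
    for r_az, azimuth, p_az in azimuth_peaks:
        for r_el, elevation, p_el in elevation_peaks:
            if r_az == r_el:  # same distance cell in both angular planes
                points.append((r_az, azimuth, elevation, (p_az + p_el) / 2.0))
    return points
```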

Figure 6. Point cloud of six postures: (a) hands up, (b) lunge, (c) horse stance, (d) lying down, (e) standing, and (f) sitting.
3. The Proposed Classification Research Method

The flow of the research method for human posture classification based on FMCW
millimetre-wave radar proposed in this paper is shown in Figure 7. There are three key
parts, namely object detection with the Density-Based Spatial Clustering of Applications
with Noise (DBSCAN) algorithm, feature extraction, and posture classification. The
classification uses six different supervised machine learning models, namely KNN, GP,
SVM, MLP, NB, and GB.

Figure 7. Research method flow.

3.1. Target Detection

The CFAR algorithm detects the RAI in both angular planes and matches the detected
values that are beyond the power threshold to yield point clouds. Point clouds can be
denoted as a set of four-dimensional points:

$$p = \{\, p_i = (x_i, y_i, z_i, power_i) \mid i = 1, 2, \ldots, n \,\} \quad (5)$$

where n represents the number of points in the point cloud; each point contains (x, y, z)
coordinate information and reflected power. Nevertheless, the CFAR algorithm tends to
return reflection points that are not targets of interest, resulting in false alarms. When the
DBSCAN algorithm is used, all points that correspond to the same item of interest can be
grouped, and non-interesting reflection points can be deleted. DBSCAN is a density-based
spatial clustering technique. All the points in the space can be traversed using the density
characteristics of the point cloud distribution, and the peak points can be divided to
realize the grouping of objects. A point is centered on itself with Eps as the radius; if the
circle contains more points than MinPts, the point is considered a core point. If the number
of points contained is fewer than MinPts, the point is defined as a border point. An outlier
point is one that is neither a core point nor a border point. If a point P is in the Eps
neighborhood of a core point Q, the object P is referred to as directly density reachable
from the object Q. A density cluster is formed by a core point Q and all objects that are
density reachable from it [36]. The radius Eps and the threshold on the number of items in
the neighborhood, MinPts, are the two input parameters of the method. In this study, Eps
was set to 0.5 and MinPts was set to 20. The result of clustering is demonstrated in
Figure 8. The point cloud splits the human posture into a single cluster, and there is no
noise in the result.

Figure 8. Result of DBSCAN algorithm applied to large character posture point cloud.
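A minimal Scikit-Learn sketch of this clustering step with the parameters used in this study (Eps = 0.5, MinPts = 20); keeping only the largest cluster is an assumption about how the single human target is selected from the grouped objects.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_posture_points(points_xyz):
    """points_xyz: array of shape (n_points, 3) of CFAR-detected (x, y, z).
    Returns the points of the largest DBSCAN cluster; label -1 marks noise."""
    labels = DBSCAN(eps=0.5, min_samples=20).fit_predict(points_xyz)
    valid = labels[labels != -1]
    if valid.size == 0:
        return np.empty((0, 3))  # no dense cluster found
    largest = np.bincount(valid).argmax()
    return points_xyz[labels == largest]
```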

3.2. Feature Extraction


Feature extraction in this research can be divided into two parts, the former is feature
extraction, and the latter is feature selection.
(1) Extract features: After human posture is detected, a set of interesting reflection
points is obtained, which is called the human posture point cloud. Different from the
point cloud generated by other methods, the point cloud generated by millimetre-wave
radar integrates the information of range, angle, and reflection power of the target, which
can accurately reflect the morphological characteristics of the target. Additionally, further
processing is required to extract information for each posture. This section recommends
that twelve features taken from human posture point clouds and radiation intensity be used
to characterize the posture type, and that posture classification can be performed using
these features. Table 2 shows the symbols and brief descriptions of the twelve-point cloud
features. The following is a detailed description of each of the suggested features. The
geometry of the point cloud for the six human postures varies widely. Therefore, this paper
proposes using a rectangular box to represent the shape of the posture. The rectangular box
has three dimensions: length, width, and height, which correspond to the x-, y-, and z-axis
values, respectively. Thus, this paper defines the first, second, and third object features,
F0, F1, and F2, as the differences between the maximum and minimum values on
the x-, y-, and z-axes, namely the length (L), width (W), and height (H) of the rectangular box.
The calculation formulas are expressed as:

F0 : L = max( X ) − min( X ) (6)

F1 : W = max(Y ) − min(Y ) (7)

F2 : H = max( Z ) − min( Z ) (8)


where X represents the x-axis coordinate value of all object points, Y represents the y-axis
coordinate value of all target points, and Z is the z-axis coordinate value of all object points.

Table 2. Twelve proposed features and brief descriptions.

Serial Number Symbol Explanation
F0 L The length of human 3D point clouds
F1 W The width of human 3D point clouds
F2 H The height of human 3D point clouds
F3 Xmean The mean value of human 3D point clouds in the length direction
F4 Ymean The mean value of human 3D point clouds in the width direction
F5 Zmean The mean value of human 3D point clouds in the height direction
F6 Xsd The standard deviation of human 3D point clouds in the length direction
F7 Ysd The standard deviation of human 3D point clouds in the width direction
F8 Zsd The standard deviation of human 3D point clouds in the height direction
F9 Xc The center coordinate of the reflection intensity of human 3D point clouds in the length direction
F10 Yc The center coordinate of the reflection intensity of human 3D point clouds in the width direction
F11 Zc The center coordinate of the reflection intensity of human 3D point clouds in the height direction

The mean value of each posture on the three-dimensional coordinates is different.


Therefore, the fourth, fifth, and sixth object features, F3, F4, and F5, are defined as the mean
values on the x-axis, y-axis, and z-axis, which are represented by Xmean , Ymean , and Zmean .
The calculation formula is expressed as:

F3 : Xmean = mean( X ) (9)

F4 : Ymean = mean(Y ) (10)

F5 : Zmean = mean( Z ) (11)


Similarly, the standard deviation of each posture on the three-dimensional coordinates
are defined as the 7-th, 8-th, and 9-th features, which are represented by Xsd , Ysd , and Zsd .
The calculation formula is expressed as:
$$F6: X_{sd} = \sqrt{\sum_{i=1}^{n}(X_i - X_{mean})^2 / n} \quad (12)$$

$$F7: Y_{sd} = \sqrt{\sum_{i=1}^{n}(Y_i - Y_{mean})^2 / n} \quad (13)$$

$$F8: Z_{sd} = \sqrt{\sum_{i=1}^{n}(Z_i - Z_{mean})^2 / n} \quad (14)$$
where n denotes the number of points in the point cloud, Xi denotes the coordinate value
of the i-th point on the x-axis, Yi is the coordinate value of the i-th point on the y-axis, and
Zi represents the coordinate value of the i-th point on the z-axis.
The amplitude of the reflected radar echo signal determines the intensity of the target
point cloud’s reflection. Radar Cross Section (RCS) is often utilized to characterize the echo
strength of an object under the illumination of radar waves. The value of RCS is influenced
by the size of the object. The RCS is greater and the reflection intensity higher for the
human thoracic cavity due to its larger reflection area. Because the reflection intensity
distribution of different postures is different, the center coordinates of the reflection intensity
in different coordinate dimensions of the point cloud have been used as features, which are
represented by Xc , Yc , and Zc respectively. The calculation formula is:

$$F9: X_c = \frac{\sum_{i}^{n} X_i \cdot SNR_i}{\sum_{i}^{n} SNR_i} \quad (15)$$

$$F10: Y_c = \frac{\sum_{i}^{n} Y_i \cdot SNR_i}{\sum_{i}^{n} SNR_i} \quad (16)$$

$$F11: Z_c = \frac{\sum_{i}^{n} Z_i \cdot SNR_i}{\sum_{i}^{n} SNR_i} \quad (17)$$
where SNRi represents the signal-to-noise ratio of the i-th point.
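A compact numpy sketch of the twelve features of Equations (6)-(17); the array names are illustrative.

```python
import numpy as np

def extract_features(x, y, z, snr):
    """x, y, z: coordinates of the clustered posture point cloud;
    snr: per-point signal-to-noise ratio used as the reflection weight."""
    def intensity_centre(c):                       # Equations (15)-(17)
        return np.sum(c * snr) / np.sum(snr)

    return np.array([
        x.max() - x.min(),                         # F0: length, Equation (6)
        y.max() - y.min(),                         # F1: width, Equation (7)
        z.max() - z.min(),                         # F2: height, Equation (8)
        x.mean(), y.mean(), z.mean(),              # F3-F5, Equations (9)-(11)
        x.std(), y.std(), z.std(),                 # F6-F8, Equations (12)-(14)
        intensity_centre(x),                       # F9
        intensity_centre(y),                       # F10
        intensity_centre(z),                       # F11
    ])
```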
(2) Feature selection: Among the twelve features extracted from the point cloud
data, not all of them can achieve the optimal classification of the target posture, and the
effectiveness of point cloud data classification is related to the contribution of the feature to
the classification. To omit unimportant features and improve the efficiency of classification,
it is necessary to rank the importance of features.
Random forest algorithms can achieve feature importance ranking. The algorithm
consists of multiple decision trees. The importance order is based on the contribution made
by the feature in each decision tree. The calculation method of the contribution is to solve
the difference of the Gini index before and after the branch of the feature on a certain node.
The same method is applied to other features, and finally, the change value of a certain
characteristic Gini index is divided by the change value of all the characteristic Gini indices
to obtain the normalized contribution of the feature, the features are sorted based on the
Sensors 2023, 23, 7208 11 of 20

size of the contribution [37]. The formula for calculating the Gini index of the i-th tree node
q is as follows:
$$Gini_q^{(i)} = 1 - \sum_{c=1}^{m} \left( p_{qc}^{(i)} \right)^2 \quad (18)$$

where m represents the number of categories, and $p_{qc}^{(i)}$ represents the proportion of
category c in node q on the i-th tree.
The importance of feature j in node q of the i-th tree, $VIM_{jq}^{(Gini)(i)}$, is the change of the
Gini index before and after the branching of node q. The calculation formula is
expressed as:

$$VIM_{jq}^{(Gini)(i)} = Gini_q^{(i)} - Gini_e^{(i)} - Gini_r^{(i)} \quad (19)$$

where $Gini_e^{(i)}$ and $Gini_r^{(i)}$ denote the Gini indices of the two new nodes e and r after
branching, respectively.
When there are L decision trees in the random forest and the set of nodes where feature j
appears in decision tree i is denoted Q, the importance of feature j can be expressed as:

$$VIM_j^{(Gini)} = \sum_{i=1}^{L} \sum_{q \in Q} VIM_{jq}^{(Gini)(i)} \quad (20)$$

Normalize all feature importance scores to obtain:

$$VIM_j = \frac{VIM_j^{(Gini)}}{\sum_{j}^{J} VIM_j^{(Gini)}} \quad (21)$$

where J represents the total number of features.


Python was used to import the calculated point cloud feature data into the defined
random forest classifier and set 100 decision trees in the random forest model. The order
of importance of all point cloud features is shown in Figure 9. The abscissa is the point
cloud feature defined above, and the ordinate represents the importance of the feature.
In this paper, six features with high importance are selected from the extracted features
as classification features. It can be seen from the figure that F4, F5, F6, F7, F10, and F11
account for a sizeable proportion. From the point of view of physical significance, different
postures are different in height, so the mean value of height direction is the most beneficial
to distinguish postures. The body centroid coordinates of different postures are different in
the width direction, so the mean value of the width direction is also useful to distinguish
postures. In the direction of length, however, the positions of all gestures are constant, so
the mean value in the direction of length does not change significantly and it is difficult
to distinguish postures. There is a large gap between hands up and other postures in
the direction of length and width, so it is reasonable to choose the standard deviation of
length and width as the distinguishing standard. From the perspective of body reflection
intensity, the positions of the main body parts (chest) in width and height are different
in various postures, which will cause different central coordinates of reflection intensity
in the direction of width and height, while the central coordinates of reflection intensity
in the direction of length do not change significantly. Therefore, the central coordinates
of reflection intensity in the direction of width and height are also important features to
distinguish postures. Combining the result of the random forest feature importance ranking
and the physical significance perspective of the features, these six features were selected to
classify the postures.
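A sketch of this ranking with Scikit-Learn's random forest, whose Gini-based feature_importances_ are the normalized contributions of Equations (18)-(21); X and y denote the assumed (n frames × 12) feature matrix and posture labels.

```python
from sklearn.ensemble import RandomForestClassifier

def rank_features(X, y):
    """Fit a 100-tree random forest and return features sorted by importance."""
    forest = RandomForestClassifier(n_estimators=100, criterion="gini")
    forest.fit(X, y)
    # feature_importances_ is already normalized to sum to one (Equation (21)).
    return sorted(enumerate(forest.feature_importances_),
                  key=lambda item: item[1], reverse=True)

# Example output: [(4, 0.21), (5, 0.19), ...] meaning F4 and F5 rank highest.
```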

Figure 9. Ranking the importance of extracted point cloud features.

3.3. Machine Learning Model

Various machine learning algorithms are not inherently good or bad; the focus is
to evaluate the performance of different machine learning models accurately and
determine the most accurate classification model when faced with complex application
problems. This paper selects six machine learning models that work differently, which
provides an opportunity to determine the best model for millimetre-wave point cloud
posture classification. A brief introduction of all adopted machine learning models is
given below.
(1) KNN: K-nearest neighbor is a non-parametric learning method. When a new sample
is input, the algorithm finds the K training samples that are most similar to the
new sample, so the only adjustable parameter of KNN is the value of K. The
Euclidean distance or Manhattan distance between samples is calculated as the
dissimilarity index of each sample.
(2) GP: The Gaussian process is a probabilistic, parameter-free model for regression and
classification problems. Its principle is based on Bayesian inference, which treats the
input data as random variables and models the output data as Gaussian distributions.
The algorithm is based on probabilistic kernel functions, which are used to model
correlations between input data points and to make predictions using Bayesian
inference. It is suitable for regression and classification problems and provides
predictions with confidence.
(3) SVM: The support vector machine is a classic supervised learning algorithm built
around the concept of the "margin": a hyperplane separates the two data classes on
either side, so SVM is a binary classification algorithm, and multiple binary
classifiers can be constructed to solve the multi-classification problem. Because of its
robustness in multiple application types, it is regarded as a must-try method [38].
(4) MLP: The multi-layer perceptron is a forward-structured artificial neural network
consisting of an input layer, hidden layer, and output layer. Feature data is passed
from the input layer to the hidden layer, which implements the nonlinear mapping
of the input space, and the output layer implements the classification. It is
noteworthy that features can be classified even with only one hidden layer, provided
enough units are included in the hidden layer.
(5) NB: The Bayes theorem and the premise of feature conditional independence underpin
the naive Bayes classification algorithm. The idea is to use the prior probability to
calculate the posterior probability that a sample belongs to a certain category. The
algorithm is also a type of supervised learning.
(6) GB: Gradient boosting is an efficient ensemble learning algorithm based on the
boosting principle. The algorithm continuously iterates through weak prediction
models composed of decision trees to train a strong prediction model in a way that
minimizes the error of the previous round [39]. It can handle large datasets with high
accuracy but is slower to train due to the sequential nature of gradient boosting.

3.4. Multi-Class Evaluation Index


The purpose of this research is to comprehensively evaluate the performance of
different machine learning models for the classification of six human postures. To achieve
this purpose, traditional performance metrics are used: precision, recall, and F1 score.
For classification problems, consistency in classification refers to the agreement between
model predictions and actual classifications [40]. In the background of FMCW millimetre-
wave human posture point cloud classification performance evaluation, the Kappa index
is introduced for consistency checking, because the Kappa index captures the relationship
between predicted accuracy and expected (chance) accuracy, two of the most important
indicators. Therefore, it is meaningful to introduce the Kappa index as an evaluation index
for the classification performance of machine learning models. Meanwhile, in this research
there are 100 frames of point cloud data for each posture, and the number of target points
differs between frames, so data imbalance is inevitable when the data set is divided; the
Kappa index can weaken the influence of unbalanced data on the classification results.
Furthermore, the classification
outputs of each machine learning model for individual postures are visualized using ROC
curves. Since the calculation of the Kappa index is based on a confusion matrix, this paper
generates a corresponding confusion matrix for six machine learning models, respectively,
to verify whether the introduced Kappa index can be used as an evaluation index for the
performance of machine learning models. The calculation formula of each performance
index is expressed as:
$$A = \frac{TP + TN}{TP + TN + FP + FN} \quad (22)$$

$$P = \frac{TP}{TP + FP} \quad (23)$$

$$R = \frac{TP}{TP + FN} \quad (24)$$

$$F1 = \frac{2 \times P \times R}{P + R} \quad (25)$$

$$K = \frac{A - E}{1 - E} \quad (26)$$
where A represents accuracy, P precision, R recall, and F1 and K the F1 score and Kappa
index, respectively. TP is the number of samples predicted positive that are actually
positive, FP the number predicted positive that are actually negative, FN the number
predicted negative that are actually positive, and TN the number predicted negative that
are actually negative. E represents the expected accuracy of the classifier based on the
confusion matrix, expressed mathematically as:

$$E = \frac{(TP + FN)(TP + FP) + (TN + FN)(TN + FP)}{(TP + TN + FP + FN)^2} \quad (27)$$
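A small sketch computing the accuracy and Kappa index from a multi-class confusion matrix; the expected accuracy below generalizes Equation (27) beyond two classes and agrees with sklearn.metrics.cohen_kappa_score.

```python
import numpy as np

def kappa_index(cm):
    """cm: confusion matrix with true labels on rows, predictions on columns."""
    total = cm.sum()
    accuracy = np.trace(cm) / total                               # Equation (22)
    # Expected accuracy: per-class product of row and column marginals,
    # which reduces to Equation (27) in the binary case.
    expected = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / total**2
    return (accuracy - expected) / (1.0 - expected)               # Equation (26)
```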

The accuracy rate represents the ratio of correctly recognized postures to the total
number of postures. While the accuracy rate can judge the overall correct rate, it is not
a perfect indicator in the case of imbalanced samples. Precision and recall, commonly
used for classification evaluation, may ignore sample imbalances. Precision represents the
probability that each recognized posture is correct, and recall represents the probability
that a certain posture is recognized correctly. It can be seen from their definitions that the
two indicators are in tension with each other, and the F1 score is a combination of precision
and recall, which can evaluate a classifier more comprehensively.
In the actual classification process, an uneven number of samples in each category
causes the model to favor the large categories and give up on the small ones, particularly
for multi-classification problems. According to the Kappa calculation formula, the more
imbalanced the confusion matrix is, the higher the E value and the lower the K value, so a
model with significant bias can be detected. Labels are assigned to different Kappa ranges,
as illustrated in Table 3 (see [41] for details).

Table 3. Labels corresponding to different kappa indexes.

Kappa Index (%) Label


Less than 0 Poor
0–20 Slight
21–40 Fair
41–60 Moderate
61–80 Substantial
81–100 Nearly perfect

Confusion matrices tabulate actual versus predicted values and are used in the proposed
model to visualize the performance of the machine learning classifiers. The deeper the
color depth of the diagonal line, the higher the recognition accuracy.
The ROC curve, the receiver operating characteristic curve, is a diagram that can be
used to evaluate, represent, and select forecasting systems. The curve has two parameter
values, the true positive rate (TPR) and the false positive rate (FPR), which are expressed
mathematically as:

$$TPR = \frac{TP}{TP + FN} \quad (28)$$

$$FPR = \frac{FP}{FP + TN} \quad (29)$$
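A sketch of producing the per-posture ROC curves in a one-vs-rest fashion with Scikit-Learn; the layout of the score matrix is an assumption.

```python
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

def roc_per_posture(y_true, y_score, n_classes=6):
    """y_true: posture labels 0-5; y_score: (n_samples, n_classes) scores.
    Returns {class: (fpr, tpr, auc)} for the six one-vs-rest ROC curves."""
    y_bin = label_binarize(y_true, classes=np.arange(n_classes))
    curves = {}
    for c in range(n_classes):
        fpr, tpr, _ = roc_curve(y_bin[:, c], y_score[:, c])
        curves[c] = (fpr, tpr, auc(fpr, tpr))
    return curves
```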

4. Results and Discussions


Classifications are conducted based on the research method in Section 3. All adopted
machine learning model parameters are shown in Table 4, and more details given in the
table can be found in Scikit-Learn, a Python-based machine learning library [42].

Table 4. Machine learning parameters used.

ML Model Parameter Detail


KNN n_neighbors = 5, weights = ‘uniform’, algorithm = ‘auto’
GP kernel = 1.0 ∗ rbf(1.0), random_state = 0
SVM C = 33, kernel = ‘rbf’
MLP hidden_layer_sizes = (175), activation = ‘relu’, solver = ‘lbfgs’
NB priors = None
GB Loss = deviance, learning_rate = 0.1, n_estimators = 100
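As a sketch, the Table 4 configurations map onto Scikit-Learn estimators as follows, combined with the 8:2 split and 5-fold cross-validation described in the next paragraph; the data handling around the models is assumed ('deviance' in Table 4 is the default log-loss of GradientBoostingClassifier).

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

MODELS = {
    "KNN": KNeighborsClassifier(n_neighbors=5, weights="uniform", algorithm="auto"),
    "GP": GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=0),
    "SVM": SVC(C=33, kernel="rbf"),
    "MLP": MLPClassifier(hidden_layer_sizes=(175,), activation="relu", solver="lbfgs"),
    "NB": GaussianNB(priors=None),
    "GB": GradientBoostingClassifier(learning_rate=0.1, n_estimators=100),
}

def evaluate(X, y):
    """8:2 train/test split plus 5-fold cross-validation on the training set."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
    for name, model in MODELS.items():
        scores = cross_val_score(model, X_train, y_train, cv=5)
        print(f"{name}: mean 5-fold accuracy = {scores.mean():.3f}")
```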

For the data set division of point cloud feature data, this study randomly divided
six hundred sets of point cloud data into the training set and testing set according to the
ratio of 8:2, and input them into the machine learning model. To ensure the reliability
of the results, 5-fold cross-validation was used to analyze the accuracy and Kappa index.
Figure 10 presents A and K for all adopted machine learning models in the form of a bar
graph. It can be seen from Figure 10a that MLP has the highest accuracy, reaching 94%,
followed by KNN, SVM, and GB, which are close to each other, while GP and NB have
poorer accuracy, at only 90.5% and 87.5%, respectively. As can be seen from Figure 10b,
except that KNN and SVM have different trends in accuracy, the other Kappa indexes are
consistent with accuracy. MLP has the highest K value, followed by KNN. Among them,
NB has the lowest K value and its accuracy rate is also the lowest, which proves that NB
is not suitable for classification on this paper's dataset.

Figure 10. Performance of six machine learning models on Accuracy and Kappa index. (a) The accuracy of various machine learning models; (b) The Kappa index of various machine learning models.
In the case of multi-classification, a confusion matrix can be used to represent the model's performance, where the horizontal direction is the predicted label and the vertical direction is the true label. The accuracy can be understood as the sum of the diagonal entries divided by the sum of the entire confusion matrix. Therefore, the larger the diagonal entries and the smaller the off-diagonal entries, the higher the recognition accuracy. The confusion matrices for this study are shown in Figure 11. It is obvious from the confusion matrices that MLP and KNN have higher accuracy. This is consistent with the conclusion drawn from the Kappa index, so the evaluation of the machine learning models by the Kappa index is verified. Among them, MLP's recognition accuracy for the horse stance is only 86%, with a 14% probability of it being mistaken for the sitting posture, while KNN's recognition rate for the sitting posture is not high, at 89%, with a 7% probability of it being mistaken for the horse stance and a 4% probability of it being mistaken for the lunge. This may be because the point cloud shapes of the horse stance and sitting postures are somewhat similar, with the main difference lying in the height of the human posture, which is why the sitting and horse stance postures are easily confused. Across all six algorithms, the recognition accuracy is 100% for both the lying and standing postures, which also proves the validity of the point cloud data generated in this paper.
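As a concrete illustration of how the matrices in Figure 11 are read, the following minimal sketch computes a raw and a row-normalised confusion matrix with scikit-learn; the placeholder labels and predictions stand in for one model's test-set output and are not the paper's data.

```python
# A minimal confusion-matrix sketch; y_test/y_pred are placeholders.
import numpy as np
from sklearn.metrics import confusion_matrix

y_test = ["sitting", "horse stance", "lying down", "sitting", "standing"]
y_pred = ["sitting", "sitting",      "lying down", "sitting", "standing"]

postures = ["hands up", "horse stance", "lunge", "lying down", "sitting", "standing"]

# Rows are true labels, columns are predicted labels (as in Figure 11).
cm = confusion_matrix(y_test, y_pred, labels=postures)

# Overall accuracy: the diagonal sum over the grand total.
accuracy = np.trace(cm) / cm.sum()

# Row-normalised matrix: each row gives the per-class recognition rates
# quoted in the text (e.g., 0.86 for MLP on the horse stance).
cm_norm = confusion_matrix(y_test, y_pred, labels=postures, normalize="true")
```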
AUC refers to the area under the ROC curve and is often used as an indicator of the
The ROC curve is drawn with different thresholds and is based on the confusion
model’s strength or weakness. The value range is (0.5–1), with a bigger value indicating a
matrix, with TPR and FPR as the axes. In general, the curve’s turning point is near (0, 1)—
stronger categorization effect. The macro average is to average these area values, and the
micro average needs to consider the values of each dimension. Figure 12 shows that the
classification performance of SVM, MLP, and GB is slightly better. The labels [0, 1, 2, 3, 4, 5]
in the figure correspond to hands up, horse stance, lunges, lying down, sitting, and standing,
respectively. From the area, it can be seen that the values of horse stance and sitting are
relatively small.
refers to the area under the ROC curve and is often used as an indicator of the model’s
strength or weakness. The value range is (0.5–1), with a bigger value indicating a stronger
strength or weakness. The value range is (0.5–1), with a bigger value indicating a stronger
categorization effect. The macro average is to average these area values, and the micro
categorization effect. The macro average is to average these area values, and the micro
average needs to consider the values of each dimension. Figure 12 shows that the
average needs to consider the values of each dimension. Figure 12 shows that the
classification performance of SVM, MLP, and GB is slightly better. The labels [0, 1, 2, 3, 4,
classification performance of SVM, MLP, and GB is slightly better. The labels [0, 1, 2, 3, 4,
5] in the figure correspond to hands up, horse stance, lunges, lying down, sitting, and
Sensors 2023, 23, 7208 5] in the figure correspond to hands up, horse stance, lunges, lying down, sitting, and 16 of 20
standing, respectively. From the area, it can be seen that the values of horse stance and
standing, respectively. From the area, it can be seen that the values of horse stance and
sitting are relatively small.
sitting are relatively small.

Figure 11. Confusion matrix for all classification models. (a) KNN; (b) GP; (c) SVM; (d) MLP; (e) NB; (f) GB.

Figure 12. ROC curves of all classification models. (a) KNN; (b) GP; (c) SVM; (d) MLP; (e) NB; (f) GB.
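A minimal sketch of the one-vs-rest computation behind curves such as those in Figure 12 is given below; `clf`, `X_test`, and `y_test` are placeholders for a fitted classifier with probability outputs and the held-out test split, and are not taken from the paper's code.

```python
# A minimal one-vs-rest ROC/AUC sketch; clf, X_test and y_test are placeholders.
import numpy as np
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

classes = [0, 1, 2, 3, 4, 5]          # hands up ... standing, as in Figure 12
y_bin = label_binarize(y_test, classes=classes)
y_score = clf.predict_proba(X_test)   # one probability column per class

# Per-class ROC curves and AUC values.
roc_auc = {}
for i, c in enumerate(classes):
    fpr_c, tpr_c, _ = roc_curve(y_bin[:, i], y_score[:, i])
    roc_auc[c] = auc(fpr_c, tpr_c)

# Macro average: unweighted mean of the per-class AUCs.
macro_auc = np.mean(list(roc_auc.values()))

# Micro average: pool every (label, score) pair into a single curve first.
fpr_mi, tpr_mi, _ = roc_curve(y_bin.ravel(), y_score.ravel())
micro_auc = auc(fpr_mi, tpr_mi)
```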

Combining the accuracy, precision, recall, F1 score, Kappa value, confusion matrix, and ROC curve, it can be concluded that MLP is the classification model with the best comprehensive performance on the human posture point cloud dataset, while KNN is very stable across many indicators. However, the computation time of a model is also an indicator that cannot be ignored. Figure 13 shows the training time comparison of the six models. It is evident from the figure that MLP and GB have longer computation times, while KNN and NB have the shortest. These training times are consistent with the principles of the models. KNN is a lazy learner and takes almost no training time because the training examples are simply stored. Naive Bayesian models train quickly because only one pass over the data is required to compute the frequencies or the normal probability density function; these models train orders of magnitude faster than neural network models. Gradient boosting requires many sequential iterations, which makes its training slow.

Figure 13. The training time of each algorithm.
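A minimal sketch of such a timing measurement is shown below, reusing the illustrative `models` dictionary from the Table 4 sketch; `X_train` and `y_train` are placeholders for the 80% training split.

```python
# A minimal training-time measurement sketch; X_train/y_train are placeholders.
import time

train_times = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)                       # fit on the 80% split
    train_times[name] = time.perf_counter() - start   # wall-clock seconds

for name, seconds in sorted(train_times.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.3f} s")
```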

Table 5 shows the performance of the six classification models in terms of P (precision), R (recall), and F1 (F1 score). It can easily be seen that the recognition accuracy of MLP for the three postures of the lunge, sitting, and standing is higher than that of the other models, and KNN and SVM have the highest recognition accuracy for the hands-up posture, reaching 100%. The lying posture is recognized well by all six machine learning models, while the horse stance has the worst recognition performance among the six models. The probable reason is that the lying posture is distinct from the other postures, with the highest degree of discrimination, while the horse stance is easily confused with the lunge and sitting postures, and its degree of discrimination is low.

Table 5. Performance of each model trained on 80% of the dataset and evaluated on the remaining 20% (testing dataset). All values are percentages.

Posture        KNN             GP              SVM             MLP             NB              GB
               P    R    F1    P    R    F1    P    R    F1    P    R    F1    P    R    F1    P    R    F1
hands up       100  100  100   100  100  100   100  100  100   96   100  98    100  91   95    100  96   98
horse stance   88   100  93    78   100  88    76   93   84    100  86   92    85   79   81    92   86   89
lunge          95   95   95    94   84   89    94   89   92    100  100  100   89   89   89    95   95   95
lying down     100  100  100   100  100  100   100  100  100   100  100  100   95   100  98    100  100  100
sitting        96   89   93    92   85   88    96   89   92    93   96   95    89   89   89    89   93   91
standing       100  100  100   100  100  100   100  100  100   100  100  100   89   100  94    94   100  97
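A minimal sketch of how such per-class scores are obtained is shown below; again, `y_test` and `y_pred` are placeholders for the 20% testing split and a model's predictions on it.

```python
# A minimal per-class precision/recall/F1 sketch; y_test/y_pred are placeholders.
from sklearn.metrics import classification_report, precision_recall_fscore_support

postures = ["hands up", "horse stance", "lunge", "lying down", "sitting", "standing"]

# One (precision, recall, F1, support) entry per posture, i.e. one row of Table 5.
p, r, f1, support = precision_recall_fscore_support(y_test, y_pred, labels=postures)

# Or as a ready-formatted text summary.
print(classification_report(y_test, y_pred, labels=postures, digits=2))
```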

5. Conclusions
This paper demonstrated how human posture characteristics information can be
measured using FMCW Millimetre-wave radar as well as how to apply machine learning
to develop a trained model capable of identifying human postures from the generated point cloud. The experimental study shows that FMCW millimetre-wave radar can measure the range and angle of human postures with high accuracy. The point cloud is generated from the measured feature data of human posture and serves as the initial dataset for training machine learning models to effectively recognize human postures from new FMCW measurements. Furthermore, the comprehensive performance of different human posture classification models in the context of FMCW millimetre-wave radar is compared and evaluated. The data input into the machine learning models is optimized, and the dynamic and static features of human posture are integrated to make the outline of the human posture in the data clearer. To present the data more intuitively, it is rendered as point clouds. A clustering technique (the DBSCAN algorithm) is introduced to group objects in the generated point cloud data, and the Random Forest algorithm is applied to generate a feature importance ranking.
What is noteworthy is that the selection of the optimal machine learning model from
the analysis is not one-size-fits-all, especially for a specific problem such as human posture
classification. The neural network-based MLP method outperforms other machine learning
approaches in terms of recognition accuracy, despite requiring more training time. However,
our experimental results show that the NB model has the worst accuracy under the given conditions.
Based on the proposed method and analysis of the results, future research can focus on increasing the number of trained models or combining the best two models in this classification, such as MLP and KNN, to further improve the accuracy of human posture classification.

Author Contributions: Conceptualization, methodology, software, validation, formal analysis, investigation, resources, data curation and writing—original draft preparation, G.Z., S.L. and K.Z.; writing—review and editing, G.Z. and Y.-J.L. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data presented in this study are available upon request from the
corresponding author.
Acknowledgments: The authors acknowledge the editors and reviewers for their valuable comments
and suggestions. The authors acknowledge Zefu Deng for his contribution to the data collection.
Conflicts of Interest: The authors declare no conflict of interest.


Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
