Efficient Detection of DDoS Attacks Using A Hybrid Deep Learning Model With Improved Feature Selection
Article
Efficient Detection of DDoS Attacks Using a Hybrid Deep
Learning Model with Improved Feature Selection
Daniyal Alghazzawi 1 , Omaimah Bamasag 2 , Hayat Ullah 3 and Muhammad Zubair Asghar 3, *
Abstract: DDoS (Distributed Denial of Service) attacks have now become a serious risk to the
integrity and confidentiality of computer networks and systems, which are essential assets in today’s
world. Detecting DDoS attacks is a difficult task that must be accomplished before any mitigation
strategies can be used. The identification of DDoS attacks has already been successfully implemented
using machine learning/deep learning (ML/DL). However, due to an inherent limitation of ML/DL
frameworks (so-called optimal feature selection), complete accomplishment is likewise out of reach.
This is a case in which a machine learning/deep learning-based system does not produce promising
results for identifying DDoS attacks. At the moment, existing research on forecasting DDoS attacks
has yielded a variety of unexpected predictions utilising machine learning (ML) classifiers and
conventional approaches for feature encoding. These previous efforts also made use of deep neural
networks to extract features without having to maintain the track of the sequence information. The
current work suggests predicting DDoS attacks using a hybrid deep learning (DL) model, namely a
CNN with BiLSTM (bidirectional long/short-term memory), in order to effectively anticipate DDoS
attacks using benchmark data. By ranking and choosing features that scored the highest in the
provided data set, only the most pertinent features were picked. Experiment findings demonstrate
that the proposed CNN-BI-LSTM attained an accuracy of up to 94.52 percent using the data set
CIC-DDoS2019 during training, testing, and validation.
Keywords: deep learning; DDoS attacks; hybrid deep learning; feature selection
Citation: Alghazzawi, D.; Bamasaq, O.; Ullah, H.; Asghar, M.Z. Efficient Detection of DDoS Attacks
Using a Hybrid Deep Learning Model with Improved Feature Selection. Appl. Sci. 2021, 11, 11634.
https://doi.org/10.3390/app112411634
Academic Editor: Jerry Chun-Wei Lin
Measures for DDoS mitigation should be performed in the event of DDoS
attacks, as described in [4]. Before any mitigation strategies can be used, DDoS attacks
must first be detected. DDoS attacks were initially detected by traffic engineers using
criteria they had written. This strategy appears to have fallen behind the changing and
developing pattern of DDoS attacks. Academics and industry are studying the prospect
of implementing machine learning/deep learning (ML/DL) for DDoS attack detection
as ML/DL unlocks its potential in several domains. In identifying risks, conventional
manual approaches have limited performance and high latency. Attacks can be caught
more rapidly and effectively using machine learning techniques like Naïve Bayes,
K-nearest neighbors, and support vector machines [5]. In machine learning, features for
classification must be chosen by humans or by feature selection algorithms.
In DL, on the other hand, the selection of features is an integral element of the model. Deep learning
methods like CNN and recurrent neural networks use a succession of nonlinear processing
elements to learn several levels of data interpretation from a large number of labeled data.
As a result, DL can be a useful technique for DDoS attack detection [2]. DDoS detection
using machine learning and deep learning has been shown to be successful. Section 2
will look at some prominent cases of ML/DL use for DDoS attack detection. We use the
Bi-Directional Long Short-Term Memory (BI-LSTM) after considering various possibilities.
For effective DDoS attack prediction, it is critical to explore and use state-of-the-art hybrid DL
models on DDoS data.
2. Related Works
This section summarizes and evaluates existing research papers on detecting attacks
using the different IDS techniques listed above.
Various machine learning algorithms have been used to identify DDoS attacks, mostly
as classifiers. To mention a few, there are k-nearest neighbors (KNN), the Naïve Bayes
classifier, support vector machines (SVM), random forest (RF), and artificial neural networks
(ANNs) [1,14]. One such work presented an interactive intelligent detection method for detecting
DoS/DDoS attacks. The detection algorithm made use of the random forest technique
to identify different DoS/DDoS attacks, including TCP flood, UDP flood, HTTP flood,
and slow HTTP. Ref. [15] employed bio-inspired machine learning metrics
to quickly and accurately identify HTTP flood attacks. The developers of [15] used the
Bat algorithm, which is a low-complexity algorithm, as a bio-inspired method. Ref. [16]
presented a TCP flood DDoS detection methodology. Different ML classifiers, such as SVM,
Naïve Bayes, and KNN, were all used in this model. Ref. [17] demonstrated detection
based on the covariance matrix method. The suggested detection was separated into
training and testing steps. A training phase was designed to build a typical network traffic
profile. The testing step was designed to detect any anomalous traffic by measuring the
difference between usual and any other network activity. The regular traffic was recorded
in their cloud from end-users surfing the Internet, whilst the flooding attack traffic was
created using the PageRebooter application. It was analyzed using the confusion matrix
and the findings were presented for a public and private cloud system. The authors of [18]
utilized the NB technique to accurately anticipate the occurrence of DDoS attacks based
Appl. Sci. 2021, 11, 11634 5 of 22
and tagging before being sent back into the system as new training examples. The experiment
findings show that the suggested BI-LSTM-GMM can obtain recall, accuracy, and
precision up to 94 percent using the data sets CIC-IDS2017 and CIC-DDoS2019 for training,
testing, and validation. Ref. [11] proposed a deep learning model for detecting
DDoS attacks on a collection of frames collected from network activity.
Since it incorporates feature extraction and classification methods in its structure, as
well as layers that update themselves as it learns, the DNN model can perform rapidly
and accurately even with small amounts of data. Experimentation using the CICDDoS2019 dataset,
which contains the most recent DDoS attack types produced in 2019, revealed that attacks
on network activity were identified with 99.99 percent accuracy and attack categories were
characterized with a 94.57 percent accuracy rate. The authors of [28] suggested a hybrid
deep generative model that efficiently detects malware variations by combining global
and local features. The malware is transformed into an image to effectively describe global
features using a pre-defined implicit space, while local features are retrieved from machine code
sequences. The two kinds of features retrieved from the dataset are synthesized and passed to the
intrusion detection systems. By combining all these features, the suggested model obtains
a 97.47 percent accuracy, which is considered to be state-of-the-art efficiency. The CAM
findings indicate that the created malware enhances the detection accuracy.
Research gap: While machine learning/deep learning-based approaches have been
successful, an important issue—choosing optimal features—has been unaddressed. The
authors of [11] developed a feed-forward DL-based approach for forecasting DDoS attacks
using benchmark data. It attempted to predict DDoS attacks by utilizing a DL technique
known as a feed-forward DL classifier. However, the selection of effective predictors
prior to applying deep learning to large data sets may produce encouraging outcomes.
As a result, traditional deep learning classifiers may be ineffective in predicting DDoS
attacks using benchmark data, if optimal sets of features are not selected. To address the
limitations of the baseline study [11], we propose an efficient hybrid deep learning model
(CNN+BiLSTM) augmented with feature selection. For the sake of being practically useful,
this study addresses the problem of DDoS attack detection by adding optimum feature
selection into the proposed hybrid DL-based architecture.
3. Methodology
Recent increases in the arrival rates of online data streams have placed a premium on
the amount of resources required by data mining processing systems. DataStream Mining
(i.e., stream learning) is a technique for extracting knowledge structures from an infinitely
long and ordered series of data that occurs throughout time (data in the stream) [29].
Incremental learning is a term that refers to the process of acquiring information
using stream data mining [30]. Both academics and industry have placed a premium
on incremental learning. It is a form of machine learning in which previously learned
information is applied when new examples come, and previously learned knowledge is
updated in response to the new occurrences [6].
Using two class labels, the proposed system deploys hybrid classifiers with improved
FS. Traits associated with normal behavior are labeled “normal”, and aberrant
behavior is labeled “attack”. Based on the suggestions in previous research [1,11], it was
discovered that some classifiers produce better detection results than others. To construct
the prediction models, we chose hybrid deep learning classifiers with improved FS.
on reflection utilize authorized servers, such as Domain Name Server (DNS), Lightweight
Directory Access Protocol (LDAP), Network Basic Input/Output System (NETBIOS), and
Simple Network Management Protocol (SNMP), that render various services over the
network. DDoS attacks relying on exploits, such as WebDDoS, SYN flood, UDP flood,
and UDPLag, make use of vulnerabilities in the TCP and UDP communication protocols.
The dataset has become helpful for the training of the model by eliminating extraneous
attributes from the CICDDoS2019 dataset, which we chose for the detection and charac-
terization of DDoS attacks. Packets using the TCP connection can be differentiated from
the simplified UDP packets by the SYN, ACK, FIN, URG, PSH, RST, ECE, and CWR flag
sections in the header elements. The network traffic of the first and second days is included
in [31]. The excel spreadsheets were merged and utilized entirely during the investigation.
3.2. Pre-Processing
The initial step before training the deep learning models is always to preprocess
the dataset in order to make it more appropriate for training and minimize overfitting.
Preprocessing is accomplished in the following ways:
• The CICDDoS2019 dataset in CSV file format, which we utilized in the investigations,
was condensed to facilitate simpler training because it comprises a large amount of
socket information such as flowID, destination IP, source IP, etc. To conform to the
suggested framework’s numeric composition, the non-numeric elements were converted
to numeric data using a one-hot encoding method.
• While importing the dataset, downsizing was accomplished by
omitting records at random to ensure that the sample was randomized. The
‘infinity’ value was replaced with ‘−1’, and the rows with ‘NaN’ entries were
removed. Nine attributes that contained only the value ‘0’ were removed from the dataset, and the
model was trained with 69 attributes. The nine discarded features are:
Fwd Bulk Rate (Avg), Fwd URG (Flags), Bwd URG (Flags), Fwd Bytes/Bulk (Avg),
Fwd Packet/Bulk (Avg), Bwd PSH (Flags), Bwd Bytes/Bulk (Avg), Bwd Packet/Bulk
(Avg), and Bwd Bulk Rate (Avg).
• CICDDoS2019 class tags were categorized according to reflection- and exploitation-based
attacks [31]. To identify an attack on network activity, the label ‘BENIGN’ was
tagged with a value of ‘0’, whereas all attacks were marked with a value of ‘1’.
Normalization to the range 0–1 was applied to ensure that the
quantities in the dataset did not have an undue impact on training [11].
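The cleaning steps in the list above can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' code: the function name `preprocess`, the `Label` column name, and the sample feature names are assumptions, min-max scaling is assumed as the 0–1 normalization, and the random downsampling and one-hot encoding steps are omitted for brevity.

```python
import math

def preprocess(rows, label_key="Label"):
    """Clean flow records given as dicts of feature name -> numeric value.

    Hypothetical helper: drops NaN rows, replaces infinity with -1,
    removes all-zero attributes, encodes labels, and min-max scales.
    """
    cleaned = []
    for row in rows:
        features = {k: v for k, v in row.items() if k != label_key}
        # Remove rows that contain NaN entries.
        if any(isinstance(v, float) and math.isnan(v) for v in features.values()):
            continue
        # Replace infinite values with -1.
        features = {
            k: (-1.0 if isinstance(v, float) and math.isinf(v) else float(v))
            for k, v in features.items()
        }
        # 'BENIGN' -> 0, any attack label -> 1.
        label = 0 if row[label_key] == "BENIGN" else 1
        cleaned.append((features, label))
    if not cleaned:
        return [], [], []

    # Discard attributes whose value is 0 in every remaining row.
    names = [k for k in cleaned[0][0] if any(f[k] != 0 for f, _ in cleaned)]

    # Min-max normalization of each remaining column to the range 0-1.
    lo = {k: min(f[k] for f, _ in cleaned) for k in names}
    hi = {k: max(f[k] for f, _ in cleaned) for k in names}
    X = [
        [(f[k] - lo[k]) / (hi[k] - lo[k]) if hi[k] > lo[k] else 0.0 for k in names]
        for f, _ in cleaned
    ]
    y = [label for _, label in cleaned]
    return names, X, y

# Toy records with made-up values, for illustration only.
sample = [
    {"Flow Duration": 10, "Fwd URG Flags": 0, "Flow Bytes/s": float("inf"), "Label": "BENIGN"},
    {"Flow Duration": 30, "Fwd URG Flags": 0, "Flow Bytes/s": 99.0, "Label": "Syn"},
    {"Flow Duration": float("nan"), "Fwd URG Flags": 0, "Flow Bytes/s": 5.0, "Label": "Syn"},
]
names, X, y = preprocess(sample)
```

In this toy run the third record is dropped because of its NaN entry, the all-zero flag column is discarded, and the two surviving columns are scaled to the range 0–1.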
$$\chi_c^2 = \sum_i \frac{(A_i - B_i)^2}{B_i}$$
where c represents the degrees of freedom, A represents the observed value, and B represents
the expected value in the ith class. On the original data set, the χ² test was used
to pick the most relevant features that had a strong relationship with the target variable.
The Python-based Sklearn package was used to pick relevant features, which were
selected using the SelectKBest scorer and the chi2 function, because the more optimum
features have a greater correlation with the target attribute. To build our learning model,
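The scoring and ranking described above can be illustrated with a small dependency-free sketch. It is not the authors' implementation: the helper names `chi2_scores` and `select_k_best` are hypothetical, and the observed and expected values are computed per class from non-negative feature sums, which is one common way of applying the χ² formula to feature selection. In scikit-learn, the equivalent step would use `SelectKBest` with the `chi2` score function.

```python
def chi2_scores(X, y):
    """Chi-squared score of each non-negative feature column against labels y."""
    classes = sorted(set(y))
    n_samples, n_features = len(X), len(X[0])
    class_prob = {c: sum(1 for label in y if label == c) / n_samples for c in classes}
    scores = []
    for j in range(n_features):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            # Observed value A_i: feature mass that falls in class c.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            # Expected value B_i if the feature were independent of the class.
            expected = class_prob[c] * total
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def select_k_best(X, y, k):
    """Return the column indices of the k highest-scoring features."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

# Toy data: column 0 separates the classes, column 1 carries no information.
X_demo = [[1, 0.5], [1, 0.5], [0, 0.5], [0, 0.5]]
y_demo = [1, 1, 0, 0]
demo_scores = chi2_scores(X_demo, y_demo)
demo_selected = select_k_best(X_demo, y_demo, k=1)
```

A feature whose values are spread evenly across the classes scores 0, while a feature concentrated in one class scores higher and is kept.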
The maximum pooling process yielded a matrix $L \in \mathbb{R}^{(u+q-1) \times (z+s-1)}$ of pooled feature
maps, as shown in Equation (4).
$$L = \left[\, l_{1,1},\ l_{1,2},\ l_{1,3},\ \ldots,\ l_{u+q-1,\,z+s-1} \,\right] \tag{4}$$
$$t_i = \sum w_i l_i + b \tag{5}$$
where ‘w’ represents the weight vector, ‘l’ represents the input vector, and ‘b’ represents the
bias. Equation (6) describes the SoftMax computation:
$$\mathrm{softmax}(t_i) = \frac{\exp(t_i)}{\sum_{n=1}^{m} \exp(t_n)} \tag{6}$$
$$t_1 = l_1 \times w_1 + l_2 \times w_2 + b$$
$$t_2 = 0.921$$
The softmax activation function was applied using Equation (7) to compute the probability
of each label (t1, t2):
$$\mathrm{softmax}(t_1) = \frac{\exp(t_1)}{\exp(t_1) + \exp(t_2)} \tag{7}$$
$$\mathrm{softmax}(t_1) = \frac{\exp(2.14)}{\exp(2.14) + \exp(0.921)}$$
$$\mathrm{softmax}(t_1) = \frac{8.499}{11.011} = 0.77$$
The SoftMax values for the other DDoS attack/normal classes were derived in the
same way:
$$\mathrm{softmax}(t_2) = \frac{2.512}{11.011} = 0.23$$
The T1 DDoS traffic(normal) had the highest probability, according to this calculation.
As a result, the projected DDoS attack decision was “A” based on the presented historical
traffic data (Figure 3).
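The worked softmax example can be checked numerically with a few lines of Python. This is a minimal sketch using the two logits that appear in the calculation (2.14 and 0.921); the function name `softmax` and the max-subtraction trick for numerical stability are our own choices, not taken from the paper.

```python
import math

def softmax(logits):
    """Softmax over a list of logits, shifted by the maximum for stability."""
    m = max(logits)
    exps = [math.exp(t - m) for t in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits t1 and t2 from the worked example.
probs = softmax([2.14, 0.921])
rounded = [round(p, 2) for p in probs]  # [0.77, 0.23]
```

Subtracting the maximum logit before exponentiating leaves the result unchanged but avoids overflow for large inputs.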
Algorithm 1 shows the pseudocode processes of the suggested model for predicting
DDoS attacks.
Table 5. The tested CNN + BiLSTM models’ test accuracy, test loss, and training time.
By varying the parameters of the DL model, we discovered that reducing the unit size
of the BiLSTM model leads to increased accuracy. In other words, the CNN + BiLSTM-10
model with feature selection performed best (91.52 percent) with smaller unit sizes.
During testing, it was discovered that the CNN-BiLSTM (10) model, which had a total
of 16 filters, an average filter size of 8, and a BiLSTM unit size of 10 (neurons), outperformed
all other models by 76 percent. The training time of the model is enhanced by reducing the
filter size.
Table 6. Confusion matrix based on our suggested technique for four unique occurrences (TP, FP, TN,
and FN).
                     Actual Attack    Actual Normal
Predicted Attack     0.95             0.05
Predicted Normal     0.05             0.95
Additionally, we assess our proposed model using the different metrics that are widely
utilized in intrusion detection systems. The mathematical equations of accuracy, precision,
recall, and F-score are presented below (Equations (8)–(11)).
Accuracy: the accuracy given by Equation (8) reveals the model’s overall predictive
performance. Accuracy is the proportion of all instances, attack and normal, that an IDS
model classifies correctly; it represents the total rate of success of any IDS and is
calculated as:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{8}$$
TP = true positive, TN = true negative, FP = false positive, and FN = false negative
Precision: precision, sometimes referred to as the positive predictive value, is the
proportion of instances flagged as attacks that are actual attacks. The precision
produced from Equation (9) indicates how many positive predictions are correct:
$$\mathrm{Precision}\,(p) = \frac{TP}{FP + TP} \tag{9}$$
p = precision, TP = true positive, and FP = false positive
Recall: the detection rate (DR), also known as the true positive rate (TPR) or recall, is
the percentage of properly identified malicious occurrences in relation to the total number
of malicious occurrences. Equation (10), which calculates recall, reveals how many actual
attacks are successfully detected:
$$\mathrm{Recall}\,(r) = \frac{TP}{FN + TP} \tag{10}$$
r = recall, TP = true positive, and FN = false negative
F-score: The F1 score is critical since it provides further information about the IDS’s
performance. It takes into account both false positives and false negatives. The F1 score is
advantageous in particular when the distribution of class labels is unequal or unbalanced. The
F-score, which can be calculated using Equation (11), is the harmonic mean of precision
and recall:
$$F_{score} = 2 \times \frac{P \times R}{P + R} \tag{11}$$
R = Recall, P = Precision
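Equations (8)–(11) translate directly into code. The sketch below is illustrative only: the function name `classification_metrics` is hypothetical, and the example counts (100 attack and 100 normal records) are assumed values chosen to mirror the 0.95/0.05 rates of the confusion matrix in Table 6, not figures reported in the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against empty denominators, e.g. when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Assumed counts: 100 attack and 100 normal records with 5 errors in each class.
metrics = classification_metrics(tp=95, tn=95, fp=5, fn=5)
```

With these symmetric counts all four metrics coincide at 0.95; they diverge as soon as the false positive and false negative counts differ.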
Table 7 summarizes the accuracy, recall, and F1-score of the different CNN + BiLSTM
models.
Table 7. Evaluation of the performance of CNN-BiLSTM models with and without feature selection [FS(No) = without
selection of features, FS(Yes) = with selection of features].
Model Name        Accuracy (%)        Precision (%)       Recall (%)          F1-Score (%)
                  FS(No)   FS(Yes)    FS(No)   FS(Yes)    FS(No)   FS(Yes)    FS(No)   FS(Yes)
CNN+BiLSTM-1      73       85.11      75       85         74       85         74       85
CNN+BiLSTM-2      76       86.01      78       86         76       86         74       86
CNN+BiLSTM-3      69       86.52      71       86         70       86         68       86
CNN+BiLSTM-4      64       87.56      69       87         64       87         62       87
CNN+BiLSTM-5      73       87.77      76       87         72       87         73       87
CNN+BiLSTM-6      72       91.46      77       91         73       89         74       90
CNN+BiLSTM-7      63       92.00      69       91.71      66       91         66       90.71
CNN+BiLSTM-8      77       92.05      81       92.47      77       91.31      77       91.16
CNN+BiLSTM-9      71       93.10      74       93.41      71       91.92      72       92.62
CNN+BiLSTM-10     79       94.52      80       94.74      79       92.04      79       93.44
Table 7 summarizes the accuracy, recall, and F1-score of the different CNN + BiLSTM
models, with and without feature selection. The best accuracy of 94.52 percent was attained
by our suggested model CNN-BiLSTM (10).
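To make the effect of feature selection in Table 7 easier to see, the accuracy columns can be compared directly. The short sketch below only re-uses the accuracy values from Table 7; the variable names are ours, and the gains are simple differences in percentage points rather than a statistic reported by the authors.

```python
# Accuracy (%) of CNN+BiLSTM-1 ... CNN+BiLSTM-10, copied from Table 7.
acc_without_fs = [73, 76, 69, 64, 73, 72, 63, 77, 71, 79]
acc_with_fs = [85.11, 86.01, 86.52, 87.56, 87.77, 91.46, 92.00, 92.05, 93.10, 94.52]

# Gain in percentage points from adding feature selection, per model.
gains = [round(fs - no_fs, 2) for no_fs, fs in zip(acc_without_fs, acc_with_fs)]

# Model number (1-based) with the highest accuracy when feature selection is used.
best_model = max(range(len(acc_with_fs)), key=lambda i: acc_with_fs[i]) + 1

# Average gain across the ten configurations.
mean_gain = sum(gains) / len(gains)
```

Every configuration improves with feature selection, and the tenth configuration remains the best in both columns.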
Accuracy, precision, recall, and the F-measure are all used to assess performance.
Multiple linear regression has the best incremental learning accuracy, which is roughly
78 percent on both localhost and distant virtual hosts. K-nearest neighbors reached an
accuracy of 71%. Each experiment’s details can be found in Table 8. The SVM algorithm
produced an accuracy of 74% for cloud testing.
• CNN + BiLSTM vs. Multiple Linear Regression: The purpose of this experiment
was to compare the efficacy of the proposed CNN + BiLSTM model with the research
in [1], which utilized a Multiple Linear Regression classifier to predict DDoS attacks
using historical traffic data. In terms of precision (78), recall (79), F1-score (78), and
accuracy (78), Multiple Linear Regression classifiers provided inferior results (Table 8).
The Multiple Linear Regression model’s poor performance might be attributable to a
variety of factors as identified by [1].
• CNN + BiLSTM vs. XGBoost: The objective of this investigation was to evaluate the
suggested CNN + BiLSTM model against an extreme gradient boosting (XGBoost)
classifier. As shown in Table 8, XGBoost classifiers yielded lower precision (76), recall
(76), F1-score (76), and accuracy (76). XGBoost receives a poor score because
it is susceptible to overfitting in the presence of noisy data, needs a longer training
period, and is difficult to tune [38].
• SVM vs. CNN + BiLSTM: The objective of this investigation was to compare the
efficiency of the suggested CNN + BiLSTM model with that of an SVM classifier in predicting
DDoS attacks using historical traffic. SVM classifiers performed worse in terms of
precision (74), recall (74), F1-score (74), and accuracy (74), as seen in Table 8. The SVM
model’s poor performance might be attributed to: (i) long training times, (ii) expensive
computation, (iii) increased size requirements for training and testing, and (iv) more
complexity [9].
• CNN + BiLSTM vs. Random Forest: the purpose of this experiment was to see how
well the suggested CNN + BiLSTM model compared to a random forest (RF) classifier.
Table 8 demonstrates that RF classifiers have lower precision (75), recall (75), F1-score
(75), and accuracy (75) than the proposed system. The RF model’s poor performance is
due to the following factors: (i) its predictions take time, (ii) it is unreliable
for categorical attributes, and (iii) comparable sets of related attributes in the data are
favored over bigger sets [39].
• CNN + BiLSTM vs. Logistic Regression: the objective of this experiment was to
evaluate the suggested CNN + BiLSTM model against a logistic regression (LR)
classifier. As shown in Table 8, LR classifiers provided lower precision (64), recall (64),
F1-score (64), and accuracy (64). LR is rated poor because it is
prone to overfitting [39] and only makes relatively brief predictions [39].
• CNN + BiLSTM vs. KNN: The objective of this investigation was to evaluate the
suggested CNN + BiLSTM model against a k-nearest neighbors (KNN) classifier.
Table 8 demonstrates that KNN classifiers have lower precision (71), recall (71), F1-
score (71), and accuracy (71). KNN is a low-ranking algorithm because it is (i) time-
consuming when working with big data sets, and (ii) sensitive to irrelevant and noisy
data [39].
Proposed model
(without FS)
Proposed CNN + BiLSTM: accuracy 94.52, precision 94.74, recall 92.04, F1-score 93.44
• CNN + BiLSTM vs. CNN: the purpose of this experiment was to evaluate the suggested
CNN + BiLSTM model against a single-layer CNN model in terms of effectiveness.
As shown in Table 9, the CNN model demonstrated lower precision, recall,
F1-score, and accuracy. CNN is ranked low since it (i) lacks information about the
sequence context of the input and (ii) needs a large data set to give an enhanced
classification performance.
• CNN + BiLSTM vs. LSTM: this investigation compared the suggested CNN + BiLSTM
model’s effectiveness against that of an LSTM model. As shown in Table 9, the
LSTM model demonstrated lower precision, recall, F1-score, and accuracy. LSTM
models retain only previous contextual knowledge and discard upcoming contextual
information, which would aid in interpreting the traffic sequence.
As such, it performed the worst of all the models.
• CNN + BiLSTM vs. BiLSTM: this investigation compared the proposed CNN +
BiLSTM model against a BiLSTM model in terms of predicting DDoS attacks from
historical traffic data. As shown in Table 9, the BiLSTM model performed worse in
terms of precision, recall, F1-score, and accuracy. A BiLSTM model’s principal aim is
to store contextual information for both forward and reverse directions in a sequence.
BiLSTM is ranked low due to its ineffectiveness in extracting features.
• CNN + BiLSTM vs. CNN + LSTM: the purpose of this experiment was to evaluate the
suggested CNN + BiLSTM model against a CNN+LSTM model. The CNN + LSTM
model underperformed in terms of precision, recall, F1-score, and accuracy, as shown
in Table 9. This is because the unidirectional LSTM layer is ineffective at retaining
subsequent contextual information, leading to suboptimal efficiency.
• CNN + BiLSTM vs. RNN: the aim of this experiment was to evaluate the suggested
CNN + BiLSTM model against an RNN model in terms of effectiveness. As shown
in Table 9, the RNN model achieved suboptimal performance in terms of precision,
recall, F1-score, and accuracy. Due to the fact that RNN models are unable to man-
age exceptionally long-term sequencing, they would not retain information for an
extended length of time. As a consequence, the RNN model produced suboptimal
outcomes.
• CNN + BiLSTM vs. CNN + RNN: the purpose of this experiment was to contrast the
suggested CNN + BiLSTM model against a CNN+RNN model. As seen in Table 9,
the CNN + RNN model performed poorly in terms of precision, recall, F1-score, and
accuracy. This is because RNN models do not retain context data over extended
periods of time.
The above comparison tests show that the suggested CNN + BiLSTM model out-
performs other deep learning models (LSTM, GRU, CNN, BiLSTM, and RNN) in terms
of precision, recall, F1-score, and accuracy. This resulted in an increase in classification
accuracy when two deep learning models, namely CNN and BiLSTM, were combined
along with feature selection.
Explanation for better outcomes: our work proposes combining a BiLSTM with a CNN
model. The capacity of BiLSTM models to store two-directional context data efficiently,
both forward (next) and backward (previous), is the chief factor for the suggested model’s
enhanced performance. Its improved representation of the input data, gained via the
CNN model, allows it to gather information not only from the current input but also
from previous ones, preventing information decay. This results in effective DDoS attack
prediction using historical traffic data, since the BiLSTM model maintains both present and
previous context data, whereas the CNN model extracts just localized features. Due to
the improved representation of the input, this results in high classification results.
This study makes a unique contribution by demonstrating the capability of hybrid DL
with effective feature selection for anomaly detection methods. To classify incoming
traffic into legitimate or malicious categories, we presented a hybrid DL method based
on CNN-BiLSTM. In comparison to shallow learners, our method outperformed them in
terms of accuracy, recall, F1-measure, and precision. Because of their capacity to cope
with a large degree of complicated nonlinear relationships, DL methods hold promise for
accurately detecting intrusions. It may be used to overcome the limitations of conventional
classification approaches, which rely on classical feature encoding to detect anomalous
traffic [5].
Table 11. Cross validation of the proposed system contrasted with different classifiers.
Classifiers                 Mean       Standard    Mean Precision   Standard    Mean Recall   Standard    Mean F1   Standard
                            Accuracy   Deviation   Macro            Deviation   Macro         Deviation   Macro     Deviation
Random Forest               82         0.06        83               0.05        83            0.06        83        0.07
SVM                         72         0.06        72               0.06        72            0.06        72        0.06
XGBoost                     84         0.07        83               0.06        83            0.06        83        0.07
CNN                         86         0.07        87               0.05        86            0.05        83        0.06
BiLSTM                      85         0.06        85               0.05        83            0.05        84        0.05
CNN + BiLSTM (proposed)     93.11      0.05        91               0.04        91            0.05        92        0.05
Table 12. Significant differences between the SVM (ML) and CNN + BILSTM (DL) models.
After randomly choosing 100 records from the dataset, the individual records were
classified using the CNN + BiLSTM (DL) and SVM (ML) classifiers. Two hypotheses were
tested in the experimental context:
H_null: The error rates of both models are identical.
H_alternate: The error rates of both models are significantly different.
Equation (12) gives McNemar’s chi-squared test statistic:
$$\chi^2 = \frac{(|x - y| - 1)^2}{x + y} \tag{12}$$
Here x and y are the counts in the two discordant cells (records classified correctly by one
model but not by the other), χ² is the chi-squared statistic, and the test has 1 degree of freedom.
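Equation (12) and its p-value can be computed with the Python standard library alone. The sketch below is a minimal illustration under stated assumptions: the function name `mcnemar_test` and the discordant counts in the example are hypothetical (the paper does not list the cell counts), and the p-value uses the identity that the upper tail of a chi-squared distribution with one degree of freedom equals erfc(sqrt(statistic / 2)).

```python
import math

def mcnemar_test(x, y):
    """McNemar's test with continuity correction.

    x, y: counts of the two kinds of discordant pairs (one model right,
    the other wrong). Returns (chi-squared statistic, p-value).
    """
    if x + y == 0:
        raise ValueError("McNemar's test needs at least one discordant pair")
    statistic = (abs(x - y) - 1) ** 2 / (x + y)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical discordant counts, for illustration only.
stat, p = mcnemar_test(10, 4)
```

The resulting p-value is then compared with a significance level chosen in advance; that level is a parameter of the analysis rather than something the function decides.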
Analysis
Table 9 demonstrates the CNN + BiLSTM model’s utility, showing a significant
improvement in predicting DDoS attacks from historical traffic data, with an accuracy of
94.52 percent. The weaker performance of the SVM model is seen in Table 8, since it scored
poorly on all estimation metrics: precision, recall, F1-score, and accuracy. According to the
significance test, the disparity between the DL model (CNN + BiLSTM) and the ML model (SVM)
was substantial. Using continuity correction, the p-value for McNemar’s test
was computed. The chi-squared statistic was 2.2, with one degree of freedom and a
p-value of 0.135 for two-tailed analysis. A p-value less than 0.5 validated the alternative
hypothesis and disproved the null hypothesis. As a consequence, the suggested model
outperformed the SVM model based on conventional features by a
statistically significant margin. This demonstrates how the learned feature representation
enhanced the CNN + BiLSTM model’s ability to detect DDoS attacks
using historical traffic data.
4.7. Comparing the Proposed System to Existing Systems and Qualitative Evaluation
This section compares the proposed approach to benchmark studies. It is challenging
to do a real comparison of the stated procedures due to a variety of constraints. For
example, such algorithms are evaluated on a variety of datasets, making comparison
complicated. Furthermore, the participating researchers offer the methodologies in their
studies at an abstract level with little information, which could render them unfeasible for
future investigations.
Bearing the aforesaid difficulties in mind, we implemented the strategies outlined in
the published works using two datasets. During implementation, we did our best to adhere
to the original experiment and procedure described in the papers; nevertheless, owing to
inadequate discussions and an absence of adequate information in certain cases, we had
to remove such aspects of the technique or presume what the researchers wanted. For
example, using historical traffic, Ref. [1] suggested a supervised ML model for predicting
DDoS attacks. On historical traffic, an ML algorithm called Multiple Linear Regression was
used. The model’s performance is poor, as evidenced by the experimental results (accuracy:
75%, precision: 75%, recall: 75%, and F1-score: 75%), obtained on the given benchmark
dataset. However, the authors did not specify system parameters or feature
engineering details, so our settings may differ from theirs.
We conducted a quantitative evaluation of the various DDoS attack detection algorithms
using two cutting-edge datasets obtained from [27]. We used an Anaconda-based
Jupyter notebook to apply known methodologies [12]. The findings from our tests diverge
from the stated results in a few instances owing to the use of various datasets, parameteri-
zation, and software. For example, although Sambangi and Gondi [1] claimed 85 percent
accuracy, we achieved 75 percent; Refs. [9,12] reported 84 percent accuracy, while our stud-
ies generated 74 percent on the CICDDoS2019 dataset and 77 percent on the CIC-IDS2017
dataset. The variations in the claimed and tested accuracies were induced by the authors’
utilization of multiple datasets. In the study on DDoS traffic data, Ref. [12] used the DL
method. It was discovered that combining better feature selection approaches with a DL
model might increase the model’s efficacy. In another work, Ref. [11] proposed a super-
vised DL model for predicting DDoS attacks based on historical traffic. A deep learning
method called the feed forward model was employed to analyze historical traffic. The
model’s performance on the provided benchmark dataset demonstrated low results in the
absence of an optimal set of features. In our implementation, a hybrid deep learning model,
specifically CNN + BiLSTM with improved FS, outperformed earlier methodologies,
and further research into various combinations of deep learning
models for predicting DDoS attacks may yield even better results. The suggested
DL-based solution for predicting DDoS attacks was based on a hybrid deep neural network
model and an enhanced feature selection strategy. The experimental findings show that
the suggested approach outperforms baseline research (Table 13), and that the selected
predictor factors (10) have a substantial impact on the projected (target) variable.
Table 13. The suggested model and baseline results are compared (A: Accuracy, P: Precision, R: Recall, and F: F1-score).
5.1. Limitations
However, significant drawbacks of the proposed system include:
(i) The use of a single data set;
(ii) The use of a single statistical technique, the chi-squared measure, to identify important
features (predictors);
(iii) The use of embeddings rather than a pre-trained CNN model; and
(iv) The use of a binary classification scheme that categorizes input traffic only as
normal or malicious.
Additionally, by combining the CICDDoS2019 dataset with hybrid DL, we can give
direction to other academics focusing on DDoS vulnerability scanning. When it comes
to detecting intrusions and securing software-based networks, the
hybrid DL model with improved FS appears to be an excellent choice due to its improved accuracy.
In the future, we intend to create a dataset similar to the CICDDoS2019 dataset by
capturing network activity through virtual computers and Internet of Things devices.
By applying DNN and other deep learning models to the dataset that will be created, it
will be possible to identify real-time DDoS attacks and plan appropriate responses.
Author Contributions: Conceptualization, D.A. and O.B.; methodology, D.A. and O.B.; software, H.U.
and M.Z.A.; validation, M.Z.A. and H.U.; formal analysis, D.A.; investigation, O.B.; resources,
D.A. and O.B.; data curation, O.B. and H.U.; writing (original draft preparation), M.Z.A. and
D.A.; writing, D.A. and H.U.; visualization, O.B. and M.Z.A.; supervision, D.A.; project
administration, D.A.; funding acquisition, D.A. All authors have read and agreed to the
published version of the manuscript.
Funding: This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz
University, Jeddah, under Grant No. (RG-95-611-42). The authors, therefore, acknowledge with
thanks DSR's technical and financial support.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The data underlying this work can be requested from the correspond-
ing author.
Acknowledgments: This project was funded by the Deanship of Scientific Research (DSR) at King
Abdulaziz University, Jeddah, under Grant No. (RG-95-611-42). The authors, therefore, acknowledge
with thanks DSR's technical and financial support.
Conflicts of Interest: The authors declare that they have no conflict of interest.
References
1. Sambangi, S.; Gondi, L. A Machine Learning Approach for DDoS (Distributed Denial of Service) Attack Detection Using Multiple
Linear Regression. Proceedings 2020, 63, 51. [CrossRef]
2. Shieh, C.S.; Lin, W.W.; Nguyen, T.T.; Chen, C.H.; Horng, M.F.; Miu, D. Detection of Unknown DDoS Attacks with Deep Learning
and Gaussian Mixture Model. Appl. Sci. 2021, 11, 5213. [CrossRef]
3. Genie-Networks. DDoS Attack Statistics and Trends Report for 2020. 2021. Available online: https://www.genie-networks.com/
gnnews/DDoS-attack-statistics-and-trends-report-for-h1-2020/ (accessed on 6 May 2021).
4. Jonker, M.; Sperotto, A.; Pras, A. DDoS Mitigation: A measurement-based approach. In Proceedings of the NOMS 2020—2020
IEEE/IFIP Network Operations and Management Symposium, Budapest, Hungary, 20–24 April 2020; pp. 1–6.
5. Alsaeedi, A.; Bamasag, O.; Munshi, A. Real-Time DDoS flood Attack Monitoring and Detection (RT-AMD) Model for Cloud
Computing. In Proceedings of the 4th International Conference on Future Networks and Distributed Systems (ICFNDS), Saint
Petersburg, Russia, 26–27 November 2020; pp. 1–5.
6. Khattak, A.; Asghar, M.Z.; Ali, M.; Batool, U. An efficient deep learning technique for facial emotion recognition. Multimed. Tools
Appl. 2021. [CrossRef]
7. Khattak, A.; Khan, A.; Ullah, H.; Asghar, M.U.; Arif, A.; Kundi, F.M.; Asghar, M.Z. An Efficient Supervised Machine Learning
Technique for Forecasting Stock Market Trends. In Information and Knowledge in Internet of Things; Springer: Cham, Switzerland,
2022; pp. 143–162.
8. Zubair Asghar, M.; Subhan, F.; Imran, M.; Masud Kundi, F.; Khan, A.; Shamshirband, S.; Mosavi, A.; Varkonyi-Koczy, A.R.;
Csiba, P. Performance evaluation of supervised machine learning techniques for efficient detection of emotions from online
content. Comput. Mater. Contin. 2020, 63, 1093–1118. [CrossRef]
9. Khan, A.; Khattak, A.M.; Asghar, M.Z.; Naeem, M.; Din, A.U. Playing First-Person Perspective Games with Deep Reinforcement
Learning Using the State-of-the-Art Game-AI Research Platforms. In Deep Learning for Unmanned Systems; Springer: Cham,
Switzerland, 2021; pp. 635–667.
10. Ahmad, S.; Asghar, M.Z.; Alotaibi, F.M.; Khan, S. Classification of poetry text into the emotional states using deep learning
technique. IEEE Access 2020, 8, 73865–73878. [CrossRef]
11. Cil, A.E.; Yildiz, K.; Buldu, A. Detection of DDoS attacks with feed forward based deep neural network model. Expert Syst. Appl.
2021, 169, 114520. [CrossRef]
12. Cheng, J.; Liu, Y.; Tang, X.; Sheng, S.V.; Li, M.; Li, J. DDoS attack detection via multi-scale convolutional neural network. Comput.
Mater. Contin. 2020, 62, 1317–1333. [CrossRef]
13. Ahmad, S.; Asghar, M.Z.; Alotaibi, F.M.; Awan, I. Detection and classification of social media-based extremist affiliations using
sentiment analysis techniques. Hum. Centr. Comput. Inf. Sci. 2019, 9, 1–23.
14. Lima Filho, F.S.D.; Silveira, F.A.; de Medeiros Brito, A., Jr.; Vargas-Solar, G.; Silveira, L.F. Smart detection: An online approach for
DoS/DDoS attack detection using machine learning. Secur. Commun. Netw. 2019, 2019, 1574749. [CrossRef]
15. Sreeram, I.; Vuppala, V.P.K. HTTP flood attack detection in application layer using machine learning metrics and bio inspired bat
algorithm. Appl. Comput. Inform. 2019, 15, 59–66. [CrossRef]
16. Sahi, A.; Lai, D.; Li, Y.; Diykh, M. An efficient DDoS TCP flood attack detection and prevention system in a cloud environment.
IEEE Access 2017, 5, 6036–6048. [CrossRef]
17. Aborujilah, A.; Musa, S. Cloud-based DDoS HTTP attack detection using covariance matrix approach. J. Comput. Netw. Commun.
2017, 2017, 7674594. [CrossRef]
18. Fadlil, A.; Riadi, I.; Aji, S. Review of detection DDOS attack detection using naive bayes classifier for network forensics. Bull. Electr.
Eng. Inform. 2017, 6, 140–148. [CrossRef]
19. Dincalp, U.; Güzel, M.S.; Sevine, O.; Bostanci, E.; Askerzade, I. Anomaly based distributed denial of service attack detection and
prevention with machine learning. In Proceedings of the 2018 2nd International Symposium on Multidisciplinary Studies and
Innovative Technologies (ISMSIT), Ankara, Turkey, 19–21 October 2018; pp. 1–4.
20. Zhang, Y.L.; Li, L.; Zhou, J.; Li, X.; Zhou, Z.H. Anomaly detection with partially observed anomalies. In Proceedings of the
Companion Proceedings of the Web Conference, Lyon, France, 23–27 April 2018; pp. 639–646.
21. Wang, N.; Zhang, Z.; Zhao, X.; Miao, Q.; Ji, R.; Gao, Y. Exploring high-order correlations for industry anomaly detection.
IEEE Trans. Ind. Electron. 2019, 66, 9682–9691. [CrossRef]
22. Krupp, J.; Backes, M.; Rossow, C. Identifying the scan and attack infrastructures behind amplification DDoS attacks. In Proceedings of
the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 1426–1437.
23. Yuan, Z.; Lu, Y.; Wang, Z.; Xue, Y. Droid-sec: Deep learning in android malware detection. In Proceedings of the 2014 ACM
Conference on SIGCOMM, Chicago, IL, USA, 17–22 August 2014; pp. 371–372.
24. Su, X.; Zhang, D.; Li, W.; Zhao, K. A deep learning approach to android malware feature learning and detection. In Proceedings
of the 2016 IEEE Trustcom/BigDataSE/ISPA, Tianjin, China, 23–26 August 2016; pp. 244–251.
25. Li, Y.; Lu, Y. LSTM-BA: DDoS detection approach combining LSTM and Bayes. In Proceedings of the 2019 Seventh International
Conference on Advanced Cloud and Big Data (CBD), Suzhou, China, 21–22 September 2019; pp. 180–185.
26. Lin, P.; Ye, K.; Xu, C.Z. Dynamic network anomaly detection system by using deep learning techniques. In Proceedings of
the International Conference on Cloud Computing, San Diego, CA, USA, 25–30 June 2019; Springer: Cham, Switzerland, 2019;
pp. 161–176.
27. Li, Z.; Rios, A.L.G.; Xu, G.; Trajković, L. Machine learning techniques for classifying network anomalies and intrusions. In Proceed-
ings of the 2019 IEEE International Symposium on Circuits and Systems (ISCAS), Sapporo, Japan, 26–29 May 2019; pp. 1–5.
28. Kim, J.Y.; Cho, S.B. Obfuscated Malware Detection Using Deep Generative Model based on Global/Local Features. Comput. Secur.
2021, 112, 102501. [CrossRef]
29. Gomes, H.M.; Bifet, A.; Read, J.; Barddal, J.P.; Enembreck, F.; Pfahringer, B.; Holmes, G.; Abdessalem, T. Adaptive random forests
for evolving data stream classification. Mach. Learn. 2017, 106, 1469–1495. [CrossRef]
30. Ramírez-Gallego, S.; Krawczyk, B.; García, S.; Woźniak, M.; Herrera, F. A survey on data preprocessing for data stream mining:
Current status and future directions. Neurocomputing 2017, 239, 39–57. [CrossRef]
31. Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing realistic distributed denial of service (DDoS) attack dataset
and taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India,
1–3 October 2019; pp. 1–8.
32. Lashkari, A.H. CICFlowMeter. 2020. Available online: https://github.com/ISCX/CICFlowMeter (accessed on 8 November 2020).
33. Li, Y.; Yan, C.; Liu, W.; Li, M. A principle component analysis-based random forest with the potential nearest neighbor method
for automobile insurance fraud identification. Appl. Soft Comput. 2018, 70, 1000–1009. [CrossRef]
34. Brownlee, J. A Gentle Introduction to the Bag-of-Words Model. Available online: https://machinelearningmastery.com/gentle-
introduction-bag-words-model/ (accessed on 7 August 2019).
35. Vuong, T.H.; Thi, C.V.N.; Ha, Q.T. N-tier machine learning-based architecture for DDoS attack detection. In Proceedings of
the Asian Conference on Intelligent Information and Database Systems, Phuket, Thailand, 7–10 April 2021; Springer: Cham,
Switzerland, 2021; pp. 375–385.
36. Ikram, S.T.; Cherukuri, A.K. Intrusion detection model using fusion of chi-square feature selection and multi class SVM. J. King
Saud Univ. Comput. Inf. Sci. 2017, 29, 462–472.
37. Asghar, J.; Akbar, S.; Asghar, M.Z.; Ahmad, B.; Al-Rakhami, M.S.; Gumaei, A. Detection and Classification of Psychopathic
Personality Trait from Social Media Text Using Deep Learning Model. Comput. Math. Methods Med. 2021, 2021, 5512241. [CrossRef]
38. Khattak, A.; Asghar, M.Z.; Ishaq, Z.; Bangyal, W.H.; Hameed, I.A. Enhanced concept-level sentiment analysis system with
expanded ontological relations for efficient classification of user reviews. Egypt. Inform. J. 2021, in press. [CrossRef]
39. Ullah, H.; Ahmad, B.; Sana, I.; Sattar, A.; Khan, A.; Akbar, S.; Asghar, M.Z. Comparative study for machine learning classifier
recommendation to predict political affiliation based on online reviews. CAAI Trans. Intell. Technol. 2021, 6, 251–264. [CrossRef]