Research Into the Security Threat of Web Application
Abstract
To analyze the security threats facing web applications effectively, a
security threat model of web applications is established. First, the main
problems with web applications are summarized. Second, the main security
threats to web applications are analyzed, and a corresponding optimization
model is constructed. An algorithm for solving the security threat
optimization model is designed on the basis of an improved Butterfly
Optimization Algorithm (IBOA). Finally, a SQL injection vulnerability is
selected as a case study, and the security threat path of the web application
is obtained. The results show that IBOA offers high optimization accuracy,
global optimization and stable solutions. The average accuracy rate is 99.1%
and the average recall rate is 99.1%, which indicates that the model
classifies well and that IBOA performs better than the algorithms it was
compared with.
1 Introduction
With the acceleration of China’s economic development and the spread of
information technology through society, the Internet has become an
indispensable part of people’s work and life. To adapt to social development,
establish a good image, expand their social influence and improve work
efficiency, more and more government agencies, banks, enterprises and
institutions have established their own portal websites. However, websites
usually operate in a relatively open environment such as the Internet, and the
complexity and diversity of web application systems introduce a large number
of vulnerabilities. Trojan horses, viruses and other malicious code are
rampant on the Internet. Hackers intrude into and tamper with websites from
time to time, and some acts of website tampering have even escalated into
political events, seriously endangering national security and people’s
interests. Network security threats are therefore becoming more and more
serious, and it is necessary to consider how to protect network security, how
to ensure the security of web applications and how to provide users with fast
and stable services [1, 2]. Many factors affect network security, mainly
viruses, the network system itself, the limitations of firewalls and users’
daily operations.
A web application is a program that can be accessed through the web.
Its biggest advantage is ease of access: users need only a browser and do not
have to install other software. Compared with traditional software
engineering practice, web-based systems pay more attention to users, and their
interactive behaviors make web applications difficult to model. Therefore, to
ensure the correctness and security of the system, web application
development needs system modeling. System security here refers specifically to
the correctness, mutual exclusion and deadlock freedom of the system, that is,
the property that “bad things will never happen”. Because a web application
interacts with the browser, it is very important to ensure its reliability and
security through verification and validation techniques [3–5]. With the web
browser cache, users can interact not only with the web page but also with the
web browser itself, through browser buttons such as back, forward and refresh
or by rewriting the URL. They can also interact with web databases through the
browser. The user is not prompted if the page loses its position. User
operations on the browser and on the web page affect the construction of the
whole web page navigation model. This interaction behavior may have a negative
impact on the security
be provided to the public. Second, the server puts the data that should be
private into the area of public access, resulting in the disclosure of sensitive
information. Third, the server trusts data from untrusted data sources
and is attacked as a result [14].
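The third problem, trusting data from untrusted sources, is what makes SQL
injection possible. The following is a minimal sketch of the difference
between pasting input into the SQL text and binding it as a parameter. It
assumes Python's standard sqlite3 module and a hypothetical in-memory table
named users; neither comes from the paper.

```python
import sqlite3

# Hypothetical in-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

untrusted = "nobody' OR '1'='1"   # attacker-controlled input

# Unsafe: the input is pasted into the SQL text and changes its meaning.
unsafe_sql = "SELECT secret FROM users WHERE name = '" + untrusted + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: the input is bound as a parameter and is only ever treated as data.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (untrusted,)
).fetchall()
```

In this sketch the concatenated query matches every row, while the
parameterized query matches none, because no user is literally named
"nobody' OR '1'='1".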
Many web server administrators have never looked at their servers from
another perspective and have never checked them for security risks, for
example by running a port scanner for system risk analysis. Had they done so,
they would not run so many services that either do not need to run on
machines that officially provide web services or do not need to be open to
the public. In addition, they have not modified the banner information of the
applications that provide services externally, so an attacker can easily
obtain the version information of the applications offered by the web server
and then find a matching attack method and attack program. Many web
applications are vulnerable to attacks through the server, the program and
its internal code. Such attacks directly bypass the security measures of the
surrounding firewall. There are some problems in web application security, such
as unvalidated input, broken access control, broken authentication and
session management, buffer overflow, injection attacks, improper error
handling, denial of service attacks, insecure configuration management and so
on. The four main attacks are as follows:
to simulate illegal operations within the enterprise and external attacks
launched with no knowledge of the enterprise’s internal status. However, the
effect of this security testing method depends too heavily on the capability
of the tools and technicians. The website is usually penetrated in separate,
isolated steps spread across the enterprise, which cannot fully simulate the
process of a hacker attack. A vulnerability may have preconditions that must
be met before it can be found, so this “separate” approach tends to assess the
vulnerability threat one-sidedly and can hide the existence of
vulnerabilities. Therefore, we need to model the various types of web
vulnerabilities to find the greatest threat to the website [15].
The attack graph serves in penetration testing as a model of attack
conditions, processes and results. Following the basic idea of the attack
graph, this paper designs the Vulnerability Threats Testing Model (VTTM) and
finds the biggest threat to the website comprehensively and accurately by
solving the model. The attack graph takes the form of a directed graph. Each
vertex represents a security status of the website, and each edge represents
a conversion from one status to another, subject to various conditions. The
VTTM model of the website’s web vulnerability testing process is represented
by a weighted directed graph. Generally, the steps for establishing the model
are: establish the status conversion diagrams → combine → optimize →
determine the optimal function and constraints.
The VTTM model has five basic elements: the website security status S,
the initial status S0, the target status St, the security status conversion γ
and the value cost ratio σ. The VTTM model can be expressed by the quintet
(S0, S, St, γ, σ). Its basic idea is to apply a large number of test samples
to each security status and to perform a status conversion once the
conversion condition is met. If the target status can be reached in the end,
the website has vulnerabilities corresponding to S0 and St.
Each status in S contains the following elements:
(1) User permission: current user level ID. It is usually represented by
enumeration, including guest (anonymous), IUSR-xxxx (Internet access
user), normal (public), administrator, etc.
(2) Target: specify the target to be tested, which can be single or multiple
servers. Status conversion does not involve target migration. This ele-
ment can be omitted.
(3) Vulnerability: exploitable vulnerability obtained through status
conversion.
1714 Y. Zhang and T. Zhang
(4) Ability: the harm that can be done to the website or the attack that can be
carried out under the current status, such as session spoofing, password
eavesdropping, Trojan horse implantation and SQL injection.
γij denotes the conversion of the website security status from Si to Sj after
a test sample is applied to Si. Such a conversion is expressed by <Si, Sj>,
where Si is the test condition and Sj is the test result. Test cases should
be atomic, that is, each is the smallest test unit in the test chain and
obtains a single effect by a single means. In the VTTM model, γij is a
directed edge, and a weight can be assigned to each edge. Here, the weight is
expressed by the pair (C, V), where C is the cost of implementing the test
sample, a composite of the execution time, the likelihood of being recorded
by an IDS or by IIS, and the system resources consumed, and V is the value
obtained by implementing the test sample. Both elements are set according
to the tester’s own needs and experience.
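The status elements and the weighted conversion described above can be
captured in a small data model. The following is only an illustrative sketch:
the class names Permission, Status and Conversion, the field names and the
sample values are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Permission(Enum):
    """User permission levels named in element (1)."""
    GUEST = "guest"
    IUSR = "IUSR"
    NORMAL = "normal"
    ADMINISTRATOR = "administrator"


@dataclass(frozen=True)
class Status:
    """One website security status Si with the four listed elements."""
    permission: Permission
    vulnerability: str
    ability: str
    target: str = ""   # element (2) is optional and may be omitted


@dataclass(frozen=True)
class Conversion:
    """Directed edge <Si, Sj> weighted by the pair (C, V)."""
    source: Status
    result: Status
    cost: float    # C: cost of implementing the test sample
    value: float   # V: value obtained by implementing the test sample

    @property
    def ratio(self) -> float:
        # Value cost ratio of this single conversion.
        return self.value / self.cost


# Hypothetical sample values chosen only to exercise the classes.
s0 = Status(Permission.GUEST, "none", "browse public pages")
s1 = Status(Permission.NORMAL, "SQL injection", "read database tables")
edge = Conversion(s0, s1, cost=2.0, value=5.0)
```

Freezing the dataclasses makes statuses hashable, so they can be used
directly as graph vertices or dictionary keys.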
The process of vulnerability testing can be regarded as a status conversion
process TP (called the test path in this paper) based on test conditions,
which is represented by a linear status sequence [16]:
\[ TP = \{ S_0, S_1, \ldots, S_n, \ldots, S_m \mid S_i \in S,\ S_m = S_t,\ 0 \le i \le |S| \} \tag{1} \]
The key to reaching the target status in VTTM is to design, reasonably and
thoroughly, test samples that cover all the function modules of the
website program.
Once the status conversion diagram is established, traverse it to obtain all
test paths from the initial status to the target status. The usual choices
are depth-first and breadth-first traversal; the VTTM model uses depth-first
traversal. During the traversal, each path from S0 to St is a linear status
sequence. Then determine the optimal sequence according to the tester’s
needs; exploiting the vulnerabilities in that sequence achieves the maximum
benefit. The optimality criterion is determined by the needs of the tester.
For example, if only the value is considered and the cost is ignored, the
optimal function is:
\[ \max \sum v_{ij} \tag{2} \]
If the value is not considered and only the cost is considered, the optimal
function is:
\[ \min \sum c_{ij} \tag{3} \]
If both are considered, the value cost ratio Pij = Vij / Cij can be
introduced, and the optimal function is:
\[ \max \sum P_{ij} \tag{4} \]
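The traversal and the three optimality criteria can be sketched as follows.
This is an illustration under stated assumptions, not the authors'
implementation: the graph is a hypothetical dictionary in which graph[u][v]
holds the pair (cost, value) of the conversion <u, v>, the node names and
weights are invented, and the ratio objective is maximized, in line with the
"max" used for d(i, j, k) in the example analysis.

```python
from typing import Dict, List, Tuple

# Hypothetical graph type: graph[u][v] = (cost, value) of conversion <u, v>.
Graph = Dict[str, Dict[str, Tuple[float, float]]]


def all_test_paths(graph: Graph, start: str, target: str) -> List[List[str]]:
    """Depth-first enumeration of every loop-free path from start to target."""
    paths: List[List[str]] = []

    def dfs(node: str, path: List[str]) -> None:
        if node == target:
            paths.append(list(path))
            return
        for nxt in graph.get(node, {}):
            if nxt not in path:      # never revisit a status on this path
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(start, [start])
    return paths


def path_score(graph: Graph, path: List[str], objective: str) -> float:
    """Sum the chosen quantity over the edges of one test path."""
    edges = [graph[u][v] for u, v in zip(path, path[1:])]
    if objective == "value":
        return sum(v for _, v in edges)
    if objective == "cost":
        return sum(c for c, _ in edges)
    if objective == "ratio":
        return sum(v / c for c, v in edges)
    raise ValueError(f"unknown objective: {objective}")


def best_path(graph: Graph, start: str, target: str, objective: str) -> List[str]:
    """Pick the path that minimizes cost or maximizes value or ratio."""
    paths = all_test_paths(graph, start, target)
    choose = min if objective == "cost" else max
    return choose(paths, key=lambda p: path_score(graph, p, objective))


# Invented example weights, given as (cost, value).
graph: Graph = {
    "S0": {"S1": (1.0, 2.0), "S2": (4.0, 6.0)},
    "S1": {"St": (2.0, 3.0)},
    "S2": {"St": (1.0, 5.0)},
}
```

With these invented weights, the cheapest path runs through S1, while both
the value and the ratio objectives select the path through S2.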
The conversion conditions of each status in the model can be regarded as
constraints, which are mainly reflected in the “vulnerability” and “ability”
elements of the test condition status.
When the VTTM model of web vulnerabilities is established, all types of
vulnerability threats need to be considered. For each type, determine the
initial status and the target status and create its own sub status conversion
diagram; then merge the sub-diagrams into a global status conversion diagram.
To reduce the complexity of calculating the optimal test sequence, the
global status conversion diagram must be optimized. Optimization follows
the following principles:
(1) When merging status nodes, each element of the nodes must be
consistent. Otherwise, they cannot be merged.
(2) Status nodes with a conditional dependency, i.e., two nodes
Si and Sj that can be expressed by the ordered pair <Si, Sj>, cannot be
merged.
(3) According to the actual situation of the website, check whether the
conversion conditions of each status are true. If not, remove the
conversion.
(4) Add necessary paths between different sub status conversion diagrams,
because the test process should be comprehensive and dynamic and is not
confined to a specific chain. Testers may find the conditions for
one vulnerability through other defects of the web program, so that a
security status conversion can take place once its preconditions are met.
(5) If there is a loop in the optimized status conversion diagram, a status
conversion in the loop may not be atomic, so it is necessary to re-analyze
it and add new status nodes to avoid generating a loop.
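Principles (1), (2) and (5) lend themselves to a short sketch. The code below
is a hedged illustration only: merge_nodes and has_loop are hypothetical
helper names, each node's elements are reduced to a single comparable label,
and the sample diagram is invented.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

Edges = Set[Tuple[str, str]]


def merge_nodes(labels: Dict[str, Hashable], edges: Edges):
    """Merge nodes whose elements are identical (principle 1) unless an
    ordered pair <Si, Sj> links them (principle 2).
    Returns (node -> representative mapping, merged edge set)."""
    representative: Dict[str, str] = {}
    for node in sorted(labels):
        for rep in sorted(set(representative.values())):
            same = labels[rep] == labels[node]
            # A dependency with any member of the group blocks the merge.
            dependent = any(
                (a, node) in edges or (node, a) in edges
                for a, r in representative.items() if r == rep
            )
            if same and not dependent:
                representative[node] = rep
                break
        else:
            representative[node] = node   # start a new group
    merged = {(representative[a], representative[b]) for a, b in edges}
    return representative, merged


def has_loop(nodes: Iterable[str], edges: Edges) -> bool:
    """Principle 5: detect a cycle with a three-colour depth-first search."""
    nodes = list(nodes)
    succ = {n: [] for n in nodes}
    for a, b in edges:
        succ[a].append(b)
    state = {n: 0 for n in nodes}   # 0 = unseen, 1 = on stack, 2 = done

    def visit(n: str) -> bool:
        state[n] = 1
        for m in succ[n]:
            if state[m] == 1 or (state[m] == 0 and visit(m)):
                return True
        state[n] = 2
        return False

    return any(state[n] == 0 and visit(n) for n in nodes)


# Two invented sub-diagrams with identical statuses at each level.
labels = {
    "A1": ("guest", "sqli"), "A2": ("admin", "shell"),
    "B1": ("guest", "sqli"), "B2": ("admin", "shell"),
}
edges = {("A1", "A2"), ("B1", "B2")}
mapping, merged = merge_nodes(labels, edges)
```

In this invented example the two sub-diagrams collapse into the single edge
from A1 to A2, and the merged diagram contains no loop.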
In the formula, X_i^t, X_j^t and X_k^t are the solution vectors of
butterflies i, j and k at generation t, where X_j^t and X_k^t are the
position vectors of two individuals randomly selected from the population of
generation t.
Step 6: Judge whether the algorithm meets the end conditions. If so,
record the current optimal solution and its target value. If not, update the
sensory modality c of the fragrance concentration according to Equation (8)
and return to step 2 to continue with the next round of iteration.
\[ c(t+1) = c(t) + \frac{0.025}{c(t) \cdot N_{iter}} \tag{8} \]
In the formula, t is the current iteration number, and Niter is the maximum
number of iterations.
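The update in Equation (8) is easy to check numerically. In the minimal
sketch below, the function name update_sensory_modality is hypothetical and
the starting value c = 0.01 is an assumption that the text does not state;
Niter = 50 is the value used in the experiment settings.

```python
def update_sensory_modality(c: float, n_iter: int) -> float:
    """Equation (8): c(t+1) = c(t) + 0.025 / (c(t) * N_iter)."""
    return c + 0.025 / (c * n_iter)


c = 0.01       # assumed starting value, not given in the text
n_iter = 50    # N_iter as used in the experiment settings
history = [c]
for _ in range(3):
    c = update_sensory_modality(c, n_iter)
    history.append(c)
```

Because the increment is inversely proportional to c(t), the value grows
quickly at first and more slowly as c increases.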
In iterative evolutionary algorithms, an inertia weight can be used to adjust
and control the global exploration and local exploitation abilities of the
algorithm. To address the slow convergence and low optimization accuracy of
the basic butterfly algorithm on complex functions, this paper introduces an
adaptive inertia weight in the global search stage, which decreases
nonlinearly as the number of generations increases. The inertia weight
function is proposed as follows:
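\[ \omega = \frac{1}{(t+1)^{(\alpha+\beta)^{3}}} \cdot \left( \frac{\alpha \cdot t^{\alpha}}{N_{iter}^{\alpha}} + \frac{\alpha \cdot t^{\beta}}{N_{iter}^{\beta}} \right) \tag{9} \]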
In formula (11), X_j^t and X_k^t are random solutions within the current
population. ε is the mutation factor, calculated as shown in formula (12),
where ε0 is the initial mutation factor. A large number of tests show that
the algorithm improves most when ε0 is 0.1. fpdf() is a random number that
obeys the F distribution. The F distribution function has an asymmetric peak
shape: it first increases rapidly and then decreases slowly, which helps
butterfly individuals converge adaptively to the position of the global
optimal solution as the number of iterations increases. The formula combining
the number of iterations with the F distribution is as follows [22]:
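\[ F = \frac{\Gamma\left(\frac{\mu+\lambda}{2}\right)}{\Gamma\left(\frac{\mu}{2}\right)\Gamma\left(\frac{\lambda}{2}\right)} \cdot \frac{\mu}{\lambda} \cdot \left[ \frac{\mu}{\lambda} \cdot \left( 1 + \frac{t}{N_{iter}} \right) \right]^{\left( \frac{\mu+1}{2} - 1 \right)} \cdot \left[ 1 + \frac{\mu}{\lambda} \cdot \left( 1 + \frac{t}{N_{iter}} \right) \right]^{\left( -\frac{\mu+1}{2} \right)} \tag{13} \]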
In the formula, λ and µ are the degrees of freedom with non-
interchangeable positions. After repeated tests, when λ = 3 and µ = 4,
the algorithm has the strongest ability in global search and the fastest
convergence speed. The global search formula is as follows:
the new solution move towards the optimal solution and accelerate the
convergence speed. A large number of experiments verify that when
ξ = 0.9, the algorithm has the strongest local search ability and escapes
local extrema most easily. η is the disturbance factor. It prevents the
algorithm from falling into a local extremum because of the introduction of
the current optimal value, further improves the optimization performance of
the algorithm and helps find the global optimal solution. The calculation
formula of η is as follows:
5 Example Analysis
To validate the effectiveness of the constructed web application threat
model, a VTTM model is established with the SQL injection vulnerability as
the research object. The target status is to obtain the webshell of the
website. The five elements of the model are described above, and the
corresponding status conversion diagram is illustrated in Figure 1.
d(i, j, k) refers to the max ΣPij corresponding to the path from node
Si to node Sj on which no intermediate node is numbered higher than Sk.
Thus, it can be defined that:
\[ d(i, j, k) = \begin{cases} 0, & i = j,\ k = 0 \\ -\infty, & \langle s_i, s_j \rangle \notin \gamma,\ k = 0 \\ \max\{ d(i, j, k-1),\ d(i, k, k-1) + d(k, j, k-1) \}, & k \ge 1 \end{cases} \tag{18} \]
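A recurrence of this kind can be evaluated with a Floyd-Warshall-style
dynamic program. The sketch below rests on several assumptions that are not
in the text: nodes are numbered 0 to n-1, an existing edge <Si, Sj> starts
with d equal to its ratio Pij, the status conversion diagram contains no
loops, and max_ratio_paths is a hypothetical function name. The edge ratios
in the example are invented.

```python
import math
from typing import Dict, List, Tuple

NEG_INF = -math.inf


def max_ratio_paths(n: int, ratio: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Largest summed P_ij between every pair of nodes 0..n-1.
    Valid only when the status conversion diagram has no loops."""
    d = [[NEG_INF] * n for _ in range(n)]
    for i in range(n):
        d[i][i] = 0.0                 # staying at a node gains nothing
    for (i, j), p in ratio.items():
        d[i][j] = max(d[i][j], p)     # assumed base case for existing edges
    for k in range(n):                # allow node k as an intermediate node
        for i in range(n):
            for j in range(n):
                via_k = d[i][k] + d[k][j]
                if via_k > d[i][j]:
                    d[i][j] = via_k
    return d


# Invented ratios P_ij on a four-node diagram; node 3 is the target status.
ratio = {(0, 1): 2.0, (1, 3): 1.5, (0, 2): 1.5, (2, 3): 5.0}
d = max_ratio_paths(4, ratio)
```

Unreachable pairs keep the value negative infinity, so an entry of d that is
finite also confirms that a test path exists between the two statuses.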
A total of 294,506 samples were used in the experiment, including XSS, SQL
injection, LFI, command execution, directory traversal, RFI and normal
traffic sample data. The data distribution is shown in Figure 2.
The parameters of IBOA are set as follows: N = 45, dim = 10/50/1000 and
Niter = 50. To verify the effectiveness of the algorithm, the same model is
also solved with PSA and GA. The analysis results are shown in Table 1.
It can be seen from Table 1 that for different dimensions, IBOA obtains
the largest optimal value among the three algorithms, so IBOA can obtain
the optimal path. As the dimension decreases, the optimal values obtained by
different methods tend to converge. Therefore, IBOA has the advantages of
high optimization accuracy, global optimization and stable solution.
6 Conclusion
Anomaly detection for web applications is a new direction of anomaly
detection, and research on it is still in its infancy. This paper discusses a
web-based vulnerability threat modeling method. Following the basic idea of
the attack graph, the VTTM model is established with the SQL injection attack
as an example. A dynamic programming algorithm is used to solve the model,
and a security evaluation is carried out. Like traditional anomaly detection,
the approach still suffers from an excessively high false alarm rate and
faces challenges such as making effective use of payload information,
protecting users’ privacy and coping with data encryption. This research
constructs an improved BOA algorithm that strengthens the ability of
individuals to jump out of local optimal solutions, speeds up convergence,
improves optimization accuracy, keeps good optimization stability when the
dimension changes, and improves the optimization level of the web application
threat model. A performance comparison between IBOA, PSA and GA verifies the
better performance of IBOA.
References
[1] Zohreh S. Gatmiry, Ashkan Hafezalkotob, Morteza Khakzar bafruei,
Roya Soltani, Food web conservation vs. strategic threats: A security
game approach, Ecological Modelling, 442, 2021, 109426.
[2] Simon Applebaum, Tarek Gaber, Ali Ahme, Signature-based and
Machine-Learning-based Web Application Firewalls: A Short Survey,
Procedia Computer Science, 189, 2021, 359–367.
[3] Giuseppe Cascavilla, Damian A. Tamburri, Willem-Jan Van Den
Heuvel, Cybercrime threat intelligence: A systematic multi-vocal
literature review, Computers & Security, 105, 2021, 102258.
[4] Andrea Tundis, Samuel Ruppert, Max Mühlhäuser, A Feature-driven
Method for Automating the Assessment of OSINT Cyber Threat
Sources, Computers & Security, 113, 2022, 102576.
[5] Massimiliano Rak, Giovanni Salzillo, Daniele Granata, ESSecA: An
automated expert system for threat modelling and penetration testing
for IoT ecosystems, Computers and Electrical Engineering, 99, 2022,
107721.
[6] Bin Zhao, Yi Ren, Diankui Gao, Lizhi Xu, Yuanyuan Zhang, Energy
utilization efficiency evaluation model of refining unit Based on Contourlet
neural network optimized by improved grey optimization algorithm,
Energy, 185, 2019, 1032–1044.
[7] Adem Tekerek, A novel architecture for web-based attack detection
using convolutional neural network, Computers & Security, 100, 2021,
102096.
[8] Stefano Calzavara, Hugo Jonker, Benjamin Krumnow, Alvise Rabitti,
Measuring Web Session Security at Scale, Computers & Security, 111,
2021, 102472.
[9] Adem Tekerek, A novel architecture for web-based attack detection
using convolutional neural network, Computers & Security, 100, 2021,
102096.
[10] Bin Zhao, Yi Ren, Diankui Gao, Lizhi Xu, Performance ratio prediction
of photovoltaic pumping system based on grey clustering and second
curvelet neural network, Energy, 171, 2019, 360–371.
[11] Göksel Uçtu, Mustafa Alkan, İbrahim Alper Doğru, Murat Dörterler,
A suggested testbed to evaluate multicast network and threat prevention
performance of Next Generation Firewalls, Future Generation Computer
Systems, 124, 2021, 56–67.
[12] Waleed Bin Shahi, Baber Aslam Haider, Abbas Hammad AfzalSaad,
Bin Khalid, A deep learning assisted personalized deception system for
Biographies
Yanling Zhang was born in Jiaozuo City, Henan Province, China. She
obtained a master’s degree in computer application technology from the
Information Engineering University of the People’s Liberation Army in 2006.
She is currently an associate professor at the School of Information
Engineering, Jiaozuo University, Henan Province. Her research interests
include the applied research of big data, machine learning and other
technologies in various fields of society. Since 2017, she has presided over
or participated in 4 provincial and ministerial projects, presided over the
completion of 16 municipal and departmental projects, and participated in 10
projects. She has received funding many times, published 1 academic
monograph, completed 10 utility model patents as the first inventor and 2
invention patents, published 10 papers and received 13 achievement awards.
Ting Zhang was born in Jiaozuo City, Henan Province, China. She received
the B.S. degree in computer science and technology from Anyang Normal
University in 2015 and the M.S. degree in computer application technology
from Kunming University of Science and Technology in 2018. She is currently
an assistant at the School of Information Engineering, Jiaozuo University,
China. Her research interests include data mining and cloud computing.
She has published one paper, co-authored one published SCI paper,
participated in three invention patents and co-authored two core papers.