
IAES International Journal of Robotics and Automation (IJRA)

Vol. 14, No. 1, March 2025, pp. 93~102


ISSN: 2722-2586, DOI: 10.11591/ijra.v14i1.pp93-102

Performance comparison of optical flow and background subtraction and discrete wavelet transform methods for moving objects

Monika Sharma1, Kuldeep Singh Kaswan1, Dileep Kumar Yadav2

1 Department of Computer Science and Engineering, Galgotia’s University, Greater Noida, India
2 Department of Computer Science and Engineering, SCSET, Bennett University, Greater Noida, India

Article Info

Article history:
Received Jul 19, 2024
Revised Oct 27, 2024
Accepted Nov 19, 2024

Keywords:
Discrete wavelet transform
Image subtraction
Moving objects
Optical modeling
Signal detection

ABSTRACT

Self-driving cars and other autonomous vehicles rely on systems that can recognize and follow objects. These capabilities support safe decision-making and navigation by detecting entities such as people, cars, obstacles, and traffic lights. Computer vision algorithms encompass both object detection and tracking. Different methods are specifically developed for image or video analysis, not only to identify items within the visual content but also to accurately determine their precise locations. Detection can operate independently as an algorithm or as a constituent of an object-tracking system. Object tracking algorithms, providing a contrasting approach, follow objects over video frames. The research article focuses on the mathematical model simulation of optical flow, background subtraction, and discrete wavelet transform (DWT) methods for moving objects. The performance evaluation of the methods is done based on simulation response time, accuracy, sensitivity, and specificity for several images in different environments. The DWT has shown optimal behavior, with a response time of 0.27 seconds, accuracy of 95.34%, sensitivity of 95.96%, and specificity of 94.68%.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Monika Sharma
Department of Computer Science and Engineering, Galgotia’s University
Greater Noida, NCR, India
Email: [email protected]

1. INTRODUCTION
Object detection is a technique used in computer vision to identify and locate items in both video
and still images. Object detection algorithms typically rely on machine learning or deep learning techniques
to obtain meaningful findings. Humans can quickly identify and find objects of interest when they view
visual material [1]. Object detection seeks to replicate this level of cognitive ability in a computational
framework. Various disciplines are currently allocating resources to the investigation of automated video
surveillance. Advancements in modern technology have reached a stage where it is economically
advantageous to install cameras in a certain location and capture video footage, rather than employing
individuals to constantly examine the recorded footage [2]. Numerous enterprises have already installed
security cameras, capable of capturing footage that can be stored on tape, subject to being overwritten or
stored in a video archive. Subsequently, detectives can scrutinize the recorded video material to ascertain the
sequence of events in the occurrence of a criminal act [3], such as a robbery in a store or the pilfering of a
valuable automobile. By then, however, it is evidently too late for prevention or intervention. To
mitigate the occurrence of these situations, we can implement continuous monitoring and analysis of video surveillance systems. In this manner, if security agents identify an ongoing robbery or someone exhibiting
suspicious behavior in the parking lot, they can promptly intervene to avert criminal activity.
Video-based surveillance systems [4] allow for the monitoring of many scenes. Video streams can
be utilized to extract information that captures our attention in various applications, such as security,
entertainment, safety, and efficiency enhancement. Video surveillance is also utilized in the field of event recognition. Recognizing events from an area of interest has numerous possible applications, including but
not limited to traffic analysis [5], tracking limited vehicle movements, and analyzing multi-object interaction.
Compared to the need for continuous human supervision, it helps solve several problems. The first crucial
step in this approach is to determine whether video samples include motion. The approach must not only be
free from noise, but it must also segment the video stream to isolate the moving objects. The
presence of rapid variations in light intensity, such as those caused by a light switch, poses a substantial
challenge for detecting moving objects. If the algorithm fails to cope with variations in lighting and camera
movement, it will result in the inclusion of background noise in the final output [6]. The problem would be
worsened by dynamic backgrounds, which would enable objects to move around. Weather variations and
swaying trees may create inaccurate results during the detection step. Alterations in scenery introduce an additional level of difficulty: a moving item may momentarily halt and gradually blend into the surrounding environment. A motion detection system should
possess the capability to effectively navigate through these various hurdles [7].
The video surveillance system commences [8] with the detection of motion and objects. Motion
detection involves the process of separating the areas of an image that contain moving objects from the rest
of the image. Background modeling and motion segmentation are commonly employed in the task of
detecting motion and identifying objects. In an image sequence, the objective of motion segmentation is to
identify the sections or areas that correspond to moving objects, such as automobiles, birds, humans, animals,
and so on [9]. When motion is identified in a specific area or region, it is necessary to study these detected
regions for further procedures such as object tracking and behavior analysis. Following the process of motion
and object identification, the video surveillance system typically traces the movement of objects from one
frame to the next in a sequence of images. Behavior analysis entails the examination and identification of
motion patterns, the description of actions, and the relationships between objects.

2. RELATED WORK
Automated cars must be able to access accurate, real-time data on the state of objects in their
immediate surroundings if we are to guarantee safe driving. Object occlusion, clutter interference, and a
limited sensor-detecting capability produce false alarms and missed object detection [10]. Thus, it is difficult
to guarantee tracking stability and state prediction in complex traffic conditions. Background subtraction [11]
requires a training sequence devoid of objects to construct a background model, in contrast to object
detectors, which require instances that have been explicitly tagged to train a binary classifier. An important
step toward analytical automation is object recognition without a distinct training phase. Attempts to solve
this problem by analyzing motion data have been made. A popular method for detecting moving objects is
discriminative modeling (DM), which seeks to improve performance in foreground-background separation
using discriminative features and well-designed classifiers [12]. Because class separability is typically poor
in camouflaged locations, DM may fail when confronted with the camouflage problem. To detect foreground
pixels that have been camouflaged, we present a novel approach in this work: camouflage modeling (CM).
Because of the two-part nature of camouflage, we must represent both the foreground and the backdrop.
An innovative framework that incorporates information about color and texture has been developed
for backdrop modeling [13]. The foreground choice equation in this framework is composed of three
components: the left section is for the integration of the two parts, the right portion is for the information
about the texture, and the third part is for the information about the color. The use of this structure allows you
to take advantage of the power of color and texture while avoiding the downsides associated with them. To
accelerate the modeling of the background even more, we recommend using a block-based technique. To be
more specific, texture information modeling is distinct from the traditional multi-histogram model for block-
based background modeling in that it creates a single histogram model for each block. This model contains
bins that indicate the occurrence probabilities of various patterns. Based on this process, the dominant
background patterns are selected to determine the background likelihood of upcoming blocks. A novel
method based on fuzzy color difference histogram (FCDH) has been suggested to incorporate fuzzy c-means
(FCM) clustering [14]. The utilization of the FCM clustering technique in CDH mitigates the impact of
intensity variation resulting from fake motion or changes in background illumination, while also reducing the
substantial complexity of the computation's histogram bins. The suggested approach was tested using various
publicly accessible video sequences featuring complex scenarios. Another method is based on extracting moving objects from a frame sequence; hence, neither human interaction in the form of empirical threshold tuning nor the background modeling on which other systems are built is necessary [15]. The suggested approach allows moving objects to be extracted without using either of these. The saliency map of
the current frame with complete resolution is created by use of the constant symmetric difference between the
frames adjacent to the present frame. Saliency variables on this map help to highlight moving items while
also hiding the backdrop.
An image descriptor and nonlinear classification technique for optical flow orientation and a
histogram-based method have been used to characterize motion information in each video frame [16]. The
nonlinear one-class support vector machine classification approach initially learns from training frame
behavior to identify unusual events in the current frame. The optical flow approach begins with a Gaussian
filter to remove noise from each frame [17]. Next, it calculates the optical flow between the current frame and the previous frame, and between the current frame and the forthcoming frame. Merging the two optical flow constituents yields the gross optical flow. An adaptive thresholding post-processing phase removes distracting foreground components. Morphological techniques are then applied to the equalized output to locate moving items. The
methodology was implemented, deployed, and evaluated on numerous authentic video datasets [18]. The 2D
discrete wavelet transform (DWT) and variance approach were used for object detection and tracking [19].
An examination of the proposed variance-based method for object detection and localization in comparison
to the widely utilized mean-shift method reveals that the latter is slower, leading to slower item detection
overall. To wrap things up, this analysis helps detect and track moving objects by using only the bandpass
components of the 2D-DWT outputs. The Daubechies complex wavelet transform is well-suited for tracking
because of its approximate shift-invariance property. The recommended method can perform object
segmentation from scenes [20]. Following the initial segmentation of the first frame, achieved through the
computation of multiscale correlation of the imaginary component of complex wavelet coefficients, the
subsequent frames track the object by calculating the energy of the complex wavelet coefficients assigned to
the object's region and comparing it to the energy of the surrounding region. The research gap is in the
identification of suitable methods for specific object detection problems. Optical flow provides the most
accurate and detailed motion data, but it is also the most computationally expensive. Background subtraction
usually works well when used in real-time scenarios with well-maintained backdrop models. To ensure its
efficacy in motion detection, additional processing may be necessary after using the DWT, which provides a
unique type of information.

3. METHODS
Identifying and tracking objects in motion in images or videos is a fundamental task in the field of computer vision. This task has a wide range of applications, including surveillance, autonomous driving, and human-computer interaction. Various methodologies and strategies are employed for the detection of moving objects; several frequently employed techniques are described below. The common steps for object detection are given in Figure 1.

Input Image → Object Detection → Object Recognition → Image Classification → Object Localization

Figure 1. Steps in object detection in image and video

Computer vision can detect objects in video or still images. Preprocessing the image before feeding
it to an object detection model is possible. Scaling or normalizing pixel values may be needed to meet model
input requirements. Mathematical models help object detection systems extract features, learning hierarchical characteristics from images to distinguish objects. Localizing objects in an image is as crucial as categorizing them for object detection. Predicting bounding boxes that tightly contain items of interest is common. The model classifies all observable elements after object localization. Post-processing is done after categorization to refine the results.

3.1. Background subtraction


One of the most used and time-tested methods for finding moving objects in videos or image sequences is background subtraction. Separating the moving foreground items from the still background is the fundamental principle of background subtraction [21]. A static background model presupposes that the background changes only slowly, if at all; for example, a static view from a surveillance camera showing a deserted corridor. A dynamic background model accounts for changes to the backdrop over time, such as in natural settings where the sun, clouds, and shadows create varying degrees of illumination. In this type of model, each pixel in the background is represented by a statistical model. These models can be codebook models or non-parametric models. The background model is initialized using the initial frames of the video sequence. For dynamic backgrounds to adjust to small but noticeable changes in the scene, the backdrop model is refreshed periodically. To find areas or pixels that are drastically different from the backdrop, the background model is compared with each new frame. Tuning the background model settings is crucial for optimal performance in diverse scenarios; examples of these parameters are the threshold for foreground detection and the learning rate for model updates. Real-time processing of high-resolution video feeds can be difficult due to computationally expensive background subtraction procedures. Multi-modal approaches use depth and color data to improve background models. A common method for detecting moving objects using background subtraction is to compare each pixel in the current frame of the video sequence with a model of the background. The fundamental ideas and equations of background subtraction are presented here. The background model $I_b(x, y, t)$ for pixel $(x, y)$ at time $t$ is initialized using the initial frames of the video sequence. This can be done with simple averaging as well as more advanced methods like Gaussian mixture models (GMM).

$$I_b(x, y, t) = (1 - k)\, I_b(x, y, t-1) + k\, I_c(x, y, t) \tag{1}$$

$I_b(x, y, t)$ denotes the background model, $I_c(x, y, t)$ denotes the intensity of the current frame $t$ at pixel $(x, y)$, and $k$ is the learning rate ($0 < k < 1$).
The absolute difference (or another metric, such as the squared difference) between the current frame and the background model can be used to identify items in the foreground, $I_f(x, y, t)$:

$$I_f(x, y, t) = \left| I_c(x, y, t) - I_b(x, y, t) \right| \tag{2}$$

The resulting binary image after threshold comparison, which classifies each pixel as belonging to the background region or the foreground region, is given as (3).

$$I_R(x, y, t) = \begin{cases} 1 & \text{if } I_f(x, y, t) > T \\ 0 & \text{if } I_f(x, y, t) \leq T \end{cases} \tag{3}$$

To eliminate noise, morphological operations such as erosion and dilation can be applied to obtain the masked image. The learning rate $k$ is modified via adaptive approaches according to the magnitude of the pixel differences to accommodate different levels of scene dynamics.

3.2. Optical flow method


Recent advancements in computer vision research have enabled robots to sense their surroundings
through techniques such as semantic segmentation, which classifies pixels based on their meaning, and object
identification, which identifies instances of a certain object class [22]. However, many of these algorithms do
not consider the time information (t) when processing real-time video input. Instead, they solely focus on
analyzing the relationships between objects inside the same frame (x, y). For each run, they consider each
frame as an individual image and reassess it accordingly. To identify areas of motion in a picture, optical
flow methods look at the vectors of a moving object's motion across time [23]. The optical flow has been
employed by many researchers. In video sequences, objects can be detected using the optical flow method even when the camera is in motion. The derivation starts from the brightness constancy assumption in (4).

$$I(x, y, t) = I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) \tag{4}$$

$$I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t + \text{higher-order terms} \tag{5}$$

By excluding the higher-order terms, the equation simplifies to the forms (6) to (9).

$$\frac{\partial I}{\partial x}\Delta x + \frac{\partial I}{\partial y}\Delta y + \frac{\partial I}{\partial t}\Delta t = 0 \tag{6}$$

$$\frac{\partial I}{\partial x}\left(\frac{\Delta x}{\Delta t}\right) + \frac{\partial I}{\partial y}\left(\frac{\Delta y}{\Delta t}\right) + \frac{\partial I}{\partial t}\left(\frac{\Delta t}{\Delta t}\right) = 0 \tag{7}$$

$$I_{px} V_{px} + I_{py} V_{py} + I_{pt} V_{pt} = 0 \tag{8}$$

$$I_{px} V_{px} + I_{py} V_{py} = -I_{pt} V_{pt} \tag{9}$$

$V_{px}$, $V_{py}$, and $V_{pt}$ denote the velocity or optical flow vectors, and $I_{px}$, $I_{py}$, and $I_{pt}$ are the derivatives of the image intensity at a coordinate of the image $I_m(x, y, t)$. Thresholding the magnitude of the motion vector is then employed for object detection. The magnitude of the motion vector is presented as (10).

$$T = \sqrt{V_{px}^2 + V_{py}^2} \tag{10}$$

Optical flow vectors, in their most fundamental form, provide input to a large variety of higher-level operations that require scene awareness in video sequences, and such operations depend on them for proper functioning. The optical flow method estimates object velocity across consecutive frames using the apparent motion of brightness patterns in an image.
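A minimal MATLAB sketch of this magnitude-thresholding scheme is given below. It uses the Computer Vision Toolbox Horn-Schunck estimator as one possible flow implementation; the input file and the threshold value are illustrative assumptions, not the configuration used in the paper.

```matlab
% Minimal sketch: moving-object mask from optical flow magnitude, eq. (10).
% Assumes the Computer Vision Toolbox; 'traffic.avi' and T are illustrative.
v = VideoReader('traffic.avi');
opticFlow = opticalFlowHS;                 % Horn-Schunck flow estimator
T = 0.02;                                  % magnitude threshold, tuned per scene
while hasFrame(v)
    frame = im2gray(readFrame(v));
    flow  = estimateFlow(opticFlow, frame);    % per-pixel Vx, Vy
    mask  = flow.Magnitude > T;            % eq. (10): sqrt(Vx.^2+Vy.^2) > T
    imshowpair(frame, mask, 'montage'); drawnow;
end
```

Moving objects can then be localized by extracting connected components from the resulting mask.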

3.3. Discrete wavelet transform method


The ability of the DWT to capture signals at many resolutions and accurately localize them in the
time-frequency domain makes it a crucial tool for object detection and tracking. When analyzing data at
multiple resolutions, the DWT is used to break down an input signal into various frequency bands. Each
frequency band corresponds to a specific scale. This enables the simultaneous examination of many levels of
signals utilizing object detection techniques. It also enables the effective retrieval of characteristics (such as
shapes, patterns, and boundaries) at different levels, which is beneficial in the identification and monitoring
of objects. The efficient implementation of DWT enables it to handle large amounts of data in real-time
applications [24]. This is crucial for tracking and object detection systems that operate in dynamic environments and require rapid decision-making. The DWT is a valuable tool to estimate motion between frames in a video sequence. Evaluating the wavelet coefficients across frames enables the estimation of motion vectors, which is crucial for object tracking across time. The DWT is a mathematical technique used to process and analyze data, especially images [25]. The DWT divides an image into separate frequency components that differ in scale, enabling examination at several resolutions [26]. The forward 2D DWT decomposes an image $I_m(x, y)$ of dimensions $N \times N$ into low-frequency approximation coefficients and high-frequency detail coefficients at various scales. The image is decomposed into the LL, LH, HL, and HH frequency bands [27]. The corresponding 2D-DWT equations are given as follows.
− Approximation coefficient equation

$$A_{LL} = \sum_{p=0}^{N/2-1} \sum_{q=0}^{N/2-1} h[p]\, h[q]\, I_m[2p,\, 2q] \tag{11}$$

− Horizontal element coefficient equation

$$H_{LH} = \sum_{p=0}^{N/2-1} \sum_{q=0}^{N/2-1} h[p]\, h[q]\, I_m[2p,\, 2q+1] \tag{12}$$


− Vertical element coefficient equation

$$V_{HL} = \sum_{p=0}^{N/2-1} \sum_{q=0}^{N/2-1} h[p]\, h[q]\, I_m[2p+1,\, 2q] \tag{13}$$

− Diagonal element coefficient equation

$$D_{HH} = \sum_{p=0}^{N/2-1} \sum_{q=0}^{N/2-1} h[p]\, h[q]\, I_m[2p+1,\, 2q+1] \tag{14}$$

Figure 2 presents the DWT image decomposition and level processing. Applying filters along both the horizontal and vertical axes [28] separates the image into different frequency components in a 2-level DWT
decomposition. The decomposition process produces detail coefficients that capture high-frequency
information in the horizontal, vertical, and diagonal dimensions, as well as approximation coefficients at
various resolutions (levels) [29]. Object detection and tracking, compression, and denoising are just a few of
the many image-processing applications that benefit from this multi-resolution representation [30].

Figure 2. DWT image decomposition and level processing
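The following minimal MATLAB sketch performs one level of the decomposition in (11) to (14) using dwt2 from the Wavelet Toolbox. The Haar filter and the stock test image are illustrative choices, not those used in the paper.

```matlab
% Minimal sketch of one-level 2D DWT decomposition, eqs. (11)-(14).
% Assumes the Wavelet Toolbox; 'haar' and 'cameraman.tif' are illustrative.
Im = im2double(imread('cameraman.tif'));
[A_LL, H_LH, V_HL, D_HH] = dwt2(Im, 'haar');   % LL, LH, HL, HH sub-bands
% The detail bands carry edges and boundaries at half resolution; a simple
% motion cue compares detail-band energy across consecutive frames.
figure;
subplot(2,2,1); imshow(mat2gray(A_LL)); title('LL approximation');
subplot(2,2,2); imshow(mat2gray(H_LH)); title('LH horizontal detail');
subplot(2,2,3); imshow(mat2gray(V_HL)); title('HL vertical detail');
subplot(2,2,4); imshow(mat2gray(D_HH)); title('HH diagonal detail');
```

Because each sub-band is only N/2 × N/2, frame-to-frame coefficient comparisons operate on a quarter of the original pixel count, which may help explain the low DWT response times reported in section 4.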

4. RESULTS AND DISCUSSION


We used MATLAB 2023 to measure the response times of the images. Table 1 provides the response times required by the simulations of all the algorithms. Table 2 reports the outcomes of the simulations run on random still images and moving video captured with the authors' camera. MATLAB simplifies the
utilization of GPU acceleration for computationally intensive tasks, such as deep learning-based object
detection. Utilizing GPUs in processing, as opposed to relying solely on CPUs, can greatly reduce response
times, hence enabling faster inference speeds. The response time of MATLAB object detection methods is
crucial for achieving real-time performance in various applications, optimizing algorithm selection and
implementation, leveraging hardware capabilities, facilitating iterative development, enhancing user
experience, and identifying optimization opportunities. Object detection systems can meet the performance
requirements of their intended applications when they effectively manage response time. Table 1 presents the
simulation response time of the different algorithms used for object detection.
In detecting and tracking moving objects, the three primary metrics that shed light on the system's
efficiency, dependability, and resilience in different real-world contexts are specificity, sensitivity, and
accuracy. The accuracy metric reflects how well the system can identify which pixels or regions belong to moving objects and which to the backdrop. The sensitivity, recall, or true positive rate is a measure of how well the system detects real positives, or moving objects. A system with a high sensitivity will pick up most moving items in the scene, reducing the likelihood that anything crucial goes unnoticed. Important fields that rely on it include automated driving and surveillance, where the ability to recognize any moving object is paramount. The specificity of a system is defined as the percentage of real negatives (i.e., the non-moving background) that are properly detected as negatives.

Table 1. Comparison of the response time for detection

Method description                Response time in MATLAB (seconds)
                                  DWT     Optical method   Background subtraction
Object detection Image/Video-1    0.25    0.31             0.45
Object detection Image/Video-2    0.24    0.33             0.44
Object detection Image/Video-3    0.30    0.39             0.47
Object detection Image/Video-4    0.23    0.29             0.35
Object detection Image/Video-5    0.31    0.35             0.40

A high specificity is crucial in reducing the occurrence of false alarms and false positives, which highlights the importance of a system's ability to accurately detect motion as opposed to non-motion. It keeps the system reliable and cuts down on needless processing or notifications by ensuring that moving objects can be told apart from a stationary background. Table 2 presents the estimated values for all the discussed methods, and the corresponding comparative performance curve for the detection algorithms is shown in Figure 3. Table 3 presents the simulated images and results for the different images.

Table 2. Comparative performance values

Method                  TP   TN   FP   FN   Accuracy (%)   Sensitivity (%)   Specificity (%)
DWT                     95   89   5    4    95.34          95.96             94.68
Optical                 92   85   5    6    94.15          93.88             94.44
Background subtraction  90   80   6    10   91.40          90.00             93.02
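The reported percentages follow the standard confusion-matrix definitions: accuracy = (TP+TN)/(TP+TN+FP+FN), sensitivity = TP/(TP+FN), and specificity = TN/(TN+FP). As a quick check, the short MATLAB computation below reproduces the DWT row of Table 2.

```matlab
% Reproduce the Table 2 metrics for the DWT row from its confusion-matrix
% counts, using the standard definitions of the three measures.
TP = 95; TN = 89; FP = 5; FN = 4;
accuracy    = 100 * (TP + TN) / (TP + TN + FP + FN);  % 95.34 %
sensitivity = 100 * TP / (TP + FN);                   % 95.96 % (recall)
specificity = 100 * TN / (TN + FP);                   % 94.68 %
fprintf('Acc %.2f%%, Sens %.2f%%, Spec %.2f%%\n', accuracy, sensitivity, specificity);
```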

[Figure 3 shows the "Performance Estimation" plot: accuracy (%), sensitivity (%), and specificity (%) versus method (DWT, optical, background subtraction), with % utilization on the y-axis.]

Figure 3. Comparative performance curve for the detection algorithms


Table 3. Simulation outcome of the different sampled image object/video


S. No   Input original   After applying the DWT algorithm
[Each of the five rows pairs an original input (Image object/Video-1 through Image object/Video-5) with its output after applying the DWT algorithm; images omitted.]

5. CONCLUSION
It is possible to compare several methods, tune parameters, and verify that the system satisfies
operational requirements with the help of MATLAB's performance evaluation and validation tools. The
evaluation is done for the optical flow, background subtraction, and DWT methods for detecting moving objects. The simulation is carried out in several environments, including rainy and hazy
environments. The primary advantages of the DWT for object detection are its capacity to capture essential
information, its robustness against noise, its compact representation, and its multi-resolution analysis.
Consequently, it serves as a very effective instrument, particularly when conventional procedures may pose
challenges or when obtaining specific attributes is essential. The simulation work of the DWT method has
shown a minimum latency of 0.23 seconds, against 0.29 seconds for optical flow and 0.35 seconds for the background subtraction method. The same type of behavior is observed for the other cases as well. The accuracy of the DWT, optical, and background subtraction methods is 95.34%, 94.15%, and 91.40%. The sensitivity of the DWT, optical, and background subtraction methods is 95.96%, 93.88%, and 90.00%. The specificity of the DWT, optical, and background subtraction methods is 94.68%, 94.44%, and 93.02%. When it comes to detecting moving objects in images and videos, the DWT method has consistently proven to be the optimal choice in terms of both hardware and software.

REFERENCES
[1] A. Cavallaro, O. Steiger, and T. Ebrahimi, “Tracking video objects in cluttered background,” IEEE Transactions on Circuits and
Systems for Video Technology, vol. 15, no. 4, pp. 575–584, 2005, doi: 10.1109/TCSVT.2005.844447.
[2] A. Mukhtar, L. Xia, and T. B. Tang, “Vehicle detection techniques for collision avoidance systems: a review,” IEEE Transactions
on Intelligent Transportation Systems, vol. 15, no. 5, pp. 2318–2338, 2015, doi: 10.1109/TITS.2015.2409109.
[3] S. Hassan, G. Mujtaba, A. Rajput, and N. Fatima, “Multi-object tracking: a systematic literature review,” Multimedia Tools and
Applications, vol. 83, no. 14, pp. 43439–43492, Oct. 2023, doi: 10.1007/s11042-023-17297-3.
[4] Z. Zou, K. Chen, Z. Shi, Y. Guo, and J. Ye, “Object detection in 20 years: a survey,” in Proceedings of the IEEE, 2023, vol. 111,
no. 3, pp. 257–276, doi: 10.1109/JPROC.2023.3238524.
[5] D. K. Prasad, D. Rajan, L. Rachmawati, E. Rajabally, and C. Quek, “Video processing from electro-optical sensors for object
detection and tracking in a maritime environment: a survey,” IEEE Transactions on Intelligent Transportation Systems, vol. 18,
no. 8, pp. 1993–2016, Aug. 2017, doi: 10.1109/TITS.2016.2634580.
[6] H. Zhu, H. Wei, B. Li, X. Yuan, and N. Kehtarnavaz, “A comprehensive survey of video datasets for background subtraction,”
Applied Sciences, vol. 10, no. 21, Nov. 2020, doi: 10.3390/app10217834.
[7] A. Kumar, “Text extraction and recognition from an image using image processing in MATLAB,” in Conference on Advances in
Communication and Control Systems 2013 (CAC2S 2013), 2013, vol. 2013, pp. 429–435.
[8] M. Sharma, K. S. Kaswan, and D. K. Yadav, “Moving objects detection based on histogram of oriented gradient algorithm chip
for hazy environment,” International Journal of Reconfigurable and Embedded Systems, vol. 13, no. 3, pp. 604–615, 2024, doi:
10.11591/ijres.v13.i3.pp604-615.
[9] B. Mirzaei, H. Nezamabadi-pour, A. Raoof, and R. Derakhshani, “Small object detection and tracking: a comprehensive review,”
Sensors, vol. 23, no. 15, 2023, doi: 10.3390/s23156887.
[10] J. Bai, S. Li, L. Huang, and H. Chen, “Robust detection and tracking method for moving object based on radar and camera data
fusion,” IEEE Sensors Journal, vol. 21, no. 9, pp. 10761–10774, 2021, doi: 10.1109/JSEN.2021.3049449.
[11] X. Zhou, C. Yang, and W. Yu, “Moving object detection by detecting contiguous outliers in the low-rank representation,” IEEE
Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 3, pp. 597–610, 2013.
[12] X. Zhang, C. Zhu, S. Wang, Y. Liu, and M. Ye, “A Bayesian approach to camouflaged moving object detection,” IEEE
Transactions on Circuits and Systems for Video Technology, vol. 27, no. 9, pp. 2001–2013, 2017, doi:
10.1109/TCSVT.2016.2555719.
[13] H. Han, J. Zhu, S. Liao, Z. Lei, and S. Z. Li, “Moving object detection revisited: Speed and robustness,” IEEE Transactions on
Circuits and Systems for Video Technology, vol. 25, no. 6, pp. 910–921, 2015, doi: 10.1109/TCSVT.2014.2367371.
[14] D. K. Panda and S. Meher, “Detection of moving objects using fuzzy color difference histogram based background subtraction,”
IEEE Signal Processing Letters, vol. 23, no. 1, pp. 45–49, 2016, doi: 10.1109/LSP.2015.2498839.
[15] Z. Wang, K. Liao, J. Xiong, and Q. Zhang, “Moving object detection based on temporal information,” IEEE Signal Processing
Letters, vol. 21, no. 11, pp. 1403–1407, 2014, doi: 10.1109/LSP.2014.2338056.
[16] T. Wang and H. Snoussi, “Detection of abnormal visual events via global optical flow orientation histogram,” IEEE Transactions
on Information Forensics and Security, vol. 9, no. 6, pp. 988–998, 2014, doi: 10.1109/TIFS.2014.2315971.
[17] S. S. Sengar and S. Mukhopadhyay, “Detection of moving objects based on enhancement of optical flow,” Optik, vol. 145, pp.
130–141, 2017, doi: 10.1016/j.ijleo.2017.07.040.
[18] J. Hariyono, V. D. Hoang, and K. H. Jo, “Moving object localization using optical flow for pedestrian detection from a moving
vehicle,” Scientific World Journal, vol. 2014, 2014, doi: 10.1155/2014/196415.
[19] P. P. Gangal, V. R. Satpute, K. D. Kulat, and A. G. Keskar, “Object detection and tracking using 2D—DWT and variance
method,” in 2014 Students Conference on Engineering and Systems, May 2014, pp. 1–6, doi: 10.1109/SCES.2014.6880123.
[20] Y. Wu, X. He, and T. Q. Nguyen, “Moving object detection with a freely moving camera via background motion subtraction,”
IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 2, pp. 236–248, Feb. 2017, doi:
10.1109/TCSVT.2015.2493499.
[21] R. Kalsotra and S. Arora, “A comprehensive survey of video datasets for background subtraction,” IEEE Access, vol. 7, pp.
59143–59171, 2019, doi: 10.1109/ACCESS.2019.2914961.
[22] A. Talukder and L. Matthies, “Real-time detection of moving objects from moving vehicles using dense stereo and optical flow,”
in 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004, vol. 4, pp. 3718–3725, doi:
10.1109/iros.2004.1389993.
[23] K. Kale, S. Pawar, and P. Dhulekar, “Moving object tracking using optical flow and motion vector estimation,” in 2015 4th
International Conference on Reliability, Infocom Technologies and Optimization (ICRITO) (Trends and Future Directions), Sep.
2015, pp. 1–6, doi: 10.1109/ICRITO.2015.7359323.
[24] A. Kumar, “Study and analysis of different segmentation methods for brain tumor MRI application,” Multimedia Tools and
Applications, vol. 82, no. 5, pp. 7117–7139, Feb. 2023, doi: 10.1007/s11042-022-13636-y.
[25] A. Goel, A. K. Goel, and A. Kumar, “Performance analysis of multiple input single layer neural network hardware chip,”
Multimedia Tools and Applications, vol. 82, no. 18, pp. 28213–28234, Jul. 2023, doi: 10.1007/s11042-023-14627-3.


[26] A. Kumar, P. Rastogi, and P. Srivastava, “Design and FPGA implementation of DWT, image text extraction technique,” Procedia
Computer Science, vol. 57, pp. 1015–1025, 2015, doi: 10.1016/j.procs.2015.07.512.
[27] A. S. Rawat, A. Rana, A. Kumar, and A. Bagwari, “Application of multi layer artificial neural network in the diagnosis system: A
systematic review,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 7, no. 3, pp. 138–142, Aug. 2018, doi:
10.11591/ijai.v7.i3.pp138-142.
[28] A. Goel, A. K. Goel, and A. Kumar, “The role of artificial neural network and machine learning in utilizing spatial information,”
Spatial Information Research, vol. 31, no. 3, pp. 275–285, Jun. 2023, doi: 10.1007/s41324-022-00494-x.
[29] A. Devrari and A. Kumar, “Turbo encoder and decoder chip design and FPGA device analysis for communication system,”
International Journal of Reconfigurable and Embedded Systems, vol. 12, no. 2, pp. 174–185, 2023, doi:
10.11591/ijres.v12.i2.pp174-185.
[30] S. Dhyani, A. Kumar, and S. Choudhury, “Analysis of ECG-based arrhythmia detection system using machine learning,”
MethodsX, vol. 10, 2023, doi: 10.1016/j.mex.2023.102195.

BIOGRAPHIES OF AUTHORS

Monika Sharma is currently working as a research scholar in computer science and engineering at Galgotias University, Noida, NCR, India. She received an M.Tech. and a B.Tech. in computer science and engineering in 2012 and 2009, respectively. She has published more than 20 research papers and book chapters. She works as a lecturer at Government Girls Polytechnic College, Daurala, Meerut, under the Uttar Pradesh Technical Department, India, and has 14 years of experience. She can be contacted at [email protected].

Kuldeep Singh Kaswan is presently working in the School of Computing Science and Engineering at Galgotias University, Uttar Pradesh, India. His contributions focus on BCI, cyborgs, and data science. His academic degrees and thirteen years of experience working with global universities such as Amity University, Noida; Gautam Buddha University, Greater Noida; and PDM University, Bahadurgarh, have made him more receptive and prominent in his domain. He received a doctorate in computer science from Banasthali Vidyapith, Rajasthan, and a Doctor of Engineering (D.Engg.) from the Dana Brain Health Institute, Iran. He obtained a master’s degree in computer science and engineering from Chaudhary Devi Lal University, Sirsa (Haryana). He has supervised many UG and PG engineering projects, has supervised 3 Ph.D. graduates, and is presently supervising 4 Ph.D. students. He is a member of IEEE, the Computer Science Teachers Association (CSTA), New York, USA, the International Association of Engineers (IAENG), Hong Kong, and the International Association of Computer Science and Information Technology (IACSIT), USA, and a professional member of the Association for Computing Machinery (ACM), USA. He has published 9 books and 40 book chapters at the national/international level, along with a number of publications in international and national journals and conferences. He is an editor/author and review editor of journals and books with IEEE, Wiley, Springer, IGI, and River. He can be contacted at [email protected].

Dileep Kumar Yadav received an engineering degree (B.Tech. in computer science and engineering) from Uttar Pradesh Technical University, Lucknow, UP, India in 2006 and a master’s degree (M.Tech. in computer science and technology) from the School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India in 2011. Dr. Yadav earned a Ph.D. in computer science and technology from the School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi, India in 2016. He is a Sun Certified Java Programmer. He is the author of 65 research publications, including patents, journal articles (SCI/SCIE/Scopus), and national/international conference papers. He has also authored books and many book chapters for internationally reputed publishers. His primary research interests are in image processing, computer vision, and blockchain security using artificial intelligence and machine learning over dynamic data. Dr. Yadav has supervised various master’s and Ph.D. students and is associated with many international journals as an associate editor and international editorial board member. He has more than 12 years of working experience in industry as well as academia. Dr. Yadav is the recipient of various awards from national and international organizations for research, and he supervises many national and international students pursuing their research work. Currently, Dr. Yadav is working as an associate professor in the Department of CSE, SCSET, Bennett University, Greater Noida, India. He can be contacted at [email protected].
