journal-8(2025)
ABSTRACT
Recent advances in machine learning, deep learning, and large language models enable the design
of refined and complex algorithms to detect and prevent cybersecurity attacks. In this paper, we present
a hybrid fusion approach combining Generative AI, ADASYN, Recursive Feature Elimination (RFE),
and boosting algorithms to detect DDoS attacks. RFE was employed to optimize feature selection,
enhancing model interpretability and performance by reducing dimensionality. The proposed model
leverages (1) Packet Capture (pcap) data generated from virtual networks as real data, (2) synthetic
data generated by the Synthetic Data Vault, (3) ADASYN to balance the data, and (4) boosting
algorithms for training and testing. The results obtained from this hybrid-fusion model provided an
accuracy of 97–98%, indicating that the model is robust and reliable. Cross-validation of the model
further validated the results.
KEYWORDS
Hybrid-Intelligence, Cyber Defense, Generative AI, Resampling Techniques, Recursive Feature Elimination,
Ensemble Methods
INTRODUCTION
Distributed denial of service (DDoS) attacks are a type of cybersecurity threat in which an attacker
uses multiple systems, often compromised with malware, to flood a target. These attacks typically
involve overwhelming a target server with a high volume of requests, leading to severe service
disruptions. By exhausting the bandwidth and computational
resources, DDoS attacks render systems unavailable for legitimate users. Their effects include service
interruptions, revenue loss, reputational damage, and increased operational costs, making detecting
and mitigating DDoS attacks a critical priority for organizations.
DOI: 10.4018/IJAIML.370316
This article published as an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/) which permits unrestricted use, distribution, and production in any medium,
provided the author of the original work and original publication source are properly credited.
International Journal of Artificial Intelligence and Machine Learning
Volume 14 • Issue 1 • January-December 2025
To address these challenges, we introduce a robust framework for DDoS detection. Our approach
combines the following processes:
This hybrid method achieved high accuracy rates, demonstrating its effectiveness in distinguishing
DDoS attacks from normal traffic.
By integrating synthetic and real data, balancing skewed datasets, and leveraging feature
elimination techniques, we provide a scalable, reliable framework for detecting malicious network
activity. The findings affirm the validity of this approach and underscore its potential to mitigate
cyberattacks that can cause significant operational and financial losses. This work contributes to
the field by offering an innovative pipeline for anomaly detection and infrastructure protection in
high-dimensional datasets. In the rest of the paper we include a literature review, an overview of
our methodology, a discussion on results and an interpretation of findings, and a conclusion and
recommendations for future work.
LITERATURE REVIEW
Advances in machine learning, deep learning, and large language models have provided
open-source libraries and tools that significantly enhance the ability to detect and mitigate cyberattacks.
Several studies have demonstrated the potential of synthetic data generated by generative adversarial
networks (GANs) and variational autoencoders (VAEs) to augment datasets when real data are scarce,
imbalanced, unreliable,
or skewed (Khakurel et al., 2022; Mehrabi et al., 2021). The use of synthetic data generated from
labeled data allows for training robust models and improving classification outcomes. Some studies
(Chalé & Bastian, 2022; Nikolov, 2023) have shown that combining synthetic and real data can
achieve results comparable to using real data alone, whereas models trained only on synthetic data
tend to underperform. However, other researchers (Halvorsen & Gebremedhin, 2024; Llugiqi &
Mayer, 2022) have reported that models trained exclusively on synthetic data perform as well as,
or in some cases better than, models trained on real data. Enhanced feature extraction has also
been shown to improve anomaly detection speed and accuracy (Patil et al., 2022; Wang et al., 2022).
Machine learning algorithms are commonly used to evaluate the accuracy of methods for detecting
various types of cybercrime. For instance, Kilincer et al. (2022) and Oneto and Chiappa (2020) used
Light Gradient-Boosting Machine (LightGBM) and Extreme Gradient Boosting (XGBoost) on the
Comprehensive Cyber Security Intrusion Detection Dataset (CCiDD) and its subsets, CCiDD_A and
CCiDD_B. Their findings revealed that LightGBM outperformed XGBoost in detecting cyberattacks
within these datasets. Similarly, Louk and Tama (2023) and Chen et al. (2023) reported that ensemble
methods such as gradient boosting machine, XGBoost, LightGBM, and CatBoost were effective for
intrusion detection. Among these, CatBoost consistently achieved superior performance in identifying
cyberattacks.
Balancing datasets plays a pivotal role in designing accurate anomaly detection systems.
Techniques such as the Synthetic Minority Oversampling Technique (SMOTE) and its variants are widely
employed to address class imbalance. For
instance, Halim et al. (2023) and Stanford et al. (2024) reported that combining SMOTE and Adaptive
Synthetic (ADASYN) with the random forest (RF) algorithm yielded an accuracy of 99.03%. In
other studies, Ungkawa and Rafi (2024) and Islam et al. (2020) found that using principal component
analysis, K-means clustering, and ADASYN achieved a 95% accuracy rate.
Our study extends this body of knowledge by presenting a hybrid fusion approach that incorporates
(a) high-quality synthetic data generated by the VAE synthesizer, (b) SMOTETomek for data balancing,
(c) parsed packet capture data generated from a virtual network consisting of two servers and three
clients, and (d) rigorous cross-validation to evaluate model performance. Additionally, we explored
the impact of combining RFE with SMOTETomek in a cybersecurity context and found that it has the
potential to improve model performance.
This pipeline supports the effective detection of DDoS attacks, making it a useful contribution
to the domain of cybersecurity.
METHODOLOGY
This research proposes a hybrid fusion approach for detecting DDoS attacks. By leveraging
synthetic data generation, feature selection, resampling techniques, and advanced machine learning
models, we designed a methodology that ensures robust model performance and generalizability. The
next section includes a step-by-step breakdown of the approach.
Theoretical Rationale
The hybrid fusion methodology integrates complementary techniques to address key challenges of
DDoS detection in high-dimensional, imbalanced datasets. Each component contributes to enhancing
detection accuracy and scalability. In the rest of this section, we describe these components.
Pipeline Overview
We implemented the methodology in the following sequence (see Figure 1 for an overview):
1. data preparation
2. synthetic data generation
3. resampling
4. feature selection
5. model training and testing
6. evaluation and validation
Data Preparation
Preprocessing included encoding categorical features with LabelEncoder and imputing missing
values with SimpleImputer. The synthetic data were split into training (70%) and validation (30%)
sets, and the real data were used exclusively for testing.
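The preprocessing and split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the toy rows, the column values, the mean imputation strategy, the stratified split, and the random seed are all assumptions made for the example. Only LabelEncoder, SimpleImputer, and the 70/30 ratio come from the text.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy rows standing in for the synthetic dataset.
synthetic = pd.DataFrame({
    "protocol": ["tcp", "udp", "tcp", "icmp", "tcp", "udp", "tcp", "udp", "icmp", "tcp"],
    "src_bytes": [120.0, np.nan, 300.0, 40.0, 980.0, 55.0, np.nan, 60.0, 42.0, 500.0],
    "count": [2, 250, 3, 310, 1, 275, 4, 290, 305, 2],
    "label": ["normal", "ddos", "normal", "ddos", "normal",
              "ddos", "normal", "ddos", "ddos", "normal"],
})

# Encode the categorical feature and the label as integers.
synthetic["protocol"] = LabelEncoder().fit_transform(synthetic["protocol"])
y = LabelEncoder().fit_transform(synthetic["label"])  # classes are sorted: ddos=0, normal=1
X = synthetic.drop(columns="label")

# 70/30 split of the synthetic data into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit the imputer on the training part only, then reuse it for validation
# (and later for the real test data) to avoid leaking statistics.
imputer = SimpleImputer(strategy="mean")  # the strategy is an assumption
X_train = imputer.fit_transform(X_train)
X_val = imputer.transform(X_val)

print(X_train.shape, X_val.shape)
```

Fitting the imputer after the split is a design choice of this sketch; the text does not state the order in which the two steps were applied.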
Resampling
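We applied two resampling techniques, SMOTETomek and ADASYN. SMOTETomek combines SMOTE
oversampling with the removal of Tomek links, which refines decision boundaries, while ADASYN
concentrates minority class oversampling on the samples that are hardest to learn.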
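To make the adaptive behavior of ADASYN concrete, the following is a simplified NumPy sketch of its core idea for a binary problem: minority samples with more majority-class neighbors receive more synthetic points. It is an illustration under stated assumptions (the function name adasyn_sketch, the parameters k and beta, and the toy data are hypothetical), not the implementation used in the study. In practice, the imbalanced-learn library provides ADASYN and SMOTETomek classes with a fit_resample method.

```python
import numpy as np

def adasyn_sketch(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Simplified ADASYN for two classes; assumes no duplicate points."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority]
    n_min = len(X_min)
    n_maj = len(X) - n_min
    total_new = int(round((n_maj - n_min) * beta))  # samples needed to balance

    # r_i: share of majority samples among the k nearest neighbours
    # (searched in the full dataset) of each minority point.
    dist_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nearest = np.argsort(dist_all, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    r = (y[nearest] != minority).mean(axis=1)
    if r.sum() == 0 or total_new <= 0:
        return X, y  # nothing to generate
    per_point = np.rint(r / r.sum() * total_new).astype(int)

    # Candidate partners for interpolation: nearest minority neighbours.
    dist_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    min_neighbours = np.argsort(dist_min, axis=1)[:, 1:k + 1]

    new_rows = []
    for i, n_new in enumerate(per_point):
        for _ in range(n_new):
            j = rng.choice(min_neighbours[i])
            lam = rng.random()  # interpolation weight in [0, 1)
            new_rows.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    if not new_rows:
        return X, y
    X_out = np.vstack([X, np.array(new_rows)])
    y_out = np.concatenate([y, np.full(len(new_rows), minority, dtype=y.dtype)])
    return X_out, y_out

# Hypothetical imbalanced toy data: 90 majority and 10 minority samples.
data_rng = np.random.default_rng(42)
X = np.vstack([data_rng.normal(0.0, 1.0, (90, 2)), data_rng.normal(1.5, 1.0, (10, 2))])
y = np.array([0] * 90 + [1] * 10)
X_bal, y_bal = adasyn_sketch(X, y)
print(np.bincount(y), np.bincount(y_bal))
```

Because the per-sample counts are rounded, the two classes end up approximately rather than exactly balanced. The Tomek-link cleaning step of SMOTETomek is not shown here.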
Feature Selection
For feature selection, RFE reduced dimensionality by retaining only the most relevant features,
optimizing model training and inference efficiency.
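As a rough illustration of how RFE prunes features, the sketch below uses scikit-learn's RFE class on generated data. The random forest estimator, the dataset, and the choice to keep 6 of 10 features (a 40% reduction, chosen only to mirror the figure reported later in the paper) are assumptions for the example; the text does not specify which estimator drove the elimination.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Hypothetical data: 10 features, of which only some are informative.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4, n_redundant=2, random_state=0
)

# RFE repeatedly fits the estimator, ranks features by importance,
# and drops the weakest one (step=1) until 6 features remain.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=6,
    step=1,
)
selector.fit(X, y)

kept = [index for index, keep in enumerate(selector.support_) if keep]
X_reduced = selector.transform(X)
print("kept feature indices:", kept)
print("reduced shape:", X_reduced.shape)
```

The ranking_ attribute of the fitted selector gives the elimination order, with rank 1 assigned to every retained feature.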
• Exploratory data analysis: Basic descriptive statistics confirmed the absence of missing values
or duplicates.
• Feature validation: We examined key features, such as src_ip, protocol, and count, to ensure their
relevance for the detection of DDoS attacks.
Synthetic datasets were generated using VAEs and other methods, including GANs, Gaussian
Copula, and Copula GANs. The VAE-generated data closely resembled the real dataset in terms
of key statistical properties, including column-wise means, medians, and correlation metrics. This
similarity ensured that the synthetic data were realistic while also broadening the diversity of attack
scenarios. The model trained on VAE data and tested on real data also provided the best results.
The overall accuracies for the synthetic datasets generated with each method were as follows:
VAE (90.04%), GAN (85.72%), Gaussian Copula (79.37%), and Copula GAN (89.69%).
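A comparison of column-wise means, medians, and correlations between a real and a synthetic table can be sketched with pandas as follows. The helper names (similarity_report, make_frame), the generated values, and the distributions are hypothetical; only the feature names src_bytes, dst_bytes, and count and the sample sizes are taken from the paper.

```python
import numpy as np
import pandas as pd

def similarity_report(real, synthetic):
    """Column-wise mean/median comparison plus the largest correlation gap."""
    summary = pd.DataFrame({
        "mean_real": real.mean(),
        "mean_synth": synthetic.mean(),
        "median_real": real.median(),
        "median_synth": synthetic.median(),
    })
    summary["mean_gap_pct"] = (
        100 * (summary["mean_synth"] - summary["mean_real"]).abs()
        / summary["mean_real"].abs()
    )
    # Largest absolute difference between the two correlation matrices.
    corr_gap = float((real.corr() - synthetic.corr()).abs().to_numpy().max())
    return summary, corr_gap

def make_frame(rng, n_rows, scale):
    """Generate a hypothetical traffic table (illustrative values only)."""
    return pd.DataFrame({
        "src_bytes": rng.gamma(2.0, scale, n_rows),
        "dst_bytes": rng.gamma(2.0, scale, n_rows),
        "count": rng.poisson(50, n_rows).astype(float),
    })

rng = np.random.default_rng(0)
real = make_frame(rng, 5800, scale=500.0)
synthetic = make_frame(rng, 10000, scale=520.0)

summary, corr_gap = similarity_report(real, synthetic)
print(summary.round(2))
print("largest correlation gap:", round(corr_gap, 3))
```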
The synthetic data consisted of 10,000 samples and introduced variability by simulating diverse
distributions of network features, such as src_bytes, dst_bytes, and count. We applied the following
resampling techniques to balance the dataset:
Table 1 shows the comparative statistics of the real and synthetic datasets.
Statistical Alignment
For src_bytes and dst_bytes, although variances are relatively close (approximately 2.4 million
in real data and 1.4 million in synthetic), the means differ, indicating some divergence in value
distributions between the real and synthetic datasets. The feature hot demonstrates better alignment
in both variance and mean, although slight differences persist.
Key Findings
The Kolmogorov–Smirnov test results for all listed features show a p-value of 0.00, highlighting
significant distributional differences. This finding emphasizes the need for further refinement in
synthetic data generation to better mimic real-world patterns.
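For readers who want to reproduce this kind of check, a two-sample Kolmogorov–Smirnov test can be run with SciPy's ks_2samp, as sketched below. The lognormal samples are hypothetical stand-ins for one feature; only the sample sizes (5,800 real and 10,000 synthetic) come from the paper. The sketch also shows that, with samples this large, even a modest distribution shift yields a p-value that prints as 0.00 at two decimal places.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical stand-ins for one feature in the real and synthetic datasets.
real_feature = rng.lognormal(mean=6.0, sigma=1.0, size=5800)
synthetic_feature = rng.lognormal(mean=6.3, sigma=1.0, size=10000)

# Two-sample KS test: the statistic is the largest gap between the
# two empirical cumulative distribution functions.
result = ks_2samp(real_feature, synthetic_feature)

print(f"KS statistic: {result.statistic:.3f}")
print(f"p-value (2 decimals): {result.pvalue:.2f}")
print(f"p-value (scientific): {result.pvalue:.2e}")
```

Reporting the KS statistic alongside the p-value helps here, because the statistic indicates how large the distributional gap is rather than only whether it is detectable.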
Features with similar variance but divergent means, such as src_bytes and dst_bytes, suggest
that although broad trends are captured, local data points may require more granularity in modeling.
Actionable Recommendations
We propose two actionable recommendations: (a) enhance the synthetic data generation process to
better align feature means, possibly by incorporating more advanced loss functions or constraints
during training, and (b) use additional validation checks, such as histogram comparisons or
conditional probability checks, to identify and address specific feature discrepancies.
alignment. Furthermore, testing on real data validated the robustness of the models, with
high precision, recall, and F1 scores demonstrating excellent generalization. These results underscore
that reducing statistical differences remains a desirable goal for future iterations of synthetic data
generation, but the current methodology is already effective in leveraging synthetic data for DDoS
detection. For further details on model performance, refer to the Results section.
RESULTS
In this section we present the outcomes of applying the hybrid fusion approach and the RFE
methodology to detect DDoS attacks. The consolidated results highlight the performance of individual
models, resampling techniques, and the impact of RFE on model accuracy.
Note. AdaBoost = adaptive boosting, EEM = enhanced ensemble model, GB = gradient boosting, QDA = Quadratic Discriminant Analysis, and RF =
random forest.
Figure 2. Raw and normalized confusion matrices for the enhanced ensemble model (EEM)
Table 3 summarizes the confusion matrix results for the best-performing models on real datasets,
showcasing true positives, true negatives, false positives, and false negatives.
Note. AdaBoost = adaptive boosting, DT = decision tree, EEM = enhanced ensemble model, GB = gradient boosting, QDA = Quadratic Discriminant
Analysis, and RF = random forest.
• The EEM achieved the lowest false positive rate (0.00%) compared with other models.
• It maintained a false negative rate of only 1.74%, highlighting its effectiveness in detecting
nearly all DDoS attacks.
Impact of RFE
RFE significantly enhanced the model’s interpretability and performance by reducing feature
dimensionality by approximately 40%. The key features retained include src_ip, dst_ip, protocol,
count, and same_srv_rate.
Benefits of RFE
RFE yielded the following benefits:
• RF (high precision): Ideal for minimizing false positives, ensuring efficient resource allocation
in environments where false alarms are costly.
• EEM (high recall): Excels at minimizing false negatives, making it suitable for high-risk
environments, such as healthcare and financial systems.
The F1 score balances precision and recall, with EEM achieving the highest scores, demonstrating
its robustness and adaptability in real-world scenarios.
Performance Trade-Offs
Regarding precision versus recall, high-precision models (e.g., RF) achieve a low false positive
rate, minimizing the misclassification of normal traffic as attacks, which is crucial for resource
efficiency. Their limitation is that they tend to have slightly lower recall, increasing the
likelihood of missing actual DDoS attacks.
High-recall models (e.g., EEM) capture nearly all DDoS attacks with a low false negative rate,
which is essential for mitigating severe threats. Their limitation is that higher recall may
result in more false positives, potentially straining resources.
Using the F1 score as a balanced metric provides a harmonic mean of precision and recall,
balancing false positives and negatives. The EEM achieved an F1 score of 0.99, demonstrating its
ability to maintain this critical balance.
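The relationship between these metrics can be made explicit with a short calculation. The helper confusion_metrics and the confusion-matrix counts below are hypothetical and serve only to show the formulas. The last lines recompute F1 from the precision (0.98) and recall (1.00) reported for the EEM under Model Performance.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

# Hypothetical counts, not taken from the paper's tables.
metrics = confusion_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")

# F1 recomputed from the reported EEM precision and recall.
reported_precision, reported_recall = 0.98, 1.00
f1_from_reported = (
    2 * reported_precision * reported_recall / (reported_precision + reported_recall)
)
print(f"F1 from reported values: {f1_from_reported:.4f}")
```

The recomputed value rounds to 0.99 at two decimal places, which agrees with the reported F1 score.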
The confusion matrix for the EEM had the following strengths:
These results make the EEM suitable for high-stakes environments, such as healthcare and
finance, where undetected attacks could have significant consequences.
Comparative Analysis
In a comparative analysis, we found that RF and GB offer high precision, suitable for systems
prioritizing resource efficiency. We discovered that AdaBoost balances precision and recall, making
it adaptable for general purposes. QDA was effective, but with slightly lower performance metrics,
limiting its use in high-risk scenarios.
Use Cases
High-recall models like EEM are ideal for systems requiring comprehensive detection, whereas
high-precision models like RF are suited for resource-sensitive deployments.
Computational Complexity
The proposed hybrid fusion approach involved a training phase and an inference phase. Training
ensemble models such as GB, RF, and the EEM involves multiple iterations and evaluations, all of
which can increase computational costs, especially with large datasets. However, the use of RFE
ensures that only the most critical features are used, significantly reducing the dimensionality of
input data and improving training efficiency.
Once trained, the EEM and individual models are optimized for fast inference, making them
suitable for scenarios requiring real-time threat detection.
Resource Requirements
The method was tested using synthetic datasets with 10,000 samples and real datasets with 5,800
samples. These evaluations required a standard server setup (16-core central processing unit, 32 GB
RAM, no graphics processing unit dependency). However, scaling to larger datasets may benefit from
distributed computing environments or graphics processing unit acceleration for faster processing.
Memory use was optimized through the elimination of redundant features and the use of
lightweight models such as QDA within the ensemble.
Deployment Scenarios
In this section we discuss deployment of the hybrid fusion model on cloud platforms, on edge
devices, and in hybrid environments.
The hybrid fusion model is well-suited for deployment on cloud platforms like Amazon Web
Services or Google Cloud, where scalability and on-demand resources can handle growing data
volumes and ensure reliable DDoS detection across multiple network segments.
For latency-sensitive applications, the model can be deployed on edge devices, such as routers
or Internet of Things gateways. The compact nature of selected models (via RFE) and efficient
ensemble inference ensure low-latency performance, enabling real-time detection without relying
on central data centers.
Combining cloud and edge deployments can provide a layered defense mechanism, where critical
threats are detected locally while aggregated data supports higher-level analysis in the cloud.
Real-World Applications
The proposed method can be applied to sectors requiring robust DDoS defenses, including
financial services, healthcare, and government networks. In addition, integration with Security
Information and Event Management systems can enhance overall threat response and automate
mitigation strategies.
With its scalability, the approach also supports dynamic network traffic patterns, ensuring
adaptability to evolving cyber-attack scenarios.
Future Considerations
Investigating more advanced resampling techniques and feature selection methods can further
improve computational efficiency. Additionally, transitioning to fully distributed frameworks like
Apache Spark or TensorFlow Distributed can prepare the method for big data environments.
Model Performance
For synthetic data testing, models trained on synthetic datasets, resampled using SMOTETomek,
achieved high performance. GB and AdaBoost were top-performing models.
For real data testing, optimized models retrained on real data with RFE-selected features yielded
strong results, with the EEM achieving the following scores:
• precision: 0.98
• recall: 1.00
• F1 score: 0.99
• accuracy: 99%
Note that five-fold cross-validation ensured consistency and generalizability across synthetic
data subsets.
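Five-fold cross-validation of this kind can be sketched with scikit-learn as follows. The generated dataset, the gradient boosting classifier, the stratified folds, and the F1 scoring are assumptions chosen for illustration; the paper states only that five folds were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Hypothetical, mildly imbalanced data standing in for a synthetic subset.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.7, 0.3], random_state=0
)

# Stratified folds keep the class ratio similar in every split.
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(
    GradientBoostingClassifier(random_state=0), X, y, cv=folds, scoring="f1"
)

print("per-fold F1:", scores.round(3))
print("mean and standard deviation:", scores.mean().round(3), scores.std().round(3))
```

A small standard deviation across folds is what supports a claim of consistency. When resampling is part of the pipeline, it should be applied inside each training fold only, so that synthetic minority samples do not leak into the held-out fold.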
Comparative Analysis
The EEM consistently outperformed traditional models (e.g., RF, DT, GB, AdaBoost) across
all metrics. Its F1 score of 0.99 highlights the efficacy of integrating SMOTETomek, RFE, and
ensemble learning techniques.
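The text does not spell out how the EEM combines its member models. Purely as an illustration of ensemble learning over the model families named in this paper, the sketch below builds a soft-voting ensemble of RF, GB, AdaBoost, and QDA with scikit-learn. The voting scheme, the hyperparameters, and the generated data are assumptions and should not be read as the authors' EEM.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Hypothetical data; n_redundant=0 avoids collinear inputs for QDA.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=5, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Soft voting averages the predicted class probabilities of the members.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),
        ("qda", QuadraticDiscriminantAnalysis()),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

score = f1_score(y_test, ensemble.predict(X_test))
print(f"ensemble F1 on held-out data: {score:.3f}")
```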
Summary
The EEM, combining SMOTETomek, RFE, and ensemble learning, offers a robust solution for
DDoS detection, surpassing traditional and complex deep learning models in both performance and
scalability.
Although the results of this study demonstrate near-perfect performance, achieving up to 0.99
accuracy with the EEM, we identified several areas for future work and potential improvement.
These include experimenting on new network environments, generalizing the model to address new attack
patterns by designing adaptive models that can self-regulate and detect attacks in real time, and
training models on other types of cyber threats, such as phishing, malware, and Structured
Query Language injection.
Given the increasing frequency and complexity of DDoS attacks, the near-perfect performance
of the EEM highlights its potential as a highly effective tool for modern network security. The fusion
approach presented here demonstrates the value of combining advanced machine learning techniques
with feature optimization for scalable and generalizable solutions.
By addressing these challenges and extending the scope of this study, other researchers could
enable the proposed hybrid fusion approach to evolve into a robust, real-time, and adaptive solution
for modern cybersecurity threats. The methodology presented here serves as a foundation for future
research into advanced, scalable, and generalizable network intrusion detection systems.
AUTHOR
CONFLICTS OF INTEREST
We wish to confirm that there are no known conflicts of interest associated with this publication
and there has been no significant financial support for this work that could have influenced its outcome.
FUNDING STATEMENT
PROCESS DATES
REFERENCES
Babbar, H., Rani, S., & Driss, M. (2024). Effective DDoS attack detection in software-defined vehicular networks
using statistical flow analysis and machine learning. PLoS One, 19(12), e0314695.
DOI: 10.1371/journal.pone.0314695 PMID: 39693292
Chalé, M., & Bastian, N. D. (2022). Generating realistic cyber data for training and evaluating machine learning
classifiers for network intrusion detection systems. Expert Systems with Applications, 207, 117936.
DOI: 10.1016/j.eswa.2022.117936
Chen, R. J., Wang, J. J., Williamson, D. F. K., Chen, T. Y., Lipkova, J., Lu, M. Y., Sahai, S., & Mahmood, F.
(2023). Algorithmic fairness in artificial intelligence for medicine and healthcare. Nature Biomedical Engineering,
7(6), 719–742. DOI: 10.1038/s41551-023-01056-8 PMID: 37380750
Halim, A. M., Dwifebri, M., & Nhita, F. (2023). Handling imbalanced data sets using SMOTE and ADASYN to
improve classification performance of Ecoli data sets. Technology and Science, 5(1), 246–253.
DOI: 10.47065/bits.v5i1.3647
Halvorsen, J., & Gebremedhin, A. (2024). Generative machine learning for cyber security. Military Cyber Affairs,
7(1), 4. https://digitalcommons.usf.edu/mca/vol7/iss1/4
Islam, M. Z., Islam, M. M., & Asraf, A. (2020). A combined deep CNN-LSTM network for the detection of novel
coronavirus (COVID-19) using X-ray images. Informatics in Medicine Unlocked, 20, 100412.
DOI: 10.1016/j.imu.2020.100412 PMID: 32835084
Khakurel, U., Abdelmoumin, G., Bajracharya, A., & Rawat, D. B. (2022). Exploring bias and fairness in artificial
intelligence and machine learning algorithms. Artificial Intelligence and Machine Learning for Multi-Domain
Operations Applications IV, 12113, 629–638. DOI: 10.1117/12.2621282
Kilincer, I. F., Ertam, F., & Sengur, A. (2022). A comprehensive intrusion detection framework using boosting
algorithms. Computers & Electrical Engineering, 100, 107869. DOI: 10.1016/j.compeleceng.2022.107869
Kumar, D., Pateriya, R. K., Gupta, R. K., Dehalwar, V., & Sharma, A. (2023). DDoS detection using deep
learning. Procedia Computer Science, 218, 2420–2429. DOI: 10.1016/j.procs.2023.01.217
Llugiqi, M., & Mayer, R. (2022). An empirical analysis of synthetic-data-based anomaly detection. In A.
Holzinger, P. Kieseberg, A. M. Tjoa, & E. Weippl (Eds.), Machine learning and knowledge extraction. 6th IFIP
TC 5, TC 12, WG 8.4, WG 8.9, WG 12.9 International cross-domain conference, CD-MAKE 2022 (pp. 306–327).
Springer, Cham. DOI: 10.1007/978-3-031-14463-9_20
Louk, M. H. L., & Tama, B. A. (2023). Dual-IDS: A bagging-based gradient boosting decision tree model for
network anomaly intrusion detection system. Expert Systems with Applications, 213, 119030.
DOI: 10.1016/j.eswa.2022.119030
Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in
machine learning. ACM Computing Surveys (CSUR), 54(6), 115, 1–35. DOI: 10.1145/3457607
Nikolov, I. A. (2023). Augmenting anomaly detection datasets with reactive synthetic elements. Computer
graphics and visual computing (CGVC). The Eurographics Association. DOI: 10.2312/cgvc.20231204
Oneto, L., & Chiappa, S. (2020). Fairness in machine learning. In L. Oneto, N. Navarin, A. Sperduti, &
D. Anguita (Eds.), Recent trends in learning from data: Tutorials from the INNS big data and deep learning
conference (INNSBDDL2019) (pp. 155–196). Springer. DOI: 10.1007/978-3-030-43883-8_7
Patil, R., Biradar, R., Ravi, V., Biradar, P., & Ghosh, U. (2022). Network traffic anomaly detection using PCA
and BiGAN. Internet Technology Letters, 5(1), e235. DOI: 10.1002/itl2.235
Rakshe, D. S., Jha, S., & Bhaladhare, P. R. (2024). Validation of deep learning-based hybridization model for
DDoS attack detection with performance metrics comparison. Library Progress International, 44(3), 5564–5572.
Stanford, C., Adari, S., Liao, X., He, Y., Jiang, Q., Kuai, C., Ma, J., Tung, E., Qian, Y., Zhao, L., Zhou, Z., Rasheed,
Z., & Shafique, K. (2024). NUMOSIM: A synthetic mobility dataset with anomaly detection benchmarks.
arXiv:2409.03024 [cs.LG]. DOI: 10.1145/3681765.3698455
Ungkawa, U., & Rafi, M. A. (2024). Data balancing techniques using the PCA-KMeans and ADASYN for
possible stroke disease cases. Jurnal Online Informatika, 9(1), 138–147. DOI: 10.15575/join.v9i1.1293
Wang, Z., Han, D., Li, M., Liu, H., & Cui, M. (2022). The abnormal traffic detection scheme based on PCA and
SSH. Connection Science, 34(1), 1201–1220. DOI: 10.1080/09540091.2022.2051434
Lakshmi Prayaga is a professor in the Department of Information Technology at University of West Florida. Her
research focuses on applications of technology in healthcare, sports medicine, management, and training. Topics
of interest include robotics, data visualizations, and analytics. She has coauthored books on robotics, Android
app development, beginning game programming, programming the Web with ColdFusion and XHTML, and using
game programming to teach computer science concepts. She has also published numerous papers in international
journals and conferences. She teaches graduate and undergraduate courses in data analytics, data visualizations,
machine learning, and script programming. She holds an EdD in instructional technology and an MS in software
engineering, both from University of West Florida, and an MBA from Alabama A&M University.
Chandra Sekhar Prayaga is currently professor of physics at the University of West Florida. He holds a PhD in
physics from the Indian Institute of Science, Bangalore, India, where he was also a faculty member from 1981 to
1987. He has more than 40 years of experience in teaching physics, and has helped raise more than $3 million in
funding for research and projects involving University of West Florida faculty and students. His current research
interests include optical and electronic properties of liquid crystals, Langmuir-Blodgett films, phase transitions and
laser spectroscopy, physics education, and data analytics. He is a mentor for undergraduate student research
projects and coordinates summer camps on science and technology for middle and high school students. He is
cofounder of Discovery Spot, a technology playground for middle and high school students to experience the latest
technologies with hands-on activities, such as building smart cities using Internet of Things. He is coauthor of the
book titled Robotics: A Project-Based Approach by Cengage Publishers.
Mariah Borges Zuanazzi began studying computer science at University of West Florida in the fall of 2024. She is
also a research assistant at University of West Florida, where she specializes in data science, machine learning,
and generative artificial intelligence (AI). She is passionate about AI and cybersecurity and is actively conducting
research in these fields. She is driven by a love for innovation and problem-solving and is dedicated to advancing
technology and its applications to secure digital environments.
Sri Satya Harsha Pola holds an MS in data science, analytics, and modeling from the University of West Florida.
Her work focuses on developing innovative AI-driven solutions, including synthetic data generation, real-time
systems, and advanced statistical modeling. She has authored several works in esteemed journals and conferences,
highlighting her contributions to synthetic data analysis and AI research. With proficiency in Python, R, TensorFlow,
and Structured Query Language, she also excels in data visualization using tools like Tableau and Power BI. Her
research aims to advance privacy-preserving technologies and promote data-driven decision-making.