Interactive Deep Learning System for Automated Car Damage Detection: Multi-Model Evaluation and Interactive Web Deployment
ORCID: 0009-0009-3701-8929
Abstract: This project presents an automated framework for vehicle damage evaluation employing deep learning
methodologies, designed to optimize assessment procedures within automotive service environments. By implementing the
YOLOv9 computational vision architecture, the system enables rapid identification of vehicular damage components
through advanced pattern recognition, reducing reliance on labor-intensive manual inspections. The model underwent
training on an extensive curated dataset comprising 8,450 annotated images capturing diverse damage morphologies across
multiple vehicle perspectives, including frontal collisions, lateral impacts, and rear-end accidents. The framework integrates
physics-informed augmentation strategies to enhance environmental adaptability, particularly addressing challenges posed
by variable lighting conditions and reflective surfaces. A modular processing pipeline facilitates scalable deployment through
quantization techniques optimized for edge computing devices, demonstrating practical applicability in service center
operations. The system incorporates a web-based interface enabling real-time damage visualization and automated report
generation, significantly streamlining technician workflows. Experimental results indicate substantial improvements in
inspection efficiency, with the YOLOv9 architecture achieving 87% mean average precision (mAP@0.5) while maintaining
computational efficiency. Quantized model variants exhibited a 68% reduction in memory footprint with minimal accuracy
degradation. Field validations conducted across multiple service centers confirmed the system's operational effectiveness,
highlighting strong correlations between model complexity, training duration, and real-time detection capabilities. This
research establishes foundational insights for future advancements in 3D damage reconstruction and adaptive learning
systems within automotive diagnostics.
Keywords: Computational Damage Assessment, YOLOv9 Architecture, Automotive Computer Vision, Edge AI Optimization, Service
Process Automation, Neural Network Quantization.
How to Cite: Sai Madhu; Bharathi Maddikatla; Ranjitha Padakanti; Vineel Sai Kumar Rampally; Shirish Kumar Gonala (2025).
Interactive Deep Learning System for Automated Car Damage Detection: Multi-Model Evaluation and Interactive
Web Deployment. International Journal of Innovative Science and Research Technology,
10(4), 2779-2798. https://doi.org/10.38124/ijisrt/25apr1759
Table 1 Overview of Data Sources Used for Vehicle Damage Image Collection
Following comprehensive data cleaning and augmentation steps as depicted in [Fig.2], the finalised dataset consists of 8,450 labelled images spanning 22 categories of vehicle damage. This dataset reflects a broad spectrum of real-world damage scenarios and vehicle types, which is essential for training models that generalise effectively to practical automotive service environments. Each image is paired with detailed annotations, and summary tables present the class distributions and image characteristics, supporting transparent reporting and reproducibility in deep learning research [11]. This thorough documentation ensures that the dataset can serve as a robust foundation for both model development and future benchmarking efforts within the field of automated vehicle damage detection.
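For reproducibility, the per-class counts behind such summary tables can be tallied directly from the annotation files. The sketch below assumes YOLO-format labels (one .txt file per image, with a class index at the start of each line); the directory path and class indexing are illustrative placeholders, not the paper's actual artifacts.

```python
# Minimal sketch: tally per-class instance counts from YOLO-format label
# files ("class_id x_center y_center width height" per line). Paths are
# hypothetical; adapt to the actual dataset layout.
from collections import Counter
from pathlib import Path

def class_distribution(label_dir: str) -> Counter:
    counts = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts

if __name__ == "__main__":
    dist = class_distribution("dataset/labels/train")
    for class_id, n in sorted(dist.items()):
        print(f"class {class_id:2d}: {n} instances")
```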
Colour Space Transformation
To reduce the impact of lighting variability, pixel values were normalized by dividing by 255, bringing them into a [0, 1] range. This normalization step stabilizes training and enhances the model's ability to generalize across different lighting conditions, as illustrated in [Fig.3], which is particularly relevant for reflective automotive surfaces [6].

Image Normalization
Further normalization was performed by standardizing pixel intensity distributions using the mean and standard deviation calculated from the training set. This process helps the model focus on relevant features rather than variations caused by lighting or sensor differences, as recommended in prior vehicle damage detection research [6].
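A minimal sketch of this two-stage normalization, assuming NumPy arrays of uint8 RGB images; the per-channel mean and standard deviation shown are placeholders, since the paper does not report its training-set statistics.

```python
# Sketch of the normalization pipeline described above: scale pixel values
# to [0, 1], then standardize with per-channel training-set statistics.
# TRAIN_MEAN / TRAIN_STD are placeholder values, not the paper's figures.
import numpy as np

TRAIN_MEAN = np.array([0.485, 0.456, 0.406])  # placeholder per-channel means
TRAIN_STD = np.array([0.229, 0.224, 0.225])   # placeholder per-channel stds

def normalize(image_uint8: np.ndarray) -> np.ndarray:
    scaled = image_uint8.astype(np.float32) / 255.0  # bring into [0, 1]
    return (scaled - TRAIN_MEAN) / TRAIN_STD         # standardize per channel
```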
Our experiments demonstrated that standard normalization outperformed other techniques such as Z-score normalization for automotive damage detection. This finding aligns with [6], who observed that maintaining the positive range of pixel values preserves important visual characteristics of damage patterns while still providing the benefits of normalization.

Noise Reduction and Image Enhancement
To improve clarity and reduce the influence of noise, especially in images captured under suboptimal conditions, adaptive bilateral filtering and histogram equalization were applied. This step enhances the visibility of subtle damages, such as fine scratches or minor dents, without introducing artifacts that could mislead the model [12].

Background Standardization
Images often contain complex backgrounds that can distract the model. A combination of semantic segmentation and selective blurring was used to de-emphasize non-vehicle regions, helping the model focus on relevant damage areas. This approach has been shown to reduce false positives and improve detection rates in automotive datasets.
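The following sketch illustrates these enhancement and background-standardization steps with OpenCV, assuming a binary vehicle mask supplied by an external segmentation model; the filter parameters are illustrative defaults rather than the paper's tuned settings.

```python
# Sketch of the enhancement steps above: edge-preserving bilateral
# filtering, CLAHE on the lightness channel, and selective blurring of
# non-vehicle regions. Parameters are illustrative, not the paper's.
import cv2
import numpy as np

def enhance(image_bgr: np.ndarray, vehicle_mask: np.ndarray) -> np.ndarray:
    # Edge-preserving denoising: smooths noise, keeps dent/scratch edges.
    denoised = cv2.bilateralFilter(image_bgr, d=9, sigmaColor=75, sigmaSpace=75)

    # Adaptive histogram equalization on the lightness channel only.
    lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced = cv2.cvtColor(cv2.merge([clahe.apply(l), a, b]), cv2.COLOR_LAB2BGR)

    # De-emphasize background: blur pixels outside the vehicle mask.
    blurred = cv2.GaussianBlur(enhanced, (21, 21), 0)
    return np.where(vehicle_mask[..., None] > 0, enhanced, blurred)
```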
As illustrated in [Table.6], shear augmentation contributed to a 1.8% improvement in mAP, primarily by enhancing the model's robustness to perspective variations. This augmentation was particularly effective for improving detection performance on damages captured at oblique angles, which are common in real-world inspection scenarios [6].

Data Splitting
To ensure balanced representation of all damage classes, including rare types such as "Medium-Bodypanel-Dent" with as few as three instances, the dataset was divided into training, validation, and testing sets using stratified sampling. This approach is essential for mitigating class imbalance, which can otherwise bias model performance and reduce its ability to generalise to under-represented categories. We adopted an 85:10:5 split ratio, allocating 85% of the data for training, 10% for validation during model development, and 5% for final testing and evaluation. As shown in [Table.7], this strategy ensures that each subset contains a proportional distribution of all damage classes, supporting robust model training and unbiased performance assessment. Stratified sampling has been widely recommended in automotive damage detection literature to maintain dataset integrity and support fair evaluation of deep learning models [12]. This approach ensures that rare damage types are proportionally represented in each subset, supporting reliable evaluation and minimizing bias.
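A simplified sketch of such an 85:10:5 stratified split using scikit-learn follows. It assumes one dominant class label per image, a simplification for detection data where an image may contain several damage classes; classes with only a handful of instances may still need manual placement to guarantee representation in every subset.

```python
# Sketch of a two-stage 85:10:5 stratified split. The dominant-label
# assumption is a simplification for object-detection datasets.
from sklearn.model_selection import train_test_split

def split_dataset(image_paths, dominant_labels, seed=42):
    # First carve off the 85% training portion.
    train_x, rest_x, train_y, rest_y = train_test_split(
        image_paths, dominant_labels,
        train_size=0.85, stratify=dominant_labels, random_state=seed)
    # Split the remaining 15% into validation (10%) and test (5%).
    val_x, test_x, _, _ = train_test_split(
        rest_x, rest_y,
        train_size=2 / 3, stratify=rest_y, random_state=seed)
    return train_x, val_x, test_x
```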
This section details the implementation and performance of various YOLO (You Only Look Once) architectures for car damage identification. We evaluated multiple YOLO variants with different configurations to identify the optimal model for detecting and classifying vehicle damages across 22 damage classes.
Our experiments with YOLOv8 variants revealed promising results, with the nano (n) and small (s) versions demonstrating an excellent balance between accuracy and computational efficiency. The YOLOv8s model achieved 83% mAP@0.5 when trained for 50–100 epochs, benefiting from its larger parameter count (11.2M) while maintaining reasonable inference speed (72 FPS on an NVIDIA RTX 3080 Ti).

As shown in the architecture diagram [Fig.5], YOLOv8's feature pyramid network effectively captures multi-scale features, which is critical for detecting damages ranging from small scratches to large dents. The model's ability to process images at 640×640 resolution provides sufficient detail for accurate damage localization while maintaining efficiency [8].

Extended training of YOLOv8n (0–150 epochs) demonstrated significant improvement over the 50-epoch training, highlighting the importance of sufficient training iterations for complex damage classification tasks [Table.8], where longer training periods consistently yielded better results.

YOLOv9's architectural refinements include:
Swin Transformer backbone: Enhances feature extraction with attention mechanisms [2].
Gradient flow optimization: Improves training stability and convergence [4].
Dynamic label assignment: Better handles the variety of damage types and sizes [1].

As illustrated in [Fig.5], YOLOv9's attention mechanisms allow it to focus more effectively on damage regions while suppressing background noise, which is particularly valuable for detecting subtle damages like scratches against complex vehicle surfaces.

YOLOv11 and YOLOv12
YOLOv11 and YOLOv12 variants showed mixed results:
YOLOv12m achieved competitive accuracy (78% mAP@0.5) comparable to YOLOv8s but required significantly more parameters (25.6M vs. 11.2M).
Hybrid CNN-Transformer design: Improved multi-scale feature extraction but offered marginal performance gains [2].
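For orientation, the comparison workflow above can be reproduced in outline with the Ultralytics API, as in the sketch below; the dataset YAML name and hyperparameters are placeholders rather than the paper's exact training configuration.

```python
# Sketch: train and evaluate several YOLO variants under one loop.
# "car_damage.yaml" is a hypothetical 22-class dataset config.
from ultralytics import YOLO

for weights in ("yolov8n.pt", "yolov8s.pt", "yolov9s.pt"):
    model = YOLO(weights)                         # load pretrained checkpoint
    model.train(data="car_damage.yaml",
                epochs=100, imgsz=640, batch=16)  # illustrative settings
    metrics = model.val()                         # mAP on the validation split
    print(weights, metrics.box.map50)             # mAP@0.5 for comparison
```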
This section analyzes the performance of YOLO-based architectures for automotive damage identification, focusing on
accuracy-efficiency trade-offs and real-time deployment considerations.
As shown in [Table.8], YOLOv9s trained for 250 epochs achieved the highest mAP (87.0%), demonstrating a 47% improvement over YOLOv8n. Extended training cycles and architectural innovations like Swin Transformers contributed to this performance leap while maintaining real-time capabilities (53.2 FPS) [6].
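Throughput figures such as those quoted above can be measured with a simple timing loop; the sketch below assumes a CUDA-capable GPU and an Ultralytics checkpoint, and the reported FPS will vary with hardware and batch settings.

```python
# Sketch: estimate single-image inference FPS. Assumes a CUDA device;
# warm-up runs and synchronization are needed for fair GPU timing.
import time
import numpy as np
import torch
from ultralytics import YOLO

model = YOLO("yolov9s.pt")
frame = np.zeros((640, 640, 3), dtype=np.uint8)   # synthetic test frame

for _ in range(10):                               # warm-up runs
    model.predict(frame, imgsz=640, verbose=False)

torch.cuda.synchronize()
start = time.perf_counter()
N_RUNS = 100
for _ in range(N_RUNS):
    model.predict(frame, imgsz=640, verbose=False)
torch.cuda.synchronize()
print(f"{N_RUNS / (time.perf_counter() - start):.1f} FPS")
```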
Requirements to Deploy and Maintain
For the YOLOv9s-based car damage identification model to be effective, a CUDA-capable GPU (e.g., NVIDIA RTX 3060 or higher) is recommended for real-time inference (~53 FPS) [Table.9]. At minimum, an NVIDIA GTX 1660 (6GB VRAM) paired with 16GB RAM and a multi-core CPU can handle batch processing of 640×640 images. For edge deployment, INT8 quantisation reduces the model size to 25MB, enabling operation on devices like the NVIDIA Jetson Xavier (8GB RAM). Storage should prioritise fast SSDs (50GB+ for datasets). Software requirements include PyTorch 2.0, ONNX Runtime, and a Linux or Windows OS. Cloud alternatives (AWS EC2 P4d instances) are advised for large-scale training on HPC clusters.
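One possible realization of this edge-deployment path, consistent with the PyTorch and ONNX Runtime stack named above, is dynamic INT8 quantization of an exported ONNX model; the paper does not specify its exact quantization procedure, and the file names below are placeholders.

```python
# Sketch: export the trained model to ONNX, then apply ONNX Runtime's
# dynamic INT8 weight quantization to shrink the deployed artifact.
from ultralytics import YOLO
from onnxruntime.quantization import quantize_dynamic, QuantType

YOLO("yolov9s.pt").export(format="onnx")          # writes yolov9s.onnx
quantize_dynamic("yolov9s.onnx", "yolov9s_int8.onnx",
                 weight_type=QuantType.QUInt8)    # INT8 weights, smaller file
```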
Challenges
While the system shows promise, four key limitations persist:
Rare Damage Types: Classes with <50 training samples (e.g., "Medium-Bodypanel-Dent") show 22% lower recall than common types.
Environmental Sensitivity: Performance drops 15% in low-light conditions despite CLAHE augmentation [3].
Computational Demands: Real-time 4K processing requires ≥8GB VRAM, limiting edge device compatibility.
Differentiation Challenges: 18% misclassification rate persists between adjacent damage types (e.g., door vs. fender dents).

VIII. CONCLUSION

Our research establishes YOLOv9s as the optimal architecture for automated vehicle damage identification, achieving 87% mAP@0.5 while maintaining 53 FPS on mid-range GPUs. Three key advancements drive this success: