
A Robotic Waste Sorting Machine with Modified Conveyor Plates and Deep Learning-Based Optical Detection

1st Soumyajit Biswas, 2nd Rupankar Mondal, 3rd Dipan Maulick, 4th Anwesha Roy, 5th Soumen Mukherjee, 6th Hiranmoy Roy
Dept. of Information Technology, RCC Institute of Information Technology, Kolkata, India

Abstract—In this work, we introduce a groundbreaking robotic waste-sorting system that fuses an innovative mechanical conveyor design with state-of-the-art deep learning for real-time classification. Our conveyor features ten rows of hinged trays, each independently actuated by compact servos, enabling precise, row-level ejection of targeted waste fractions. A dual-camera setup captures RGB frames every 0.5 s, and an efficient queue-based synchronization aligns detection and actuation with millisecond accuracy, eliminating the need for encoders or complex pneumatics. We rigorously benchmarked five modern architectures (ShuffleNetV2, MobileNetV2, a custom ResNet-50, a Swin-CNN hybrid, and ViT) on the TrashNet dataset, uncovering a superior balance of speed, accuracy, and model compactness. Our final deployment employs a lightweight CNN variant achieving >96% classification accuracy with sub-50 ms latency, all within a <5 MB footprint. Designed for plug-and-play integration into existing materials-recovery facilities, this system promises to slash energy use, reduce maintenance overhead, and scale from small-town recycling centers to metro-scale operations, paving the way toward a truly circular economy.

I. INTRODUCTION

Due to India's rapid urbanization and industrialization, the volume of municipal waste is ever increasing, posing an unprecedented management challenge. Major cities like Delhi, Mumbai, Bangalore, Chennai, and Kolkata are battling the dual issues of waste quality and segregation practices, which pose serious environmental hazards and undermine resource recovery efforts and sustainability. A significant portion of waste ends up in open dumps or landfills, despite government initiatives like the Swachh Bharat Mission, which aims to improve sanitation and waste management nationwide. Unorganized manual segregation, inefficient recycling facilities, and a lack of organized waste streams are major barriers that play a vital role in environmental degradation and public health concerns.

In many Indian municipalities, waste management is largely characterized by informal means, where waste is collected, sorted, and recycled by an unorganized labor force. Although this informal sector plays a vital role in the recycling chain, the process is time-consuming, hazardous, and inefficient, especially for such a large volume of waste. The implementation of automated, accurate, and sustainable waste sorting solutions is therefore an urgent necessity rather than a luxury.

Recent advancements in automation and artificial intelligence have produced promising platforms for smart waste management systems. The integration of computer vision and deep learning has the potential to transform waste segregation by offering real-time, highly accurate sorting of complex and heterogeneous waste streams. Advanced image processing techniques can identify subtle differences in material composition and color, ensuring that recyclables are sorted more effectively. This not only reduces the dependence on human labor but also enhances the purity of sorted waste.

The present study introduces an innovative robotic waste sorting machine specifically designed to address the challenges faced by Indian waste management systems. Unlike traditional state-of-the-art optical sorting systems that rely on pneumatic actuation, our system employs a mechanical conveyor belt with rows of plates and servo-actuated trays. At the intake zone, waste is continuously fed onto the conveyor, where it is divided into multiple plate regions. A dual high-frame-rate camera system captures images at high frequency, and a deep learning model classifies the waste in real time. The system captures images every 0.5 seconds at a detection point (Point A) and aligns these detections with the actuation point (Point B) after a calculated delay of about 2 seconds, corresponding to the speed of the conveyor belt and the distance between Point A and Point B.

In our approach, five state-of-the-art models were trained, including convolutional neural network (CNN) models, vision transformers (ViT), and hybrid models (CNN + ViT). After rigorous comparison based on balanced performance in accuracy, inference speed, model compactness, and other factors, we decided to go forward with the XXX model. The deep learning model is integrated into a holistic system where it drives the control signals for the servo motors: when the detected waste on a specific plate region (among the three per row) meets the predefined criteria, the corresponding mechanical actuator is triggered to eject the waste into a separate collection bin or direct it onto another path for further processing.

By combining deep learning techniques with a novel yet simple mechanical design, our system not only increases the efficiency and reliability of waste segregation but also offers a cost-effective solution for Indian cities. This integration of cutting-edge technology is poised to transform the waste management landscape by reducing manual intervention, increasing recycling rates, and ultimately contributing to a cleaner, more sustainable urban environment in India.
II. LITERATURE REVIEW

A. Overview of Waste Sorting and its Evolution

In rapidly urbanizing countries like India, managing mass-scale municipal waste is one of the most critical environmental challenges. Traditional methods of sorting waste and recyclables have relied heavily on manual segregation, which is not only labor intensive but also inconsistent. The inefficiencies of manual sorting have spurred efforts to develop automated systems. Early mechanically autonomous solutions incorporating basic sensor technologies and image processing built the foundation of today's technologies, even though these methods often struggled with the heterogeneous nature of waste and changing environmental conditions.

Computers have increasingly played a central role in automating the sorting process by enabling machines to "see" and differentiate waste based on visual patterns. The evolution from basic feature extraction to state-of-the-art deep learning techniques, particularly Convolutional Neural Networks (CNNs), has shown dramatic improvements in accuracy and precision. However, despite the advancements in artificial intelligence, the harmonious integration of these state-of-the-art models with physical machinery is still a challenge: there is a significant research gap between theoretical studies and practical implementations of these systems.

B. Traditional Computer Vision Models

Early research in automated waste sorting relied on primitive CV algorithms. These incorporated feature extraction techniques like the Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), and the Histogram of Oriented Gradients (HOG) to classify waste under predefined and controlled conditions. For example, Arebey et al. [24] developed a system that utilized texture features (extracted through techniques like the Gray Level Co-occurrence Matrix) to monitor waste bin levels. Such techniques were conceptually straightforward and computationally lightweight but were impractical in real-world environments, where significant variations in lighting, occlusion, and background clutter can dramatically decrease performance.

Traditional techniques often relied on edge detection algorithms and explicitly tuned thresholds. Although these methods laid a foundation for automated inspection, their high sensitivity to environmental variation and inability to adapt to complex tasks limited their practicality.

C. Deep Learning-Based Methods

The introduction of deep learning has revolutionized computer vision, providing significantly more robust and accurate results for waste classification tasks. CNNs have emerged as the most popular approach due to their automatic feature extraction capabilities and high performance in image recognition, classification, and segmentation. Several studies have applied CNNs to waste sorting applications with notable success:

• Electronic Waste Classification: Sarswate et al. [9] employed a CNN-based system, using the YOLO (You Only Look Once) object detection framework, for the classification and segregation of various components of electronic waste. Their approach achieved high precision and recall in detecting e-waste items, but the process was too computationally expensive for real-time application.

• Plastic Waste Segregation: Research by Choi et al. [4] demonstrated how deep learning models can distinguish chemically similar plastics. The use of deep CNNs allowed the system to maintain high classification accuracy even when waste items exhibited subtle visual differences.

• Advanced Architectures: More recent models, such as DSYOLO-Trash by Ma et al. [19], incorporate attention mechanisms (the Convolutional Block Attention Module) and object-tracking algorithms that enhance detection performance even under mixed conditions.

Considering accuracy and precision, deep learning techniques are the clear winners. However, they require large annotated datasets for supervised learning, are computationally intensive, and demand substantial resources.

D. Hybrid Approaches

Hybrid methods combine traditional techniques with deep learning. They typically pair conventional pre-processing (noise reduction, edge detection, and thresholding) with deep learning models to enhance overall performance. For example, Cuingnet et al. [5] developed a hybrid system that integrated traditional image processing (using OpenCV) with deep learning-based classification to improve the accuracy of sorted aluminum can streams in real time. Similarly, Jadli and Hain [3] proposed a system in which conventional feature extraction was combined with transfer learning techniques to boost performance. These hybrid approaches reduce the dependency on large datasets and train the deep learning model more efficiently.

E. Application-Specific Studies

Several studies have tailored computer vision-based waste sorting systems to specific waste streams or operational contexts:
• Municipal Solid Waste (MSW): In urban scenarios the complexity of waste streams is relatively high, requiring a model that generalizes across diverse waste types. Studies such as those by Sousa et al. [14] and Lavanya et al. [18] have focused on MSW, demonstrating the potential of CNN-based systems to handle mixed waste conditions, albeit with challenges in model adaptability.

• E-Waste and Electronic Components: With the rise of consumer electronics, electronic waste has become not only a rising concern but an environmental hazard due to the presence of hazardous components such as batteries. The work by Joseph et al. [11] and subsequent studies have shown that deep learning-based approaches achieve high classification accuracy for sorting e-waste.

• Drone-Based Waste Monitoring: Malche et al. [8] explored drone-based systems incorporating TinyML for aerial waste monitoring, which broadens the application of computer vision techniques from static facilities to dynamic urban scenarios. This approach addresses challenges in large-scale monitoring of waste across expansive areas.

• Smart Cities: Incorporating computer vision in bins (smart bins), as discussed by Pan et al. [10], highlights a shift towards IoT-based waste classification and management. These systems also monitor bin levels in real time to optimize collection and minimize operational costs.

F. Challenges and Limitations

Despite significant progress, several major challenges still exist for computer vision-based waste classification systems:

• Variability in Waste Materials: Waste streams vary heavily both in material composition and presentation. As models are trained in controlled environments, they may perform sub-optimally when deployed in an industrial setting with varying illumination, occlusion, and background complexity.

• Dataset Limitations: Deep learning models like CNNs are supervised and thus require large-scale, annotated datasets to achieve high accuracy. While datasets like TrashNet and TACO are widely used, their limited scale and controlled conditions do not fully capture the diversity encountered in practical settings.

• Real-Time Processing: Deep learning models are computationally intensive, especially for high-throughput, real-time applications. Achieving the desired outcomes and inference speeds requires advanced hardware, such as high-end GPUs or accelerators suited to matrix calculations, along with efficient algorithms.

• Integration with Existing Infrastructure: Adapting and integrating these systems into pre-existing waste management frameworks or machines presents technical, logistical, and compatibility issues, particularly in developing countries like India.

III. PREPROCESSING WORKFLOW OF MIXED WASTE STREAMS

Before an item reaches the optical sorter, it undergoes several steps that separate materials into specific categories efficiently. The workflow consists of multiple sequential stages designed to progressively condition mixed waste into streams that are optimal for high-precision optical sorting. This section explains each step, including the industry-leading machinery used.

A. Manual Pre-sorting (Hazardous/Bulky Items)

The incoming waste, straight from a landfill or dumpyard, is first manually sorted to remove hazardous or bulky items that might interfere with the workflow or damage the machinery. Although most of this presorting is performed by human labor, modern facilities increasingly employ robotic sorting arms for improved safety and efficiency. One industry-leading solution is the ZenRobotics Recycler (ZRR) by ZenRobotics. These AI-powered robots use hyperspectral imaging to accurately detect and remove batteries, electronics, and other hazardous materials from the waste stream.

Fig. 1. ZenRobotics Recycler (ZRR) for hazardous and bulky item removal. https://zenrobotics.com/recycler/

B. Primary Shredding

After pre-sorting, waste is broken into smaller chunks using a primary shredder. Dual-shaft shredders are common due to their high throughput and ability to produce uniform particle sizes. The Tana E-Shredder by Tana can process over 50 tons/hour, reducing waste pieces to less than 15 cm. An alternative is Krause Manufacturing's dual-shaft shredding system, which offers similar high-volume performance.

Fig. 2. Tana 440ET Electric Shredder. https://www.tana.fi/en/shredders/tana-440et/

C. Ferrous Metal Removal

Once shredded, the stream passes under overhead magnetic separators that extract ferrous metals such as steel cans, nails, and car parts. The Eriez Suspended Electromagnets by Eriez Magnetics achieve up to 99% removal efficiency for ferrous fragments.

Fig. 3. Eriez Suspended Electromagnet for ferrous metal removal. https://www.eriez.com/electromagnets

D. Non-Ferrous Metal Removal

After ferrous extraction, eddy-current separators repel and separate non-ferrous metals such as aluminum, copper, and zinc. The Bunting Eddy Current Separator recovers up to 95% of these valuable metals.

Fig. 4. Bunting Eddy Current separator for non-ferrous metal removal. https://www.buntingusa.com/products/eddy-current-separators/

E. Trommel Screening (Glass/Fines Removal)

Trommel (rotating drum) screens isolate glass fragments and fines (small soil particles) from larger components. The Dayong Soil Glass Trommel Screener employs a 5–50 mm mesh and vibratory action to handle even sticky, wet feedstocks with ease.

Fig. 5. Dayong Trommel screen for glass and fines removal. https://dayongintl.com/product/soil-and-glass-trommel-screener/

F. Air Classification (Lightweight Organics)

Pneumatic air classifiers such as the Pellenc ST Aerolight system by Pellenc ST use controlled airflow (15–25 m/s) to separate lightweight organics (paper, food scraps) from heavier materials, recovering approximately 70–80% of the organic fraction.

Fig. 6. Pellenc ST Aerolight pneumatic classifier for lightweight organics. https://www.pellenc.com/en/solutions/aerolight/

G. Density Separation (Heavy Inerts)

To strip out stones, ceramics, and other heavy inerts, the waste is fed into a sink–float tank. EnergyCle's Sink–Float Separator exploits density differences (threshold 2 g/cm³): heavy inerts sink, while lighter plastics float for easy skimming.

Fig. 7. EnergyCle's Sink–Float Separator for density separation. https://www.energycle.com/products/sink-float-separator

H. Optical Sorting (Plastics/Organics)

Advanced optical sorters use NIR imaging and AI (or color-based sorting) to classify plastics (PET, HDPE) and residual organics. The TOMRA AUTOSORT LASER by TOMRA achieves over 95% classification accuracy. AMP Cortex employs AI-driven robotic pickers as an alternative.

Fig. 8. TOMRA AUTOSORT LASER for high-precision optical sorting. https://www.tomra.com/en/sorting/recycling/metal/autosort-laser

I. Secondary Shredding (Plastics)

Sorted plastics often undergo secondary shredding for processes like pelleting or extrusion. Single-shaft shredders, such as the Vecoplan VEZ, reduce plastics to flakes under 10 mm, ensuring consistent granularity for high-quality recycling.

Fig. 9. Vecoplan VEZ single-shaft shredder for secondary plastic processing. https://www.vecoplan.com/products/vez-series/

IV. SYSTEM DESIGN AND ARCHITECTURE

This section details the integrated mechanical, sensor, and software architecture of the robotic waste sorting machine. The mechanical design centers on a 360° conveyor composed of rows of hinged plates, paired with servo-actuated levers for material discharge. A dual-camera system combining digital and NIR imaging captures frames every 0.5 seconds, which are processed by a deep learning model. A circular queue aligns the detection at Point A with actuation at Point B after a fixed conveyor delay, ensuring precise timing for triggering one or more servo motors. Hardware integration leverages an Arduino Uno R3 for real-time actuator control, while software implements model inference, queue management, and serial communication.
A. Mechanical Design

1) Conveyor Belt Structure: The core transport mechanism consists of 10 rows of interlocking plates, each row subdivided into 3 distinct plate regions (Plate 1, Plate 2, Plate 3) on which waste items rest. These plates form a continuous loop around the machine, enabling a compact footprint and smooth 360° circulation.

Waste enters the system through a prominently colored blue intake box, where it is initially deposited and stabilized before moving onto the conveyor plates. This intake module also houses the imaging sensors for the downstream detection phase.

Fig. 10. Conveyor belt structure and intake system (1).

Fig. 11. Conveyor belt structure and intake system (2).

2) Actuation Zone (Point B): Directly beneath the conveyor at a fixed location (Point B), three red levers, each linked to its own servo motor, are aligned in three parallel columns corresponding to the Plate 1, Plate 2, and Plate 3 positions. When the detection logic identifies that a given plate in a specific row contains the target waste category, the associated servo is commanded to pivot its lever downward. This action releases the hinged plate, causing the entire plate segment to tip and deposit its contents into a green dustbin positioned below.

Fig. 12. Actuation levers at Point B for targeted waste discharge.

3) Timing & Synchronization: Because the conveyor moves continuously, there is a fixed 2-second travel time for any row to move from the detection zone (Point A) to the actuation zone (Point B). To coordinate precise actuation, the system employs a circular queue (size = number of rows) that records each detection event with a timestamp and the active plate indices (1–3). This queue ensures that after the 2-second delay, the correct servos are triggered exactly as each row arrives at Point B.
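As a quick sanity check on this 2-second figure (the belt geometry below is assumed for illustration; the paper reports only the delay itself), the hold time is simply distance over speed:

t_delay = d_AB / v_belt, e.g., d_AB = 0.6 m and v_belt = 0.3 m/s give t_delay = 0.6 / 0.3 = 2.0 s.

A change in belt speed therefore requires updating only this one software constant rather than re-timing any hardware.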

B. Sensor System

1) Cameras and Sensors:

• Digital (RGB) Camera: Captures high-resolution color images for object shape, texture, and visible-spectrum features.

• NIR Camera: Mounted adjacent to the digital camera in the blue intake box, intended for material composition analysis via near-infrared spectroscopy. (Note: NIR data is currently not leveraged due to the lack of annotated NIR datasets.)

The combined field of view is calibrated so that each captured frame spans exactly three plate regions. Software divides each frame into three vertical sections, one per plate, for independent classification.

2) Hardware Integration: Both cameras interface via USB/FireWire to a central PC running the image processing pipeline. An Arduino Uno R3 is connected over serial (COM port at 9600 baud) to receive actuation commands. Each of the three servos is wired to Arduino PWM pins (e.g., 9, 10, 11), providing precise angular control for lever pivoting and subsequent return to the ready position.
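Before integrating the full pipeline, this serial link can be smoke-tested from Python with a few lines of pyserial; this is a minimal sketch, and the port name is an assumption for a typical Windows host (use /dev/ttyACM0 or similar on Linux). The single-index command follows the newline-terminated protocol detailed later in Section IV-C.

import time
import serial   # pyserial

# Assumed port name; 9600 baud matches the Arduino link described above.
with serial.Serial("COM3", baudrate=9600, timeout=1) as link:
    time.sleep(2.0)      # the Uno auto-resets when the port opens; let the sketch boot
    link.write(b"2\n")   # cycle the middle lever once (Plate 2, wired to PWM pin 10)
    link.flush()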
Fig. 13. Dual-camera sensor system layout.

C. Software System and Queue Logic

1) Real-Time Image Capture and Processing: A Python-based main loop captures a new frame every 0.5 seconds via OpenCV. Each frame is automatically split into three regions of interest (ROIs), corresponding to Plates 1–3. These ROIs are preprocessed (resized, normalized) and passed through the selected deep learning classifier to detect whether the target waste class is present in each plate.
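A minimal sketch of this capture-and-split step is given below; the camera index and the ImageNet-style normalization constants are assumptions for illustration, as the paper does not list its exact preprocessing values.

import cv2
import numpy as np

CAPTURE_PERIOD_S = 0.5      # one frame every 0.5 s, matching the main loop above
INPUT_SIZE = (224, 224)     # model input resolution used throughout this paper

def split_into_plate_rois(frame: np.ndarray) -> list[np.ndarray]:
    # Split a frame spanning one conveyor row into three vertical ROIs (Plates 1-3).
    h, w = frame.shape[:2]
    return [frame[:, i * w // 3:(i + 1) * w // 3] for i in range(3)]

def preprocess(roi: np.ndarray) -> np.ndarray:
    # Resize and normalize one plate ROI; ImageNet statistics are an assumption here.
    rgb = cv2.cvtColor(cv2.resize(roi, INPUT_SIZE), cv2.COLOR_BGR2RGB)
    x = rgb.astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    return (x - mean) / std

cap = cv2.VideoCapture(0)   # camera index 0 is an assumption
ok, frame = cap.read()
if ok:
    rois = [preprocess(r) for r in split_into_plate_rois(frame)]
    # each element of rois now feeds the classifier to flag the target class per plate
cap.release()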
2) Queue-Based Row Tracking:

a) Queue Structure:
• Implements a circular deque with maximum length 10 (equal to the row count).
• Entry format:
{
  "row_id": <int>,
  "detection_time": <timestamp>,
  "active_plates": [<1|2|3>, ...]
}
• row_id cycles from 0 to 9, matching the physical rows on the conveyor.

b) Synchronization Logic: A background task continuously polls the queue. For each entry, it computes elapsed = now - detection_time. When elapsed ≥ 2.0 seconds, the entry's active_plates list is read and serialized into a command string (e.g., "1,3\n"). This string is sent via serial to the Arduino, which actuates the corresponding servo(s). Entries are then removed from the queue, ensuring timely and accurate actuation. The logic supports simultaneous activation of multiple servos whenever multiple plates in the same row require discharge.
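The following is a runnable sketch of this tracking loop under the parameters stated above; it is an illustrative reconstruction, with the dispatch left as a pluggable callback (print here; in a real run it would be the serial sender shown under System Integration below).

import threading
import time
from collections import deque

ROWS = 10              # circular queue length equals the conveyor's row count
TRAVEL_DELAY_S = 2.0   # fixed Point A -> Point B travel time

queue = deque(maxlen=ROWS)
lock = threading.Lock()

def record_detection(row_id: int, active_plates: list[int]) -> None:
    # Called by the vision loop every 0.5 s; mirrors the entry format above.
    with lock:
        queue.append({"row_id": row_id,
                      "detection_time": time.monotonic(),
                      "active_plates": active_plates})

def dispatch_loop(send_command) -> None:
    # Background task: release each entry once its 2 s travel delay has elapsed.
    while True:
        entry = None
        with lock:
            if queue and time.monotonic() - queue[0]["detection_time"] >= TRAVEL_DELAY_S:
                entry = queue.popleft()
        if entry and entry["active_plates"]:
            # serialize to the documented wire format, e.g. "1,3\n"
            send_command(",".join(map(str, entry["active_plates"])) + "\n")
        time.sleep(0.01)   # ~100 Hz polling, well inside the servo timing tolerance

threading.Thread(target=dispatch_loop, args=(print,), daemon=True).start()
record_detection(row_id=0, active_plates=[1, 3])
time.sleep(2.2)            # after ~2 s, "1,3" is handed to the dispatch callback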
3) Deep Learning Model Training and Comparison:

a) Detailed Overview of Selected Deep Learning Models: This section provides an in-depth discussion of five state-of-the-art architectures (ShuffleNetV2, Custom ResNet-50, MobileNetV2, Hybrid Swin-CNN, and Vision Transformer), highlighting their design principles, computational characteristics, and suitability for waste image classification.

1. ShuffleNetV2 (×1.0): Optimized for real-time edge inference by balancing memory-access cost and FLOPs via channel splitting, pointwise convolutions, and channel shuffling [44].

Fig. 14. ShuffleNetV2 architecture flow: Input (3×224×224) → 3×3 Conv (s=2) → BatchNorm + ReLU → 2×2 MaxPool (s=2) → [ShuffleNetV2 Unit] × N → 1×1 Conv → AdaptiveAvgPool → Flatten → FC → Softmax.

b) Key Equations: a depthwise filter acts on each input channel independently, and a 1×1 pointwise convolution then mixes channels:

Y_{:,:,c} = X ∗ K^{dw}_{(c)},  K^{dw} ∈ R^{k×k×C}    (1)

Z_{:,:,d} = Σ_{c=1}^{C} Y_{:,:,c} · K^{pw}_{c,d},  K^{pw} ∈ R^{1×1×C×D}    (2)

FLOPs_{sep} = HW(k²C + CD),  FLOPs_{std} = HWk²CD    (3)
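To make Eqs. (1)–(3) concrete, the following PyTorch sketch builds the depthwise-plus-pointwise factorization and evaluates the FLOPs ratio of Eq. (3); the layer sizes are illustrative, not ShuffleNetV2's actual stage dimensions.

import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    # Depthwise conv (Eq. 1) followed by a 1x1 pointwise conv (Eq. 2).
    def __init__(self, c_in: int, c_out: int, k: int = 3):
        super().__init__()
        # groups=c_in gives each filter exactly one input channel, as in Eq. (1)
        self.depthwise = nn.Conv2d(c_in, c_in, k, padding=k // 2, groups=c_in, bias=False)
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)  # channel mixing, Eq. (2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

# FLOPs comparison from Eq. (3): HW(k^2*C + C*D) vs. HW*k^2*C*D.
H = W = 28; k = 3; C = D = 116           # illustrative sizes only
flops_sep = H * W * (k * k * C + C * D)
flops_std = H * W * (k * k * C * D)
print(f"separable/standard FLOPs ratio: {flops_sep / flops_std:.3f}")  # ~0.120 = 1/D + 1/k^2

y = DepthwiseSeparableConv(C, D)(torch.randn(1, C, H, W))
print(y.shape)                           # torch.Size([1, 116, 28, 28])

The ratio printed above is the factorization's whole appeal: roughly an 8× FLOPs saving at these sizes, which is why this block underpins the low-footprint models in Table I.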
2. Custom ResNet-50: Employs residual learning via skip connections to enable very deep networks without gradient vanishing [41].

Fig. 15. Custom ResNet-50 architecture flow: Input (3×224×224) → 7×7 Conv (s=2) → BatchNorm + ReLU → 3×3 MaxPool (s=2) → [Bottleneck × 3] → [Bottleneck × 4] → [Bottleneck × 6] → [Bottleneck × 3] → GlobalAvgPool → FC 512 → ReLU → FC → Softmax.

c) Residual Mapping:

y = F(x; {W_i}) + x    (4)

3. MobileNetV2: Uses inverted residuals and linear bottlenecks to minimize computation on mobile devices [43], [31].

Fig. 16. MobileNetV2 architecture flow: Input (3×224×224) → 3×3 Conv (s=2) → BatchNorm + ReLU6 → [Inverted Residual] × 1,2,3,4,3,3,1 → 1×1 Conv → GlobalAvgPool → Dropout → FC → Softmax.

d) Block Equations: an expansion 1×1 convolution, a depthwise convolution, and a linear 1×1 projection:

x̂ = σ(BN(W_{exp} x))    (5)

x̃ = σ(BN(W_{dw} x̂))    (6)

y = BN(W_{proj} x̃)    (7)

4. Hybrid Swin-CNN: Merges hierarchical shifted-window self-attention (Swin) with convolutional blocks (ResNet-18) to capture global context and local textures [33].

Fig. 17. Hybrid Swin-CNN feature fusion: Input (3×224×224) → Swin-Tiny backbone (GlobalAvgPool, 768-d) and ResNet-18 backbone (GlobalAvgPool, 512-d) in parallel → Concatenate (1280-d) → FC 512 → GELU → Dropout → LayerNorm → FC → Softmax.

e) Fusion Equation:

Z_{fused} = [Z_{swin} ; Z_{cnn}] ∈ R^{1280}    (8)

5. Vision Transformer (ViT-Base Patch16): Applies global multi-head self-attention on image patches, leveraging large-scale pretraining to handle small-data regimes [37]. With 16×16 patches on a 224×224 input, the encoder processes 196 patch tokens plus the [CLS] token.

Fig. 18. Vision Transformer (ViT-Base Patch16) flow: Input (3×224×224) → Patchify (16×16) → Linear Embed (768-d) → [Encoder Layer] × 12 → Extract [CLS] (768-d) → FC → Softmax.

f) Core Equations:

z⁰_p = x_p E + E^{pos}_p    (9)

ẑ^ℓ = MSA(LN(z^{ℓ−1})) + z^{ℓ−1}    (10)

z^ℓ = MLP(LN(ẑ^ℓ)) + ẑ^ℓ    (11)
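A minimal sketch of the fusion head of Eq. (8) and Fig. 17 follows, with the two backbones stubbed out by random pooled features; the dropout rate is an assumption (not reported in the paper), and num_classes=6 corresponds to TrashNet's six categories.

import torch
import torch.nn as nn

class FusionHead(nn.Module):
    # Classifier head over the concatenated backbone features of Eq. (8) / Fig. 17.
    def __init__(self, num_classes: int = 6):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(768 + 512, 512),   # 1280-d fused vector -> 512, as in Fig. 17
            nn.GELU(),
            nn.Dropout(0.1),             # dropout rate assumed; not reported
            nn.LayerNorm(512),
            nn.Linear(512, num_classes), # softmax is applied inside the loss
        )

    def forward(self, z_swin: torch.Tensor, z_cnn: torch.Tensor) -> torch.Tensor:
        z_fused = torch.cat([z_swin, z_cnn], dim=1)   # Eq. (8): [Z_swin ; Z_cnn] in R^1280
        return self.fc(z_fused)

head = FusionHead()
logits = head(torch.randn(4, 768), torch.randn(4, 512))  # stand-ins for pooled backbone outputs
print(logits.shape)   # torch.Size([4, 6])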
g) Key Observations:

1) ShuffleNetV2 offers the lowest footprint (293 MFLOPs, 1.26 M params, 4.87 MB) with a solid 94.66% accuracy.
2) Custom ResNet-50 achieves 95.26% accuracy, at the cost of a much larger compute and memory budget (8.21 GFLOPs, 24.56 M params, 93.90 MB).
3) MobileNetV2 trades a slightly larger model (326 MFLOPs, 2.88 M params, 11 MB) for 94.47% accuracy.
4) Hybrid Swin-CNN delivers top-tier performance (96.44% accuracy, 0.9942 mAP) but requires 4.80 GFLOPs and a 150.13 MB footprint.
5) Vision Transformer (ViT) underperforms in accuracy (90.69%) despite the highest resource use.

TABLE I
QUANTITATIVE COMPARISON OF DEEP LEARNING MODELS FOR WASTE CLASSIFICATION

Model                    | FLOPs        | Params (M) | Size (MB) | Val Loss | Accuracy (%) | mAP
ShuffleNetV2             | 293 MFLOPs   | 1.26       | 4.87      | 0.1905   | 94.66        | 0.9802
ResNet-50 (Custom)       | 8.21 GFLOPs  | 24.56      | 93.90     | 0.1694   | 95.26        | 0.9859
MobileNetV2              | 326 MFLOPs   | 2.88       | 11.00     | 0.1863   | 94.47        | 0.9745
Hybrid Swin-CNN          | 4.80 GFLOPs  | 30.69      | 150.13    | 0.1413   | 96.44        | 0.9942
Vision Transformer (ViT) | 16.86 GFLOPs | 85.65      | 327.31    | 0.4892   | 90.69        | 0.9469

h) Deployment Considerations:

• Edge/Embedded: ShuffleNetV2 has the lowest FLOPs, params, size, and loss; ideal for battery-powered or low-latency scenarios.
• Mobile/Embedded+: MobileNetV2 is a small model (11 MB) with competitive accuracy; fits in mobile CPU/GPU environments.
• Workstation/Server: Custom ResNet-50 or Hybrid Swin-CNN; higher throughput requirements and memory capacity allow for superior accuracy.
• Research/High-Accuracy: Hybrid Swin-CNN has the best overall mAP but trades compute and memory; use where balanced performance is secondary.

Fig. 19. Radar chart comparing normalized model metrics (lower FLOPs, params, size, and loss are better; higher accuracy and mAP are better).

i) Radar Insights:

• ShuffleNetV2's Sweet Spot: Fills its radar area in the FLOPs/params/size/loss dimensions, showing minimal resource usage while maintaining high accuracy (94.66%).
• Custom ResNet-50's Trade-Off: Trades more FLOPs and memory for a modest gain in accuracy (95.26%).
• MobileNetV2's Balance: Sits between ShuffleNetV2 and ResNet-50 in resource usage, delivering near-edge accuracy (94.47%).
• Hybrid Swin-CNN's Performance Peak: Dominates accuracy and mAP at the expense of compute (4.80 GFLOPs) and size (150 MB); suited for high-throughput sorting lines.
• ViT's Overhead: Shows the highest resource use (16.86 GFLOPs, 85.65 M params, 327 MB) but lags in accuracy (90.69%).

4) System Integration:

a) Control Flow Diagram:

[Camera Capture]
↓
[Frame Split into 3 ROIs]
↓
[Deep Learning Inference]
↓
[Queue Insertion with Timestamp]
↓ (repeat every 0.5 s)
[Delay Check Task]
↓ (after 2 s)
[Send "plates" Command]
↓
[Arduino: Actuate Servos]
b) Arduino Interfacing: The Python code sends a comma-separated list of active plate indices followed by a newline over serial. For example, the command "1,3\n" triggers servo[0] and servo[2] to move to the active angle (e.g., 90°) and then return after a fixed delay, controlled by the Arduino's millis()-based timing logic.
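A hedged sketch of the sender side of this protocol is shown below (the port name is again an assumption; servo angles and return timing live entirely in the Arduino firmware):

import serial   # pyserial

def encode_plate_command(active_plates: list[int]) -> bytes:
    # Build the newline-terminated, comma-separated payload, e.g. [1, 3] -> b"1,3\n".
    if not all(p in (1, 2, 3) for p in active_plates):
        raise ValueError("plate indices must be 1, 2, or 3")
    return (",".join(str(p) for p in sorted(set(active_plates))) + "\n").encode("ascii")

def actuate(link: serial.Serial, active_plates: list[int]) -> None:
    # One write per row; the Arduino pivots the listed servos to the active angle
    # (e.g., 90 degrees) and returns them using its own millis()-based timing.
    link.write(encode_plate_command(active_plates))

# usage (port name assumed): actuate(serial.Serial("COM3", 9600), [1, 3])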

This architecture synergistically combines precise mechanical actuation, robust sensor integration, and intelligent software to deliver a fully automated, high-throughput waste sorting system.
V. DISCUSSION

A. System Strengths

1) Robust Mechanical Design:

• Servo-Actuated Trays vs. Pneumatic Actuators: Servo motors provide precise angular control with no need for air compressors or pneumatic piping, eliminating the leak and pressure-drop issues common in pneumatic systems.
• Modular Plate Conveyor: The 10 × 3 hinged-plate design allows individual plate or lever replacement without shutting down the entire line, reducing mean time to repair (MTTR) compared to continuous pneumatic ejector grids.
• Compact Footprint: By integrating the blue intake box, dual cameras, and a 360° loop on a compact skid, our system requires 30–50% less floor space than comparable pneumatic sorters.
2) Efficient Queue Management:

• Deterministic Timing with deque: Using Python's collections.deque(maxlen=10) ensures constant-time enqueue/dequeue operations, guaranteeing that each detection event triggers exactly 2 seconds later, with no extra sensors or encoders needed.
• Software Simplicity: The main loop captures images at 0.5 s intervals and pushes results into the queue; a background thread polls and dispatches serial commands for the servos, keeping the codebase under 200 lines of Python and easily auditable.

3) Improved Reliability & Maintenance:

• Fewer Wear Points: Removing compressed-air lines cuts out valves, regulators, and hoses (common leak points), while brushless servos typically run for millions of cycles with minimal upkeep.
• Built-In Diagnostics: Servo current monitoring and the Arduino's watchdog timer rapidly flag mechanical stalls or torque spikes, enabling predictive replacement before catastrophic failure.

4) Energy Savings:

• Zero Idle Power Draw: Unlike pneumatics, which consume power to maintain pressure even at rest, servos draw near-zero current when stationary, cutting idle energy use by 40–60% in intermittent-actuation profiles.
• Potential Regenerative Braking: Advanced servo drives can capture energy when the levers return, feeding it back into the DC bus rather than dissipating it as heat.
B. Limitations

1) NIR Dataset Absence:

• Material Ambiguity: Although an NIR camera is installed, the lack of a curated NIR–RGB paired dataset prevents the incorporation of spectral features that distinguish visually similar plastics (e.g., PET vs. PETG). Current models use RGB only, limiting classification accuracy for certain classes.
• Data Collection Needs: Building a labeled NIR dataset (10 K images across all classes) is critical; techniques like transfer learning from hyperspectral benchmarks (e.g., Pavia University) could help bootstrap this effort.

2) Model Generalization & Real-World Complexities:

• Domain Shift: Models trained on controlled datasets (e.g., TrashNet) often suffer 10–20% accuracy drops when applied to dirty, occluded, or deformed waste in real facilities.
• Unseen Classes & Outliers: Novel materials or composites (e.g., multi-layer packaging) may fall outside the trained categories, necessitating fallback strategies such as default diversion to residual bins.

3) Integration with Existing Waste Systems:

• Scaling for Industrial Throughput: Mechanical reinforcement: upgrading to industrial-grade servos (e.g., 100 Nm torque) and heavy-duty hinge assemblies will support belt speeds of 3–5 m/s at >100 tons/hour throughput. Control system migration: transitioning from the Arduino to a PLC (e.g., Siemens S7-1500) with EtherCAT servo drives ensures deterministic timing, built-in safety interlocks, and seamless SCADA/OPC UA integration.
• Workflow Adaptation: Plug-and-play retrofitting: the modular skid-mount design and standardized electrical/mechanical interfaces allow integration downstream of existing magnetic, eddy-current, or trommel lines with minimal civil or structural changes.
• Data & Analytics: Real-time actuation logs and classification summaries can be published to an MES via MQTT or OPC UA, enabling plant-wide KPI tracking (e.g., yield rates, material purity) and automated sorting-recipe adjustments.

C. Model Selection and Deployment

Guided by our comparative analysis of modern deep learning architectures, the following recommendations balance accuracy, throughput, and resource constraints:

• Edge/Embedded Devices: ShuffleNetV2 delivers the lowest FLOPs (293 MFLOPs), parameters (1.26 M), and model size (4.87 MB) while maintaining strong classification accuracy (94.7%) and mAP (0.980), making it ideal for battery-powered or low-latency scenarios.
• Mobile Platforms: MobileNetV2 offers a modest footprint (326 MFLOPs, 2.88 M params, 11 MB) with competitive performance (94.5% acc, 0.974 mAP), suitable for on-device GPU or CPU inference.
• Workstation/Server Environments: Custom ResNet-50 (8.21 GFLOPs, 24.56 M params, 93.9 MB) balances higher accuracy (95.3%, 0.986 mAP) and manageable resource usage, while Hybrid Swin-CNN (4.8 GFLOPs, 30.7 M params, 150 MB) achieves peak performance (96.4% acc, 0.994 mAP) when throughput and memory are ample.
• High-Accuracy Research: Hybrid Swin-CNN remains the top choice when maximum accuracy and mAP are prioritized over footprint. The Vision Transformer is not recommended due to its high computational cost (16.86 GFLOPs, 85.65 M params, 327 MB) and lower classification performance (90.7%, 0.947 mAP).
VI. CONCLUSION

This research demonstrates that marrying precise mechanical engineering with lightweight, high-performance deep learning yields a waste-sorting solution that is both practical and transformative. By replacing bulky pneumatic actuators with servo-driven trays, we achieve deterministic, low-power actuation; by leveraging an optimized CNN classifier, we deliver high accuracy at minimal computational cost. The modular, skid-mounted conveyor can be retrofitted into existing lines, while the Python-based queue logic keeps the software stack lean and transparent. Although full NIR integration remains a future milestone, the current system already outperforms many industrial sorters in energy efficiency and maintainability. Looking ahead, creating paired RGB–NIR datasets and migrating control to industrial PLCs will further elevate throughput and material recovery rates. Ultimately, this platform charts a clear path toward fully automated, scalable, and sustainable waste management, turning today's recycling challenges into tomorrow's circular-economy opportunities.

REFERENCES

[1] W. Lu, J. Chen, and F. Xue, "Using computer vision to recognize composition of construction waste mixtures: A semantic segmentation approach," Resources, Conservation & Recycling, vol. 178, p. 106022, 2022. Available: https://doi.org/10.1016/j.resconrec.2021.106022
[2] W. Lu, J. Chen, and F. Xue, "Computer vision for solid waste sorting: A critical review of academic research," Waste Management, vol. 142, pp. 29–43, 2022. Available: https://doi.org/10.1016/j.wasman.2022.02.009
[3] A. Jadli and M. Hain, "Toward a Deep Smart Waste Management System based on Pattern Recognition and Transfer learning," in 2020 3rd International Conference on Advanced Communication Technologies and Networking (CommNet), Marrakech, Morocco, 2020, pp. 1–5. Available: https://doi.org/10.1109/CommNet49926.2020.9199615
[4] J. Choi, B. Lim, and Y. Yoo, "Advancing Plastic Waste Classification and Recycling Efficiency: Integrating Image Sensors and Deep Learning Algorithms," Applied Sciences, vol. 13, no. 18, p. 10224, 2023. Available: https://doi.org/10.3390/app131810224
[5] R. Cuingnet et al., "PortiK: A computer vision based solution for real-time automatic solid waste characterization—Application to an aluminium stream," Waste Management, vol. 150, pp. 267–279, 2022. Available: https://doi.org/10.1016/j.wasman.2022.05.021
[6] M. M. Hossen et al., "GCDN-Net: Garbage classifier deep neural network for recyclable urban waste management," Waste Management, vol. 174, pp. 439–450, 2024. Available: https://doi.org/10.1016/j.wasman.2023.12.014
[7] S. Sundaralingam and N. Ramanathan, "Recyclable plastic waste segregation with deep learning based hand-eye coordination," Environmental Research Communications, vol. 6, no. 4, p. 045007, 2024. Available: https://doi.org/10.1088/2515-7620/ad3db7
[8] T. Malche et al., "Efficient solid waste inspection through drone-based aerial imagery and TinyML vision model," Environmental Technology & Innovation, vol. 22, 2023. Available: https://doi.org/10.1002/ett.4878
[9] P. K. Sarswat et al., "Real time electronic-waste classification algorithms using the computer vision based on Convolutional Neural Network (CNN): Enhanced environmental incentives," Resources, Conservation & Recycling, vol. 207, p. 107651, 2024. Available: https://doi.org/10.1016/j.resconrec.2024.107651
[10] P. Pan et al., "An Intelligent Garbage Bin Based on NB-IOT Research Mode," in 2018 IEEE International Conference of Safety Produce Informatization (IICSPI), Chongqing, China, 2018, pp. 113–117. Available: https://doi.org/10.1109/IICSPI.2018.8690408
[11] I. O. Joseph et al., "E-Waste Intelligent Robotic Technology (EIRT): A Deep Learning approach for Electronic Waste Detection, Classification, and Sorting," in 2023 14th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Delhi, India, 2023, pp. 1–6. Available: https://doi.org/10.1109/ICCCNT56998.2023.10306479
[12] R. L. Kumar et al., "Garbage Collection and Segregation using Computer Vision," in 2024 International Conference on Inventive Computation Technologies (ICICT), Lalitpur, Nepal, 2024, pp. 1023–1028. Available: https://doi.org/10.1109/ICICT60155.2024.10544861
[13] C. Nandre et al., "Robot Vision-based Waste Recycling Sorting with PLC as Centralized Controller," in 2023 15th International Conference on Computer and Automation Engineering (ICCAE), Sydney, Australia, 2023, pp. 381–384. Available: https://doi.org/10.1109/ICCAE56788.2023.10111451
[14] J. Sousa, A. Rebelo, and J. S. Cardoso, "Automation of Waste Sorting with Deep Learning," Environmental Research Communications, vol. 6, no. 4, p. 045007, 2024.
[15] P. R. Murugan et al., "Deep Learning-based Hybrid Image Classification Model for Solid Waste Management," in 2024 International Conference on Cognitive Robotics and Intelligent Systems (ICC-ROBINS), Coimbatore, India, 2024, pp. 168–172. Available: https://doi.org/10.1109/ICC-ROBINS60238.2024.10534019
[16] T. Raja Sree and S. Kanmani, "A comprehensive review of the role of soft computing techniques in municipal solid waste management," Environmental Technology Reviews, vol. 13, no. 1, pp. 168–185, 2024. Available: https://doi.org/10.1080/21622515.2023.2293679
[17] Y. Gao, J. Wang, and X. Xu, "Machine learning in construction and demolition waste management: Progress, challenges, and future directions," Automation in Construction, vol. 162, p. 105380, 2024. Available: https://doi.org/10.1016/j.autcon.2024.105380
[18] V. Lavanya et al., "CNN Based Smart Waste Segregation and Collection System," in 2024 4th International Conference on Data Engineering and Communication Systems (ICDECS), Bangalore, India, 2024, pp. 1–6. Available: https://doi.org/10.1109/ICDECS59733.2023.10503053
[19] W. Ma et al., "DSYOLO-trash: An attention mechanism-integrated and object tracking algorithm for solid waste detection," Waste Management, vol. 178, pp. 46–56, 2024. Available: https://doi.org/10.1016/j.wasman.2024.02.014
[20] Z. Wang et al., "Vision-Based On-Site Construction Waste Localization Using Unmanned Aerial Vehicle," Sensors, vol. 24, no. 9, p. 2816, 2024. Available: https://doi.org/10.3390/s24092816
[21] M. Shahin et al., "Waste reduction via image classification algorithms: beyond the human eye with an AI-based vision," Journal of Manufacturing Processes, vol. 84, pp. 3193–3211, Jul. 2023. Available: https://doi.org/10.1080/00207543.2023.2225652
[22] H. Sharma and H. Kumar, "Sustainable collection and classification of e-waste: A proposed computer vision technology-based framework," Indian Institute of Management Kashipur, Uttarakhand, India, 2024. Available: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4049693
[23] Z. Dong, J. Chen, and W. Lu, "Computer vision to recognize construction waste compositions: A novel boundary-aware transformer (BAT) model," Journal of Environmental Management, vol. 305, p. 114405, 2022. Available: https://doi.org/10.1016/j.jenvman.2021.114405
[24] M. Arebey et al., "Solid waste bin level detection using gray level co-occurrence matrix feature extraction approach," Waste Management, vol. 32, no. 4, pp. 658–665, 2012. Available: https://doi.org/10.1016/j.wasman.2012.01.009
[25] Z. Nwokediegwu et al., "AI-Driven waste management systems: A comparative review of innovations in the USA and Africa," Environmental Science and Technology Journal, vol. 5, no. 2, pp. 50–64, 2024. Available: https://doi.org/10.51594/estj.v5i2.828
[26] M. Malik et al., "Waste Classification for Sustainable Development Using Image Recognition with Deep Learning Neural Network Models," Sustainability, vol. 14, no. 12, p. 7222, 2022. Available: https://doi.org/10.3390/su14127222
[27] T. Faisal et al., "Development of Intelligent Waste Segregation System Based on Convolutional Neural Network," Higher Colleges of Technology, Ruwais, Abu Dhabi, UAE, 2024. Available: https://www.researchgate.net/publication/346085288
[28] V. Melinda et al., "Enhancing Waste-to-Energy Conversion Efficiency and Sustainability Through Advanced Artificial Intelligence Integration," International Journal of Energy and Environmental Technology, vol. 2, no. 2, pp. 45–56, 2024. Available: https://doi.org/10.33050/itee.v2i2.597
[29] T. Sutikno et al., "Artificial Intelligence Technologies in Urban Smart Waste Management," Technology Journal, vol. 15, no. 1, pp. 23–35, 2024. Available: https://journal2.uad.ac.id/index.php/tech/article/view/10941
[30] M. N. V. Prasad, Integrated Municipal Solid Waste Management for Energy Recovery and Pollution Prevention. Elsevier, 2024, pp. 135–192. ISBN 9780443220692. Available: https://doi.org/10.1016/B978-0-443-22069-2.00018-8
[31] F. Chollet, "Xception: Deep Learning with Depthwise Separable Convolutions," arXiv:1610.02357, 2017. Available: https://arxiv.org/abs/1610.02357
[32] M. Lin, Q. Chen, and S. Yan, "Network In Network," arXiv:1312.4400, 2013. Available: https://arxiv.org/abs/1312.4400
[33] Z. Liu et al., "Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows," in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2021, pp. 10012–10022. Available: https://openaccess.thecvf.com/content/ICCV2021/html/Liu_Swin_Transformer_Hierarchical_Vision_Transformer_Using_Shifted_Windows_ICCV_2021_paper.html
[34] A. Vaswani et al., "Attention Is All You Need," in Adv. in Neural Information Processing Systems (NeurIPS), 2017, pp. 5998–6008. Available: https://arxiv.org/abs/1706.03762
[35] M. Tan and Q. V. Le, "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks," in Proc. Int. Conf. Mach. Learn. (ICML), 2019, pp. 6105–6114. Available: https://arxiv.org/abs/1905.11946
[36] S. Woo, J. Park, J.-Y. Lee, and I. S. Kweon, "CBAM: Convolutional Block Attention Module," in Proc. European Conf. Comput. Vis. (ECCV), 2018, pp. 3–19. Available: https://arxiv.org/abs/1807.06521
[37] A. Dosovitskiy et al., "An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale," in Proc. Int. Conf. Learn. Represent. (ICLR), 2021. Available: https://arxiv.org/abs/2010.11929
[38] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, "How Transferable Are Features in Deep Neural Networks?" arXiv:1411.1792, 2014. Available: https://arxiv.org/abs/1411.1792
[39] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. MIT Press, 2016. Available: https://www.deeplearningbook.org/
[40] G. Thung and S. Stanford, "TrashNet Dataset," GitHub repository, 2016. Available: https://github.com/garythung/trashnet
[41] K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016, pp. 770–778. Available: https://doi.org/10.1109/CVPR.2016.90
[42] A. G. Howard et al., "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv:1704.04861, 2017. Available: https://arxiv.org/abs/1704.04861
[43] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, "MobileNetV2: Inverted Residuals and Linear Bottlenecks," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 4510–4520. Available: https://doi.org/10.1109/CVPR.2018.00474
[44] N. Ma, X. Zhang, H.-T. Zheng, and J. Sun, "ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 1166–1174. Available: https://doi.org/10.1109/CVPR.2018.00091
