An Enhanced Detection Method of PCB Defect Based On Improved YOLOv7
Article
Yujie Yang and Haiyan Kang *
School of Information Management, Beijing Information Science and Technology University, Beijing 100192, China
* Correspondence: [email protected]
Abstract: Printed circuit boards (PCBs) are a critical component of modern electronic equipment,
performing a crucial role in the electronic information industry chain. However, accurate detection of
PCB defects can be challenging. To address this problem, this paper proposes an enhanced detection
method based on an improved YOLOv7 network. First, the SwinV2_TDD module is proposed, which
adds a convolutional layer to extract the local features of the PCB. Then, the Magnification Factor
Shuffle Attention (MFSA) mechanism is introduced, which adds a convolutional layer to each branch
of the Shuffle Attention (SA) to expand its depth and enhance the adaptability of the attention mecha-
nism. The SwinV2_TDD module and MFSA mechanism are integrated into the YOLOv7 network,
replacing some ELAN modules and changing the activation function to Mish. The evaluation indexes
used are Precision (P), Recall (R), and mean Average Precision (mAP). Experimental results show that
the enhanced method achieves an AP of 98.74%, indicating a significant improvement in PCB defect
detection performance.
Keywords: deep learning; printed circuit boards; YOLOv7; target detection; swin transformer;
attention mechanism
Citation: Yang, Y.; Kang, H. An Enhanced Detection Method of PCB Defect Based on Improved YOLOv7.
Electronics 2023, 12, 2120. https://doi.org/10.3390/electronics12092120
Academic Editor: Mohamed Benbouzid
Received: 24 February 2023
Revised: 27 April 2023
Accepted: 4 May 2023
Published: 6 May 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
The printed circuit board (PCB) holds immense importance in the electronic industry
as a crucial component for the development of electronic products. PCBs are becoming
increasingly integrated and smaller [1] due to the excellent craftsmanship, precise wiring,
and rapid development of integrated circuits. However, with the reduction in size, defects
in the PCBs are also getting smaller and more challenging to detect. Therefore, it is
imperative to conduct a thorough defect detection process during PCB-related production
to improve product quality and reduce company costs.
The conventional methods of detecting defects in PCBs are classified into three
categories: manual visual inspection, electrical testing, and optical inspection [2]. Manual
visual inspection involves workers inspecting bare PCBs directly using their eyes and
other equipment. However, this method has become inadequate due to the increasing
demand for higher precision in PCB development, as it has poor detection stability and low
efficiency. On the other hand, electrical testing employs contact testing to detect defects in
bare PCBs, which requires complex testing circuits, expensive molds, and fixtures for each
batch of PCBs. This method is also limited in detecting multi-layer PCBs and poses a risk
of secondary damage. In contrast, automated optical inspection (AOI) is a non-contact
inspection method that uses machine vision technology and image processing algorithms [3].
Industrial cameras capture images of the PCBs, which are transmitted to a computer that
provides feedback on the defect detection results. AOI is more stable and accurate than the
previous methods, with a faster detection speed [4], and does not impact the PCB.
The advancement of deep learning has led to the development of contactless automatic
detection methods, which have become a popular area of research due to their strong
recognition adaptability and generalization ability. Typically, deep learning-based detection
networks can be categorized into one-stage and two-stage networks. The one-stage network
includes Single Shot Detector (SSD) [5], and You Only Look Once (YOLO) [6]. In contrast,
the two-stage network includes regions with convolutional neural networks (R-CNN) [7],
Fast R-CNN [8], and Faster R-CNN [9], which is an improved version of R-CNN. The
primary difference between these networks is that the one-stage network directly predicts
the location and category of defects in the network after feature extraction, while the
two-stage network first generates proposals that may include defects, then conducts the
detection process. Specifically, the two-stage network generates candidate boxes of different
sizes that may contain defect features, then performs target detection to predict defect
classes and locations. However, the detection speed is slow due to the generation of many
candidate frames. On the other hand, the one-stage network performs both training and
detection in a single network without the need for explicit region proposals, resulting in
faster detection speed. This paper adopts the one-stage network based on YOLOv7 [10]
and improves it to meet real-time performance requirements in the industrial field.
The Swin Transformer v2 [11] is designed to overcome three significant challenges
in large visual model training and application, namely, model instability, the resolution
gap problem, and a chronic lack of labeled data. To address these challenges, the Swin
Transformer v2 proposes three primary methods. First, it combines scaled cosine attention and
post-normalization to enhance model stability. Second, it introduces a log-spaced continuous
position bias (Log-CPB) method, which enables a model trained on low-resolution
images to be transferred to higher-resolution counterparts. Lastly, it introduces SimMIM,
a self-supervised pretraining method that reduces the need for large amounts of labeled
data. To improve the global feature extraction and stability of the model, SwinV2_CSPB
modules can replace some ELAN modules in the YOLOv7 backbone network.
The attention mechanism is a widely used method to enhance model performance.
Typically, attention weight is obtained by calculating the importance of each position in the
input sequence. Shuffle Attention (SA) [12] improves upon this method by shuffling and
reordering the input sequence, then calculating the importance of each position to obtain
the attention weight. Compared with the traditional attention mechanism, it increases
computational efficiency by using a new calculation method that reduces the amount of
computation required to calculate attention weight. Additionally, SA can enhance the
model’s generalization ability, resulting in more consistent performance on both training
and test data.
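As a rough illustration of the shuffling step, the sketch below shows a channel shuffle in plain Python. It assumes that the "shuffling and reordering" described above refers to the channel shuffle operator used in SA-style modules, where channels are split into groups and then interleaved so that information can flow between groups. The function name `channel_shuffle` and the toy channel labels are hypothetical and are not taken from the paper.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups: view as (groups, n), transpose, flatten."""
    if len(channels) % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = len(channels) // groups
    # Row g holds the channels that belong to group g.
    grouped = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Reading column by column takes one channel from every group in turn.
    return [grouped[g][k] for k in range(per_group) for g in range(groups)]


# Two groups of three channels each (labels are illustrative only).
shuffled = channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2)
print(shuffled)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'b2']
```

In a real network the same reordering would be applied to the channel axis of a feature tensor; the list version here only shows the index pattern.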
The main contributions of this paper are as follows:
(1) The Swin Transformer v2 has been further enhanced with the SwinV2_TDD (Tiny
Defect Detection) structure, which involves adding a convolutional layer and an upsam-
pling layer at the beginning of each stage in the Swin Transformer v2. This is to extract local
features of PCBs and prevent excessive compression of feature maps, thereby improving
the accuracy of detecting small defects.
(2) The Magnification Factor Shuffle Attention (MFSA) mechanism is introduced as a
solution to the issue of gradient vanishing in the attention calculation of SA, which is based
on a simple, fully connected layer. MFSA proposes adding a 1 × 1 convolutional layer to
expand the network’s layers and introducing a scaling factor to adjust the model’s percep-
tion of data dynamically. This improvement enhances the model’s ability to effectively
capture long-range dependencies and improves its generalization ability.
(3) The SwinV2_TDD structure and the MFSA mechanism are integrated into the
backbone network of YOLOv7 to enhance its performance in detection on PCBs. The
SwinV2_TDD structure is used to replace some of the ELAN modules. The activation
function is changed to Mish, which improves the model’s nonlinear expression capability.
The rest of this paper is organized as follows: Section 2 of the paper reviews related
works in PCB defect detection. Section 3 presents the three main techniques used in the
enhanced method and provides project formulations. In Section 4, the performance of the
proposed method is evaluated through ablation experiments. Finally, Section 5 concludes
the paper.
2. Related Work
The conventional approach for detecting visual anomalies in artificial systems has
drawbacks, such as high cost, low efficiency, and errors in detection. As an alternative, the
electrical properties of components can be leveraged for detecting defects in printed circuit
boards (PCBs) through a semi-automatic, manual detection method that includes online
and functional testing [13]. Researchers have explored various techniques to enhance
this method, such as compressing images using wavelet transform to reduce memory
and computation requirements [14], using traditional machine learning algorithms for
defect detection [15], and designing low-complexity neural network and machine vision
schemes to improve defect detection [16]. Other approaches include using Fourier image
reconstruction to identify small defects [17] and ultrasonic laser thermal imaging for real-
time defect detection [18]. Although these methods can reduce costs compared to manual
detection, their limited application is attributed to factors, such as the non-reusability of
the test process, the high cost of equipment, and complex writing functions, among others.
Machine vision detection methods have emerged as a viable solution to overcome
the shortcomings of traditional manual detection methods and are increasingly being
applied in modern industries [19]. There are three primary categories of PCB defect
detection methods based on machine vision: reference, non-reference, and hybrid methods.
The reference method [20] typically involves image segmentation techniques to detect
defects. For example, Li et al. compared PCB images with and without defects to identify
defects [21]. Non-reference methods [22] mainly rely on machine learning algorithms for
defect detection. For instance, Malge et al. employed an image segmentation algorithm
to detect PCB defects [23]. The hybrid method [24] combines reference and non-reference
methods to achieve more accurate defect detection. For example, Ray et al. developed
a hybrid detection method by comparing PCB images and using image segmentation
techniques [25]. Image segmentation techniques include threshold segmentation, edge
segmentation, and region segmentation methods. For example, Ardhy et al. [26] used
the adaptive Gaussian threshold segmentation to achieve rapid detection with minimal
parameters, but the detection efficacy varied significantly in different areas with light strips.
Baygin et al. [27] used Hough transform for edge segmentation and combined it with
the Canny operator to enhance detection efficiency. Ma et al. [28] improved the region
growth algorithm for region segmentation to achieve better detection outcomes. However,
these methods require manual tuning of model parameters, which may lead to suboptimal
accuracy and efficiency.
Recent studies have demonstrated that the accuracy of automated optical inspection
(AOI) is higher compared to other methods. However, due to the system’s high sensitivity,
it has very strict parameter-setting rules and may miss some cases, necessitating manual
screening after machine screening is complete [29]. Meanwhile, deep learning technology
has been rapidly advancing. Target defect detection methods based on deep learning have
shown to be highly accurate, fast, and do not require manual screening. Thus, they are
more cost-effective and efficient. Moreover, the parameter-setting rules are not as strict as
those in the AOI system. As a result, deep learning-based methods are being increasingly
studied and applied in various industries.
Due to advancements in computing technology, complex operations have become
more affordable, resulting in the rapid development of neural networks, including a large
number of deep neural networks. In the field of PCB defect detection, many scholars have
applied deep learning techniques. DenseNet [30] achieved better performance with fewer
parameters and computing costs by densely connecting all front and back layers to enable
feature reuse. Huang et al. [31] improved detection accuracy and efficiency by designing a
convolutional neural network that connects each layer in a feedforward manner. Compared
to conventional machine vision methods, deep learning algorithms have stronger nonlinear
abilities, higher robustness, and are applicable to more complex scenarios. He [32] proposed
an improvement measure that helped achieve a 96.91% accuracy rate. Geng et al. [33]
improved the detection accuracy to 96.65% by using focal loss and ResNet50 as the backbone
network. Ding et al. [34] designed TDD-net, a detection network specifically aimed at tiny
PCB defects, which adopted a multi-scale fusion strategy and applied online hard example
mining to enhance the certainty of ROI proposals, resulting in a detection accuracy of
98.90%. Sun et al. [35] proposed the Inception-ResNet-v2 model, which improved the PCB
detection accuracy by adding an SE module to part of the structure. Hu et al. [36] presented
UF-Net, which retained more defect target information by using the Skip Connect method
and achieved a detection accuracy of 98.6%. Li et al. [37] improved the mAP value to
98.71% by replacing the convolution layer in the trunk with the residual structure unit CSP
based on the YOLOv4 algorithm. Wang et al. [38] proposed a lightweight model that used
the ShuffleNetV2 structure in the YOLOv5 backbone and achieved an accuracy of 95%.
YOLOv7, as a classic representative of the target detection algorithm, has surpassed the
previous YOLO series in detection speed and accuracy.
This paper proposes an improved PCB defect detection method based on the study
of the algorithms discussed above. The proposed method is based on the YOLOv7 al-
gorithm and achieves higher accuracy. The specific improvements include applying the
SwinV2_TDD structure in the backbone network to enlarge resolution, improve model
stability, and extract local features of PCB images better. The proposed MFSA mecha-
nism effectively combines spatial attention and channel attention to enhance target feature
information and dynamically adjust the model’s perception of data. Additionally, the
activation function is changed to Mish to improve training stability and final accuracy. The
experiment shows that the proposed enhanced detection method performs better in PCB
defect detection.
Figure 1. YOLOv7 Network structure.
Head: The backbone network persists in producing three-layer feature maps with
varying sizes. The RepVGG block and Conv are followed by the prediction of three image
detection tasks: classification, background classification, and frame. Auxiliary head training
and positive and negative sample matching strategies are employed to enhance the overall
performance of the model.
3.2. Improved Swin Transformer v2
3.2.1. Swin Transformer v2
Deep learning networks often encounter challenges during training and application,
such as (1) visual models being prone to large-scale instability; (2) high-resolution images
or windows being required for many downstream visual tasks; and (3) high graphics
processing unit (GPU) memory consumption when dealing with large images and high
resolutions. To tackle these issues, Liu et al. proposed Swin Transformer v2 [11],
which includes (1) post-normalization technology and scaling cosine attention to enhance
the stability of large visual tasks; and (2) a log-spaced continuous location deviation method
to enable the model trained on coarse images to be applied to higher-resolution counterparts.
Furthermore, zero redundancy optimizers, activation checkpoints, and sequential
self-attention calculations can significantly reduce GPU memory consumption. A Swin
Transformer model trained with these methods can be applied to large visual tasks,
including those involving high-resolution images, while mitigating model instability and
GPU memory consumption. The structure of Swin Transformer v2 is shown in Figure 2.
[Figure 2 diagram: Images → Patch Partition → Linear Embedding → Swin Transformer Block ×2 → Patch Merging → Swin Transformer Block ×2 → Patch Merging → Swin Transformer Block ×6 → Patch Merging → Swin Transformer Block ×2]
Figure 2. Swin Transformer v2 structure.
The structure of the Swin Transformer Block is shown in Figure 3.
Figure 3. Swin Transformer Block structure.
The reason for the instability in the training process is the difference in amplitude of
the interlayer activations, caused by adding the output of each residual unit directly back
to the main branch. To address this issue, the Swin Transformer V2 relocates the LN layer.
Specifically, for the maximum model training (Swin V2-H and Swin V2-G), an additional
LN layer is added to every six Transformer modules to ensure training stability. While the
attention of pixel pairs is typically computed by taking the dot product of the query and key
vectors, this method often leads to several pixel pairs controlling the attention map for a
number of blocks and heads. To mitigate this problem, a scaling cosine attention method is
proposed that calculates the attention of pixel i and pixel j by scaling cosine.
$$\mathrm{Sim}(q_i, K_j) = \cos(q_i, K_j)/\pi + B_{ij} \qquad (1)$$
In this context, the variable π (written τ in the Swin Transformer v2 paper [11]) is a
parameter that can be learned and is not shared across layers or sets of layers. Typically,
it has a value greater than 0.01. The term B_ij represents the relative position deviation
between pixel i and pixel j. The cosine function, due to its natural normalization, results in
milder attention values. Rather than directly optimizing the deviation parameters, the
continuous relative position deviation method employs a small element network in the
relative coordinates:
$$B(\Delta x, \Delta y) = g(\Delta x, \Delta y) \qquad (2)$$
The symbol g in this equation represents a small meta-network consisting of two layers
of multi-layer perceptron (MLP) and a ReLU activation function. (Δx, Δy) corresponds to
the scaled coordinate of linear space, and the new deviation is learned from the original
deviation. If the training is parameterized directly and the pre-trained bias parameters
are not used, then the performance of the window may suffer when it is
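Equation (1) can be sketched in plain Python as follows. This is a minimal illustration and not the authors' implementation: the names `cosine`, `scaled_cosine_scores`, and `softmax` are hypothetical, the learnable scalar of Equation (1) is passed in as `scale`, the bias matrix `bias` stands in for B_ij, and all vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def scaled_cosine_scores(queries, keys, bias, scale):
    """Sim(q_i, K_j) = cos(q_i, K_j) / scale + B_ij, as in Equation (1)."""
    scale = max(scale, 0.01)  # keep the scalar above 0.01, as stated in the text
    return [
        [cosine(q, k) / scale + bias[i][j] for j, k in enumerate(keys)]
        for i, q in enumerate(queries)
    ]


def softmax(row):
    """Turn one row of scores into attention weights that sum to 1."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


# Toy inputs (assumed values): two queries, two keys, zero position bias.
queries = [[1.0, 0.0], [0.0, 2.0]]
keys = [[3.0, 0.0], [0.0, 1.0]]
zero_bias = [[0.0, 0.0], [0.0, 0.0]]
scores = scaled_cosine_scores(queries, keys, zero_bias, scale=0.1)
weights = [softmax(row) for row in scores]
```

Because the cosine depends only on direction, rescaling a query or key vector leaves its score unchanged, which is the normalization property the text refers to.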
3.3.2. MFSA
In YOLOv7, when the input image size is (640, 640), the model produces three
prediction layers of varying sizes, specifically (20, 20), (40, 40), and (80, 80). However, as PCB
defects are typically small in size and there are relatively fewer large targets, it is important
to focus on improving the recognition accuracy of smaller objects [39]. To achieve this,
the paper proposes using the SA mechanism module, which applies weights to different
scale prediction layers to highlight the proportion of small-scale targets. This attention
mechanism has improved the accuracy of recognizing low-contrast objects in the model.
The SA module is effective at capturing features of different scales and orientations in
images by combining interactions between channels and spatial dimensions. However, the
attention calculation in SA is limited by the use of a simple, fully connected layer, which may
lead to problems such as gradient vanishing. To address this issue, a 1 × 1 convolution layer
is added to increase the depth and enhance the adaptive nature of the attention mechanism,
allowing the network to better adapt to different tasks and datasets. Additionally, a
magnification factor is introduced to dynamically adjust the model's perception of data,
which improves its nonlinear fitting ability and overall accuracy. The optimal value of the
magnification factor can be determined through experimentation.
The MFSA mechanism depicted in Figure 6 is achieved by adding a shortcut connection
and applying a max pooling layer. The input data is processed, and each channel is
multiplied to produce a feature map that completes the original feature relocation of the
channel dimension data, resulting in enhanced model performance.
The original computation of SA is expressed as follows:
$$f_c = \frac{1}{N} \sum_{i=1}^{N} \alpha_i x_i W_{c,i} \qquad (5)$$
where x_i represents the i-th feature map in the input tensor, N represents the number of
channels in the input tensor, α_i represents the attention weights, and W_{c,i} represents the
weights after channel shuffling. After adding convolutional layers and a scaling factor s,
the above computation can be expressed as the following formula:
$$f_c = \frac{1}{N} \sum_{i=1}^{N} \alpha_i x_i \left(W_{c,i} + W_{c,j}\right) s \qquad (6)$$
where W_{c,j} represents the weights after the convolutional operation, and s represents the
scaling factor.
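The difference between Equations (5) and (6) can be checked numerically with a small sketch. It makes two simplifying assumptions that are not stated in the paper: each feature map x_i is reduced to a single scalar, and the convolutional weight W_{c,j} is aligned one-to-one with channel i. The function names `sa_response` and `mfsa_response` and all numeric values are hypothetical.

```python
def sa_response(alpha, x, w_shuffle):
    """Equation (5): f_c = (1/N) * sum_i alpha_i * x_i * W_{c,i}."""
    n = len(x)
    return sum(a * xi * w for a, xi, w in zip(alpha, x, w_shuffle)) / n


def mfsa_response(alpha, x, w_shuffle, w_conv, s):
    """Equation (6): f_c = (1/N) * sum_i alpha_i * x_i * (W_{c,i} + W_{c,j}) * s."""
    n = len(x)
    return sum(
        a * xi * (w + wc) * s
        for a, xi, w, wc in zip(alpha, x, w_shuffle, w_conv)
    ) / n


# Toy values (assumed): two channels with equal attention weights.
alpha = [0.5, 0.5]
x = [2.0, 4.0]
w_shuffle = [1.0, 1.0]

baseline = sa_response(alpha, x, w_shuffle)
# With zero convolutional weights and s = 1, Equation (6) reduces to Equation (5).
unchanged = mfsa_response(alpha, x, w_shuffle, [0.0, 0.0], s=1.0)
# Non-zero convolutional weights and s = 2 magnify the response.
magnified = mfsa_response(alpha, x, w_shuffle, [1.0, 1.0], s=2.0)
print(baseline, unchanged, magnified)  # 1.5 1.5 6.0
```

The sketch only shows how the extra weight term and the scaling factor enter the sum; in the network these quantities would be tensors learned during training.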
$$f(x) = x \times \sigma(\beta x) \qquad (7)$$
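For reference, the form in Equation (7) and the Mish activation adopted in this paper can be compared with a short sketch. It assumes that σ in Equation (7) is the logistic sigmoid, and it uses the commonly cited definition of Mish, x · tanh(ln(1 + e^x)), which is not written out in the surrounding text. The function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_linear(x, beta=1.0):
    """Equation (7): f(x) = x * sigma(beta * x), with sigma assumed to be the sigmoid."""
    return x * sigmoid(beta * x)


def softplus(x):
    """ln(1 + e^x), computed in a numerically stable way."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Commonly cited Mish definition: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


for value in (-2.0, 0.0, 1.0, 3.0):
    print(value, round(gated_linear(value), 4), round(mish(value), 4))
```

Both functions are smooth, pass through the origin, and approach the identity for large positive inputs, while allowing small negative outputs for negative inputs.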
Figure 7. Mish function graph.
Figure 8. Enhanced backbone structure comparison: (left) original backbone and (right) enhanced
backbone.
4. Results
4.1. Experimental Conditions
This paper's experimental environment is based on the Ubuntu 20.04 LTS operating
system. The CPU used is AMD Ryzen 7 5800H, and the GPU used is NVIDIA GeForce RTX
3060. The CUDA 11.7 acceleration library is used, and the PyTorch framework is used for
implementation.
4.2. Dataset
In this paper, the Intelligent Robot Open Laboratory of Peking University's open-
source dataset [34] is utilized, which includes six types of common defects: missing hole,
mouse bite, open circuit, short circuit, spur, and spurious copper, as shown in Figure 9. The
dataset comprises a total of 693 images, each containing 3 to 5 defects. The image size is
600 × 600.
The limited size of the dataset used in this study can affect the detection of PCB board
defects. To address this, data augmentation techniques were employed to improve the
generalization ability of the network during training. Data augmentation is a technique that
involves transforming original images through operations, such as rotations, cropping, and
scaling, to generate more training data [44]. During model testing, an unaugmented test
dataset was used to evaluate the model's performance and generalization ability. Using the
same augmented images for testing as for training can lead to overly optimistic evaluations
of the model's performance because the training and test sets contain different versions
of the same original images. To address this issue, the dataset of 693 original images was
randomly divided into training and testing sets at an 8:2 ratio. Data augmentation was
only applied to the training set, and all images were resized to a uniform size of 640 × 640.
The augmented training set contained 9920 images, while the test set contained 139 images,
which were original images that had not undergone any augmentation.
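The split-then-augment order described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: `split_dataset` and `augment` are hypothetical helpers, images are represented only by integer IDs, the seed is arbitrary, and the number of augmented copies per image is a placeholder rather than the value used in the paper.

```python
import random


def split_dataset(image_ids, test_fraction=0.2, seed=0):
    """Randomly split original image IDs into (train, test) at roughly 8:2."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


def augment(image_id, copies):
    """Placeholder for rotation, cropping, and scaling: returns variant labels only."""
    return [f"{image_id}_aug{k}" for k in range(copies)]


# Split the 693 original images first, so no test image has a variant in training.
train_ids, test_ids = split_dataset(range(693))

# Augmentation is applied to the training split only; the test split stays original.
augmented_train = [variant for i in train_ids for variant in augment(i, copies=2)]

print(len(train_ids), len(test_ids))  # 554 139
```

Splitting before augmenting is what keeps differently transformed versions of the same original image from appearing on both sides of the evaluation.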
4.3. Evaluation Indicators
Precision (P), Recall (R), False Positive Rate (FPR), and mean Average Precision (mAP) were used
as evaluation indicators. P is the ratio of the number of correctly detected positive samples to
the number of all samples detected as positive:

$$P = \frac{TP}{TP + FP} \tag{10}$$

R is the ratio of the number of correctly detected positive samples to the number of all
actual positive samples:

$$R = \frac{TP}{TP + FN} \tag{11}$$

FPR is the ratio of the number of negative samples detected as positive to the number of all
actual negative samples:

$$FPR = \frac{FP}{FP + TN} \tag{12}$$

where TP is the number of samples predicted as positive that are actually positive; FP is the
number of samples predicted as positive that are actually negative; TN is the number of samples
predicted as negative that are actually negative; and FN is the number of samples predicted as
negative that are actually positive.
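As a small worked example of Equations (10)–(12), the sketch below computes P, R, and FPR from the four counts. The function name `detection_rates` and the counts passed to it are illustrative assumptions, not values from the experiments.

```python
def detection_rates(tp, fp, tn, fn):
    """Return (P, R, FPR) from confusion counts; 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, recall, false_positive_rate

# Made-up counts, for illustration only.
p, r, fpr = detection_rates(tp=90, fp=10, tn=80, fn=5)
```

With these counts, P = 90/100, R = 90/95, and FPR = 10/90, which shows how a detector can have high recall while still producing a noticeable share of false alarms.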
The value of P or R alone cannot objectively reflect the quality of the detection results.
Therefore, the two evaluation indexes must be combined to measure the performance of the
algorithm. Plotting the points with different P and R values yields a P-R curve. AP is the area
under this curve, obtained by integrating P over R:

$$AP = \int_{0}^{1} P(R)\,dR \tag{13}$$
The mAP is the sum of the AP values of all classes divided by the number of classes n:

$$mAP = \frac{\sum_{i=1}^{n} AP(i)}{n} \tag{14}$$

where AP(i) is the AP of the i-th class.
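Equations (13) and (14) can be approximated numerically as sketched below. The trapezoidal rule is an assumption made for illustration; common evaluation toolkits use interpolated precision instead, and the paper does not state which variant was used. The function names and sample points are hypothetical.

```python
def average_precision(recalls, precisions):
    """Approximate AP = integral of P(R) dR with the trapezoidal rule."""
    points = sorted(zip(recalls, precisions))  # order the curve by recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area

def mean_average_precision(ap_per_class):
    """mAP: sum of per-class AP values divided by the number of classes."""
    return sum(ap_per_class) / len(ap_per_class)

# Illustrative P-R points for one class (not experimental data).
ap_example = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
map_example = mean_average_precision([ap_example, 0.625])
```

For the sample curve, the two trapezoids contribute 0.5 and 0.375, so the AP is 0.875; averaging it with a second class at 0.625 gives an mAP of 0.75.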
Table 1 shows the effectiveness of the SwinV2_TDD module proposed in this paper.
(1) Replacing some ELAN modules in the original YOLOv7 network with the Swin Transformer
v2 structure improves the P value by 2.56%, the R value by 1.66%, and the mAP value by 2.06%.
This suggests that incorporating the Swin Transformer v2 structure in YOLOv7 can enhance the
accuracy of detecting PCB defects.
(2) Replacing some ELAN modules with the SwinV2_TDD structure yields a larger improvement
over the original YOLOv7 network: the P value increases by 3.74%, the R value by 1.90%, and
the mAP value by 2.46%. Compared to adding the Swin Transformer v2 structure to YOLOv7,
SwinV2_TDD therefore improves the P value by a further 1.18%, the R value by 0.24%, and the
mAP value by 0.40%. These findings verify that SwinV2_TDD achieves higher detection accuracy
than Swin Transformer v2 in YOLOv7.
Based on the findings in Table 2, the mAP of the model improves and the false alarm rate
decreases as the scaling factor increases up to 3. Specifically, the MFSA-YOLOv7 model
achieved its highest accuracy at a scaling factor of 3, with a maximum accuracy of 97.16% and
a minimum false alarm rate of 4.21%. When the scaling factor exceeds 3, the model's accuracy
starts to decrease and the false alarm rate starts to increase. Thus, the experiment suggests
that the optimal value for the scaling factor is 3; a likely explanation is that a larger value
causes the attention mechanism to learn too much irrelevant information, leading to a decrease
in the model's performance.
Table 3 shows the effectiveness of the proposed MFSA mechanism in this paper. The
following observations can be made.
(1) When compared to the original YOLOv7 network, introducing the SA mechanism
in the YOLOv7 network resulted in an increased P value by 2.16%, R value by 1.00%, and
mAP value by 1.46% in detecting PCB defects. This suggests that incorporating the SA
mechanism in the YOLOv7 network can improve the accuracy of PCB defect detection.
(2) Compared to the original YOLOv7 network, the YOLOv7 network with the MFSA mechanism
improved the P value by 3.41%, the R value by 1.63%, and the mAP value by 2.08%. Additionally,
compared to the YOLOv7 network with the SA mechanism, the MFSA mechanism improved the
P value by 1.25%, the R value by 0.63%, and the mAP value by 0.62%. These results demonstrate
that incorporating the MFSA mechanism in the YOLOv7 network can achieve higher detection
accuracy than incorporating the SA mechanism, thereby validating the effectiveness of the
MFSA mechanism.
the data, ultimately leading to improved model performance. The experiment involved
substituting Sigmoid, ReLU, SiLU, and Mish activation functions for those in the original
YOLOv7 network structure while keeping the network structure unchanged. P, R, and
mAP were employed as performance evaluation criteria, and the findings are tabulated
in Table 4.
Based on the experiment's outcomes, the Mish activation function exhibited the best
performance, achieving P, R, and mAP scores of 87.93%, 98.34%, and 96.17%, respectively.
Compared to the SiLU activation function used in the original YOLOv7 network, Mish improved
P, R, and mAP by 1.46%, 1.13%, and 1.09%, respectively. Furthermore, compared to the
Sigmoid and ReLU activation functions, the Mish activation function showed significant
improvements in P, R, and mAP. These findings suggest that the Mish activation function
enhances the model's nonlinear fitting ability and has a stronger nonlinear expression
ability when detecting smaller objects.
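For reference, the four activation functions compared in this experiment can be written in scalar form as below. This is a plain-Python sketch of the standard definitions (Mish as x·tanh(softplus(x)) [41], SiLU as x·sigmoid(x) [40]); it is not the network code used in the experiments, and the helper `softplus` is introduced here only for illustration.

```python
import math

def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    return max(0.0, x)

def silu(x):
    """SiLU: x * sigmoid(x)."""
    return x * sigmoid(x)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))
```

Unlike ReLU, both SiLU and Mish are smooth and pass small negative values through (for example, `mish(-1.0)` is about -0.30 while `relu(-1.0)` is 0), which is the property usually cited when motivating them.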
(1) The experimental results show that the enhanced YOLOv7 network model exhibited
the highest accuracy in detecting PCB defects, with P, R, and mAP scores of 94.53%,
99.49%, and 98.74%, respectively. Compared to the original YOLOv7 network, there was
a significant improvement of 7.32%, 1.68%, and 3.66% in P, R, and mAP, respectively.
Additionally, the improved model also demonstrated notable performance enhancements
in comparison to various other popular object detection networks.
(2) Furthermore, the improved YOLOv7 network model achieved the highest mAP0.5:0.95,
reaching 53.52%, which is a 2.27% increase compared to the original YOLOv7 network and
is also higher than the values achieved by other mainstream object detection networks.
These results suggest that the enhanced method, based on the improved YOLOv7 network,
can maintain high P and R values across different IoU thresholds in PCB defect detection.
Therefore, the findings suggest that the accuracy of PCB defect detection can be effectively
improved using the enhanced method.
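The mAP0.5:0.95 figure depends on the intersection over union (IoU) between predicted and ground-truth boxes. The sketch below assumes the common COCO-style convention, in which mAP is averaged over IoU thresholds from 0.50 to 0.95 in steps of 0.05; the text does not spell out the convention, so treat this as an assumption. Boxes are taken as (x1, y1, x2, y2) tuples, and all function names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

def iou_thresholds():
    """Assumed thresholds: 0.50, 0.55, ..., 0.95."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]

def map_over_thresholds(map_at_threshold):
    """Average per-threshold mAP values, given a dict keyed by IoU threshold."""
    thresholds = iou_thresholds()
    return sum(map_at_threshold[t] for t in thresholds) / len(thresholds)
```

A detection counts as a true positive at a given threshold only if its IoU with a ground-truth box reaches that threshold, so the averaged score penalizes loosely localized boxes far more than mAP at a single threshold of 0.5 does.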
4.4.6. Display of Detection Effect
In the following examples, the enhanced method was able to detect all six types of errors with
high accuracy. Specifically, the detection accuracy for missing_hole, mouse_bite, and
spurious_copper was 1.00, while the detection accuracy for open_circuit, short_circuit, and
spur was 0.99. Figure 10 shows the specific detection effect pictures.
5. Conclusions
The paper presents an improved method for detecting defects in printed circuit boards (PCBs)
by enhancing the YOLOv7 network with an improved Swin Transformer V2 structure. The proposed
method introduces the MFSA mechanism, which includes a convolutional layer and a scaling factor
to enhance the attention mechanism's adaptability and perception ability. Moreover, the
activation function is changed to Mish to increase accuracy and generalization ability. The
experiments are conducted on public datasets and a dataset of painted and wired rigid PCBs.
Note that the proposed defect detection method is trained and tested only on a dataset of
painted and wired rigid PCBs. The results show that the proposed method achieves an mAP that
is 3.66% higher than that of the original YOLOv7 network, demonstrating its effectiveness for
PCB defect detection. However, since detecting small PCB defects is challenging, the network
will be further optimized in the future to improve detection accuracy.
Author Contributions: Conceptualization, Y.Y. and H.K.; methodology, Y.Y.; software, Y.Y.;
validation, Y.Y. and H.K.; formal analysis, Y.Y.; investigation, Y.Y.; resources, Y.Y.; data curation, Y.Y.;
writing—original draft preparation, Y.Y.; writing—review and editing, H.K.; visualization, Y.Y.;
supervision, H.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Humanities and Social Sciences research project of the
Ministry of Education (grant number 20YJAZH046) and the Scientific Research Project of the Beijing
Educational Committee (grant number KM202011232022).
Data Availability Statement: The data that support the findings of this study are available from the
corresponding author upon reasonable request.
Acknowledgments: This work was supported by the Humanities and Social Sciences Research Project
of the Ministry of Education, the Scientific Research Project of the Beijing Educational Committee, and
the Department of Information Security of Beijing Information Science and Technology University.
Conflicts of Interest: The authors declare that they have no competing interest.
References
1. Zheng, L.J.; Zhang, X.; Wang, C.Y.; Wang, L.F.; Li, S.; Song, Y.X.; Zhang, L.Q. Experimental study of micro-holes position accuracy
on drilling flexible printed circuit board. In Proceedings of the 11th Global Conference on Sustainable Manufacturing, Berlin,
Germany, 23–25 September 2013.
2. Deng, L. Research on PCB Surface Assembly Defect Detection Method Based on Machine Vision. Master’s Thesis, Wuhan
University of Technology, Wuhan, China, 2019.
3. Zhu, Y.; Ling, Z.G.; Zhang, Y.Q. Research progress and prospect of machine vision technology. J. Graph. 2020, 41, 871–890.
4. Khalid, N.K.; Ibrahim, Z.; Abidin, M.S.Z. An Algorithm to Group Defects on Printed Circuit Board for Automated Visual
Inspection. Int. J. Simul. Syst. Sci. Technol. 2008, 9, 1–10.
5. Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings
of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Part I 14;
Springer International Publishing: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
6. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the
IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
7. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014;
pp. 580–587.
8. Girshick, R. Fast r-cnn. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 13–16 December
2015; pp. 1440–1448.
9. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. Adv. Neural
Inf. Process. Syst. 2015, 28. [CrossRef]
10. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object
detectors. arXiv 2022, arXiv:2207.02696.
11. Liu, Z.; Hu, H.; Lin, Y.; Yao, Z.; Xie, Z.; Wei, Y.; Ning, J.; Cao, Y.; Zhang, Z.; Dong, L.; et al. Swin transformer v2: Scaling up
capacity and resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans,
LA, USA, 18–24 June 2022; pp. 12009–12019.
12. Zhang, Q.L.; Yang, Y.B. Sa-net: Shuffle attention for deep convolutional neural networks. In Proceedings of the ICASSP 2021-
2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021;
pp. 2235–2239.
13. Liu, Z.; Qu, B. Machine vision based online detection of PCB defect. Microprocess. Microsyst. 2021, 82, 103807. [CrossRef]
14. Kim, J.; Ko, J.; Choi, H.; Kim, H. Printed circuit board defect detection using deep learning via a skip-connected convolutional
autoencoder. Sensors 2021, 21, 4968. [CrossRef]
15. Gaidhane, V.H.; Hote, Y.V.; Singh, V. An efficient similarity measure approach for PCB surface defect detection. Pattern Anal. Appl.
2018, 21, 277–289. [CrossRef]
16. Annaby, M.H.; Fouda, Y.M.; Rushdi, M.A. Improved normalized cross-correlation for defect detection in printed-circuit boards.
IEEE Trans. Semicond. Manuf. 2019, 32, 199–211. [CrossRef]
17. Tsai, D.M.; Huang, C.K. Defect detection in electronic surfaces using template-based Fourier image reconstruction. IEEE Trans.
Compon. Packag. Manuf. Technol. 2018, 9, 163–172. [CrossRef]
18. Cho, J.W.; Seo, Y.C.; Jung, S.H.; Jung, H.K.; Kim, S.H. A study on real-time defect detection using ultrasound excited thermography.
J. Korean Soc. Nondestruct. Test. 2006, 26, 211–219.
19. Dong, J.Y.; Lu, W.T.; Bao, X.M.; Luo, S.Y.; Wang, C.Q.; Xu, W.Q. Research progress of the PCB surface defect detection method
based on machine vision. J. Zhejiang Sci.-Tech. Univ. (Nat. Sci. Ed.) 2021, 45, 379–389.
20. Chen, S. Analysis of PCB defect detection technology based on image processing and its importance. Digit. Technol. Appl. 2016,
10, 64–65.
21. Li, Z.M.; Li, H.; Sun, J. Detection of PCB Based on Digital Image Processing. Instrum. Tech. Sens. 2012, 8, 87–89.
22. Liu, B.F.; Li, H.W.; Zhang, S.Y.; Lin, D.X. Automatic Defect Inspection of PCB Bare Board Based on Machine Vision. Ind. Control.
Comput. 2014, 27, 7–8.
23. Malge, P.S.; Nadaf, R.S. PCB defect detection, classification and localization using mathematical morphology and image processing
tools. Int. J. Comput. Appl. 2014, 87, 40–45.
24. Moganti, M.; Ercal, F. Automatic PCB inspection systems. IEEE Potentials 1995, 14, 6–10. [CrossRef]
25. Ray, S.; Mukherjee, J. A Hybrid Approach for Detection and Classification of the Defects on Printed Circuit Board. Int. J. Comput.
Appl. 2015, 121, 42–48. [CrossRef]
26. Ardhy, F.; Hariadi, F.I. Development of SBC based machine-vision system for PCB board assembly automatic optical
inspection. In Proceedings of the 2016 International Symposium on Electronics and Smart Devices (ISESD), Bandung, Indonesia,
29–30 November 2016; pp. 386–393.
27. Baygin, M.; Karakose, M.; Sarimaden, A.; Akin, E. Machine vision-based defect detection approach using image processing.
In Proceedings of the 2017 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, Turkey,
16–17 September 2017; pp. 1–5.
28. Ma, J. Defect detection and recognition of bare PCB based on computer vision. In Proceedings of the 2017 36th Chinese Control
Conference (CCC), Dalian, China, 26–28 July 2017; pp. 11023–11028.
29. Deng, Y.S.; Luo, A.C.; Dai, M.J. Building an automatic defect verification system using deep neural network for pcb defect
classification. In Proceedings of the 2018 4th International Conference on Frontiers of Signal Processing (ICFSP), Poitiers, France,
24–27 September 2018; pp. 145–149.
30. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
31. Huang, W.; Wei, P. A PCB dataset for defects detection and classification. arXiv 2019, arXiv:1901.08204.
32. He, X.Z. Research on Image Detection of Solder Joint Defects Based on Deep Learning. Master’s Thesis, Southwest University of
Science and Technology, Mianyang, China, 2021; pp. 30–38.
33. Geng, Z.; Gong, T. PCB surface defect detection based on improved Faster R-CNN. Mod. Comput. 2021, 19, 89–93.
34. Ding, R.; Dai, L.; Li, G.; Liu, H. TDD-net: A tiny defect detection network for printed circuit boards. CAAI Trans. Intell. Technol.
2019, 4, 110–116. [CrossRef]
35. Sun, C.; Deng, X.Y.; Li, Y.; Zhu, J.R. PCB defect detection based on improved Inception-ResNet-v2. Inf. Technol. 2020, 44, 4.
36. Hu, S.S.; Xiao, Y.; Wang, B.S.; Yin, J.Y. Research on PCB defect detection based on deep learning. Electr. Meas. Instrum. 2021, 58,
139–145.
37. Li, C.F.; Cai, J.L.; Qiu, S.H.; Liang, H.J. Defect detection of PCB based on improved YOLOv4 algorithm. Electron. Meas. Technol.
2021, 44, 146–153.
38. Wang, S.Q.; Lu, H.; Lu, D.; Liu, Y.; Yao, R. PCB Board Defect Detection Based on Lightweight Artificial Neural Network. Instrum.
Tech. Sens. 2022, 5, 98–104.
39. Zhou, W.J.; Li, F.; Xue, F. Identification of Butterfly Species in the Wild Based on YOLOv3 and Attention Mechanism. J. Zhengzhou
Univ. (Eng. Sci.) 2022, 43, 34–40. [CrossRef]
40. Elfwing, S.; Uchibe, E.; Doya, K. Sigmoid-weighted linear units for neural network function approximation in reinforcement
learning. Neural Netw. 2018, 107, 3–11. [CrossRef]
41. Misra, D. Mish: A self-regularized non-monotonic activation function. arXiv 2019, arXiv:1908.08681.
42. Kateb, Y.; Meglouli, H.; Khebli, A. Steel surface defect detection using convolutional neural network. Alger. J. Signals Syst. 2020, 5,
203–208. [CrossRef]
43. Guo, X. Research on PCB Bare Board Defect Detection Algorithm Based on Deep Learning. Master’s Thesis, Nanchang University,
Nanchang, China, 2021. [CrossRef]
44. Guo, D.; Qiu, B.; Liu, Y.; Xiang, G. Supernova Detection Based on Multi-scale Fusion Faster RCNN. In Proceedings of the 2021 6th
International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 9–11 April 2021.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.