Figure 2: Overview of SFP. At the end of each training epoch, we prune the filters based on their importance evaluations. The filters are ranked by their ℓp-norms (purple rectangles) and the small ones (blue circles) are selected to be pruned. After filter pruning, the model undergoes a reconstruction process in which the pruned filters can be reconstructed (i.e., updated from zeros) by the forward-backward process. (a): filter instantiations before pruning. (b): filter instantiations after pruning. (c): filter instantiations after reconstruction.
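To make the pruning step in Figure 2 concrete, below is a minimal PyTorch-style sketch (not the authors' released code): it ranks the filters of a convolutional layer by their ℓ2-norms and sets the smallest fraction to zero, leaving them in the model so they can still be updated in later epochs. The helper names soft_prune_layer and soft_prune_model, the uniform pruning rate across layers, and the choice p = 2 are our assumptions for illustration.

```python
import torch
import torch.nn as nn

def soft_prune_layer(conv: nn.Conv2d, prune_rate: float, p: float = 2.0) -> None:
    """Zero out the filters of `conv` with the smallest l_p-norms (soft pruning).

    Filters are only set to zero, not removed, so they can still be
    "reconstructed" (updated from zero) by subsequent forward-backward passes.
    """
    with torch.no_grad():
        weight = conv.weight                      # shape: (N_out, N_in, K, K)
        num_filters = weight.size(0)
        num_pruned = int(num_filters * prune_rate)
        if num_pruned == 0:
            return
        # l_p-norm of each filter, computed over its (N_in, K, K) entries.
        norms = weight.view(num_filters, -1).norm(p=p, dim=1)
        _, smallest = torch.topk(norms, num_pruned, largest=False)
        weight[smallest] = 0.0                    # soft-pruned filters stay in the model

def soft_prune_model(model: nn.Module, prune_rate: float) -> None:
    """Apply soft pruning to every convolutional layer with the same rate."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            soft_prune_layer(module, prune_rate)
```

In a training loop, soft_prune_model would be called once at the end of each epoch, matching the schedule described in Section 4.1; the zeroed filters are free to become non-zero again during the next epoch.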
fine-tuning stage is no longer necessary for SFP. As we will show in the experiments, the network trained from scratch by SFP can obtain results competitive with those of other methods that start from a well-trained model. By leveraging the pre-trained model, SFP obtains a much higher performance and advances the state-of-the-art.

Obtaining Compact Model. SFP iterates over the filter selection, filter pruning and reconstruction steps. After the model converges, we obtain a sparse model containing many "zero filters". One "zero filter" corresponds to one feature map, and the feature maps corresponding to those "zero filters" are always zero during inference. Removing these filters, together with the corresponding feature maps, therefore has no influence on the output. Specifically, for the pruning rate P_i in the i-th layer, only N_{i+1}(1 − P_i) filters are non-zero and have an effect on the final prediction. If the previous layer is pruned as well, the number of input channels of the i-th layer changes from N_i to N_i(1 − P_{i−1}). We can thus re-build the i-th layer into a smaller one. Finally, a compact model {W^∗(i) ∈ R^{N_{i+1}(1−P_i) × N_i(1−P_{i−1}) × K × K}} is obtained.

3.3 Computation Complexity Analysis

Theoretical speedup analysis. Suppose the filter pruning rate of the i-th layer is P_i, which means that N_{i+1} × P_i filters are set to zero and pruned from the layer while the other N_{i+1} × (1 − P_i) filters remain unchanged, and suppose the sizes of the input and output feature maps of the i-th layer are H_i × W_i and H_{i+1} × W_{i+1}. After filter pruning, the dimension of the useful output feature map of the i-th layer decreases from N_{i+1} × H_{i+1} × W_{i+1} to N_{i+1}(1 − P_i) × H_{i+1} × W_{i+1}. Note that the output of the i-th layer is the input of the (i+1)-th layer. If we further prune the (i+1)-th layer with a filter pruning rate P_{i+1}, the computation of the (i+1)-th layer decreases from N_{i+2} × N_{i+1} × k^2 × H_{i+2} × W_{i+2} to N_{i+2}(1 − P_{i+1}) × N_{i+1}(1 − P_i) × k^2 × H_{i+2} × W_{i+2}. In other words, a proportion of 1 − (1 − P_{i+1}) × (1 − P_i) of the original computation is reduced, which makes the neural network inference much faster.

Realistic speedup analysis. In the theoretical speedup analysis, other operations such as batch normalization (BN) and pooling are negligible compared to convolution operations. Therefore, we consider the FLOPs of convolution operations for the computation complexity comparison, which is commonly used in previous work [Li et al., 2017; Luo et al., 2017]. However, reduced FLOPs cannot bring the same level of realistic speedup because non-tensor layers (e.g., BN and pooling layers) also consume inference time on the GPU [Luo et al., 2017]. In addition, the limitations of IO delay, buffer switching and the efficiency of BLAS libraries also lead to a wide gap between the theoretical and realistic speedup ratios. We compare the theoretical and realistic speedup in Section 4.3.

4 Evaluation and Results

4.1 Benchmark Datasets and Experimental Setting

Our method is evaluated on two benchmarks: CIFAR-10 [Krizhevsky and Hinton, 2009] and ILSVRC-2012 [Russakovsky et al., 2015]. The CIFAR-10 dataset contains 50,000 training images and 10,000 testing images, categorized into 10 classes. ILSVRC-2012 is a large-scale dataset containing 1.28 million training images and 50k validation images of 1,000 classes. Following the common setting in [Luo et al., 2017; He et al., 2017; Dong et al., 2017a], we focus on pruning the challenging ResNet models in this paper. SFP should also be effective on other computer vision tasks, such as [Kang et al., 2017; Ren et al., 2015; Dong et al., 2018; Shen et al., 2018b; Yang et al., 2010; Shen et al., 2018a; Dong et al., 2017b], and we will explore this in future work.

In the CIFAR-10 experiments, we use the default parameter setting of [He et al., 2016b] and follow the training schedule in [Zagoruyko and Komodakis, 2016]. On ILSVRC-2012, we follow the same parameter settings as [He et al., 2016a; He et al., 2016b]. We use the same data augmentation strategies as the PyTorch official examples [Paszke et al., 2017]. We conduct our SFP operation at the end of every training epoch. For pruning a scratch model, we use the normal training schedule. For pruning a pre-trained model, we reduce the learning rate by a factor of 10 compared to the schedule for the scratch model. We run each experiment three times and report "mean ± std". We compare the performance with other state-of-the-art acceleration algorithms, e.g., [Dong et al., 2017a; Li et al., 2017; He et al., 2017; Luo et al., 2017].
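As a concrete check of the arithmetic in the "Obtaining Compact Model" paragraph and the theoretical speedup analysis of Section 3.3, the following small helper computes the compact weight shape and the FLOPs reduction of one layer. The function pruned_layer_stats is our own, hypothetical name, and truncating channel counts to integers is our assumption; the paper simply writes the products N_{i+1}(1 − P_i), etc.

```python
def pruned_layer_stats(n_in, n_out, k, h_out, w_out, p_prev, p_cur):
    """Weight shape and conv FLOPs of one layer before/after removing zero filters.

    n_in, n_out   : input / output channel counts (N_i, N_{i+1})
    k             : kernel size (k x k)
    h_out, w_out  : spatial size of the output feature map
    p_prev, p_cur : pruning rates of the previous and current layer (P_{i-1}, P_i)
    """
    in_kept = int(n_in * (1 - p_prev))     # N_i (1 - P_{i-1}) surviving input channels
    out_kept = int(n_out * (1 - p_cur))    # N_{i+1} (1 - P_i) surviving filters
    flops_full = n_out * n_in * k * k * h_out * w_out
    flops_compact = out_kept * in_kept * k * k * h_out * w_out
    return {
        "compact_weight_shape": (out_kept, in_kept, k, k),
        "flops_reduction": 1.0 - flops_compact / flops_full,  # ~ 1 - (1 - P_i)(1 - P_{i-1})
    }

# Example: a 3x3 layer mapping 64 -> 128 channels on a 16x16 output map,
# with both this layer and the previous one soft-pruned at 30%.
stats = pruned_layer_stats(n_in=64, n_out=128, k=3, h_out=16, w_out=16,
                           p_prev=0.3, p_cur=0.3)
print(stats["compact_weight_shape"])        # (89, 44, 3, 3)
print(round(stats["flops_reduction"], 3))   # 0.522, close to 1 - 0.7 * 0.7 = 0.51
```

The small gap between 0.522 and 0.51 in this example comes only from rounding the surviving channel counts down to integers.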
Depth  Method                Fine-tune?  Baseline Accu. (%)  Accelerated Accu. (%)  Accu. Drop (%)  FLOPs    Pruned FLOPs (%)
20     [Dong et al., 2017a]  N           91.53               91.43                  0.10            3.20E7   20.3
20     Ours(10%)             N           92.20 ± 0.18        92.24 ± 0.33           -0.04           3.44E7   15.2
20     Ours(20%)             N           92.20 ± 0.18        91.20 ± 0.30           1.00            2.87E7   29.3
20     Ours(30%)             N           92.20 ± 0.18        90.83 ± 0.31           1.37            2.43E7   42.2
32     [Dong et al., 2017a]  N           92.33               90.74                  1.59            4.70E7   31.2
32     Ours(10%)             N           92.63 ± 0.70        93.22 ± 0.09           -0.59           5.86E7   14.9
32     Ours(20%)             N           92.63 ± 0.70        92.63 ± 0.37           0.00            4.90E7   28.8
32     Ours(30%)             N           92.63 ± 0.70        92.08 ± 0.08           0.55            4.03E7   41.5
56     [Li et al., 2017]     N           93.04               91.31                  1.75            9.09E7   27.6
56     [Li et al., 2017]     Y           93.04               93.06                  -0.02           9.09E7   27.6
56     [He et al., 2017]     N           92.80               90.90                  1.90            -        50.0
56     [He et al., 2017]     Y           92.80               91.80                  1.00            -        50.0
56     Ours(10%)             N           93.59 ± 0.58        93.89 ± 0.19           -0.30           1.07E8   14.7
56     Ours(20%)             N           93.59 ± 0.58        93.47 ± 0.24           0.12            8.98E7   28.4
56     Ours(30%)             N           93.59 ± 0.58        93.10 ± 0.20           0.49            7.40E7   41.1
56     Ours(30%)             Y           93.59 ± 0.58        93.78 ± 0.22           -0.19           7.40E7   41.1
56     Ours(40%)             N           93.59 ± 0.58        92.26 ± 0.31           1.33            5.94E7   52.6
56     Ours(40%)             Y           93.59 ± 0.58        93.35 ± 0.31           0.24            5.94E7   52.6
110    [Li et al., 2017]     N           93.53               92.94                  0.61            1.55E8   38.6
110    [Li et al., 2017]     Y           93.53               93.30                  0.20            1.55E8   38.6
110    [Dong et al., 2017a]  N           93.63               93.44                  0.19            -        34.2
110    Ours(10%)             N           93.68 ± 0.32        93.83 ± 0.19           -0.15           2.16E8   14.6
110    Ours(20%)             N           93.68 ± 0.32        93.93 ± 0.41           -0.25           1.82E8   28.2
110    Ours(30%)             N           93.68 ± 0.32        93.38 ± 0.30           0.30            1.50E8   40.8
110    Ours(30%)             Y           93.68 ± 0.32        93.86 ± 0.21           -0.18           1.50E8   40.8
Table 1: Comparison of pruning ResNet on CIFAR-10. In the "Fine-tune?" column, "Y" and "N" indicate whether the pre-trained model is used as the initialization or not, respectively. "Accu. Drop" is the accuracy of the baseline model minus that of the accelerated model, so a negative number means the accelerated model has a higher accuracy than the baseline model. A smaller "Accu. Drop" is better.
4.2 ResNet on CIFAR-10

Settings. For the CIFAR-10 dataset, we test our SFP on ResNet-20, 32, 56 and 110. We use several different pruning rates, and also analyze the difference between starting from a pre-trained model and training from scratch.

Results. Tab. 1 shows the results. Our SFP achieves a better performance than the other state-of-the-art hard filter pruning methods. For example, [Li et al., 2017] use hard pruning to accelerate ResNet-110 by pruning 38.6% of its FLOPs, at a 0.61% accuracy drop without fine-tuning. When using the pre-trained model and fine-tuning, their accuracy drop becomes 0.20%. In contrast, we can prune 40.8% of the FLOPs of ResNet-110 with only a 0.30% accuracy drop and without fine-tuning. When using the pre-trained model, we can even outperform the original model by 0.18% while reducing more than 40% of the FLOPs. These results validate the effectiveness of SFP, which can produce a more compressed model with performance comparable to the original model.

method [Luo et al., 2017], but the accuracy of our pruned model exceeds their model by 2.57%. Moreover, for pruning a pre-trained ResNet-101, SFP reduces more than 40% of the FLOPs of the model with even a 0.2% top-5 accuracy increase, which is the state-of-the-art result. In contrast, performance degradation is inevitable for hard filter pruning methods. The maintained model capacity of SFP is the main reason for its superior performance. In addition, the non-greedy all-layer pruning method may have a better performance than the locally optimal solution obtained from previous greedy pruning methods, which seems to be another reason. Occasionally, a large performance degradation happens for the pre-trained model (e.g., a 14.01% top-1 accuracy drop for ResNet-50). This will be explored in our future work.

To test the realistic speedup ratio, we measure the forward time of the pruned models on one GTX 1080 GPU with a batch size of 64 (shown in Tab. 3). The gap between the theoretical and realistic speedup may come from the limitations of IO delay, buffer switching and the efficiency of BLAS libraries.
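The realistic speedup reported in Tab. 3 is obtained by timing the forward pass on the GPU. A minimal sketch of such a measurement, assuming a CUDA device, ImageNet-sized inputs and the batch size of 64 used in the paper (the function name measure_forward_ms is ours, not part of the paper's code):

```python
import time
import torch

@torch.no_grad()
def measure_forward_ms(model, input_size=(64, 3, 224, 224), warmup=10, iters=50):
    """Average forward time in milliseconds on the current CUDA device."""
    device = torch.device("cuda")
    model = model.to(device).eval()
    x = torch.randn(input_size, device=device)
    for _ in range(warmup):        # warm up cuDNN and the memory allocator
        model(x)
    torch.cuda.synchronize()       # make sure all queued kernels have finished
    start = time.time()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()       # wait for the timed forward passes to complete
    return (time.time() - start) / iters * 1000.0
```

Comparing the measured times of the baseline and the compact model then gives the realistic speedup; the percentages in Tab. 3 are consistent with the relative time saving, i.e., 1 − (pruned time / baseline time).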
Table 2: Comparison of pruning ResNet on ImageNet. "Fine-tune?" and "Accu. Drop" have the same meaning as in Tab. 1.
Model       Baseline time (ms)  Pruned time (ms)  Realistic Speed-up (%)  Theoretical Speed-up (%)
ResNet-18   37.10               26.97             27.4                    41.8
ResNet-34   63.97               45.14             29.4                    41.1
ResNet-50   135.01              94.66             29.8                    41.8
ResNet-101  219.71              148.64            32.3                    42.2

Table 3: Comparison of the theoretical and realistic speedup. We only count the time consumption of the forward procedure.

different SFP intervals may lead to different performance; so we explore the sensitivity of the SFP interval. We use ResNet-110 under a pruning rate of 30% as a baseline, and change the SFP interval from one epoch to ten epochs, as shown in Fig. 3(b). The model accuracy shows no large fluctuation across the different SFP intervals. Moreover, the model accuracy of most (80%) of the intervals surpasses that of the one-epoch interval. Therefore, we can achieve an even better performance if we tune this parameter.

Selection of pruned layers. Previous works always prune a portion of the layers of the network. Besides, different lay-