Journal of Shanghai Jiao Tong University ›› 2023, Vol. 57 ›› Issue (9): 1203-1213. doi: 10.16183/j.cnki.jsjtu.2022.077
Special Issue: Journal of Shanghai Jiao Tong University 2023 Special Topic on "Electronic Information and Electrical Engineering"
• Electronic Information and Electrical Engineering •
Received: 2022-03-21
Revised: 2022-05-30
Accepted: 2022-06-06
Online: 2023-09-28
Published: 2023-09-27
Contact: LEI Xuemei, E-mail: ndlxm@imu.edu.cn
LIU Yu, LEI Xuemei. A Structured Pruning Method Integrating Characteristics of MobileNetV3[J]. Journal of Shanghai Jiao Tong University, 2023, 57(9): 1203-1213.
URL: https://xuebao.sjtu.edu.cn/EN/10.16183/j.cnki.jsjtu.2022.077
Tab.2 m, γ, and S_l(n) at different pruning rates

| Pruning rate/% | m | γ | S_l(n) |
|---|---|---|---|
| 10 | 2.3064×10⁻¹² | 4.0039×10⁻¹²~4.9225×10⁻¹⁰ | 0~0.6518 |
| 20 | 1.0721×10⁻¹¹ | 8.3696×10⁻¹⁷~5.2892×10⁻¹⁰ | 0.0063~0.7054 |
| 30 | 2.9047×10⁻¹¹ | 1.8149×10⁻¹¹~4.9828×10⁻¹⁰ | 0.0357~0.6812 |
| 40 | 7.8888×10⁻¹¹ | 4.6540×10⁻¹¹~5.2805×10⁻¹⁰ | 0.0625~0.8304 |
| 50 | 0.0700 | 1.1880×10⁻¹⁰~2.8041×10⁻¹ | 0.0714~0.7946 |
| 60 | 0.1466 | 0.0875~0.5838 | 0.1875~0.8304 |
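The global threshold m in Tab.2 is consistent with network-slimming-style channel selection: all batch-normalization scaling factors γ are pooled across layers and m is taken at the target pruning-rate quantile. A minimal sketch of that selection step, using made-up γ values (not the paper's data):

```python
def global_threshold(gammas, rate):
    """Pruning threshold m: the rate-quantile of all BN scaling factors
    gamma gathered across layers (network-slimming-style selection)."""
    flat = sorted(g for layer in gammas for g in layer)
    k = int(len(flat) * rate)  # number of channels to prune globally
    return flat[k]

def keep_mask(layer_gammas, m):
    """True for channels whose scaling factor survives the threshold."""
    return [g >= m for g in layer_gammas]

# illustrative BN factors for two layers (not the paper's values)
g1 = [0.9, 0.02, 0.5, 0.01]
g2 = [0.3, 0.005, 0.7, 0.04]
m = global_threshold([g1, g2], 0.5)   # prune half of all channels
print(keep_mask(g1, m))  # [True, False, True, False]
```

Channels whose γ falls below m contribute little to the layer output and are removed; the paper's criterion additionally combines γ with a sparsity term, which this sketch omits.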
Tab.3 Parameters and FLOPs at different pruning rates

| Pruning rate/% | Accuracy/% | Parameters/10⁶ | Parameter reduction/% | FLOPs | FLOPs reduction/% |
|---|---|---|---|---|---|
| 0 | 88.28 | 4.22 | 0 | 2.30×10⁸ | 0 |
| 10 | 87.86 | 3.68 | 12.8 | 2.17×10⁸ | 5.7 |
| 20 | 88.26 | 3.26 | 22.7 | 2.00×10⁸ | 13.0 |
| 30 | 88.23 | 2.89 | 31.5 | 1.81×10⁸ | 21.3 |
| 40 | 88.69 | 2.61 | 38.2 | 1.61×10⁸ | 30.0 |
| 50 | 88.55 | 2.34 | 44.5 | 1.38×10⁸ | 40.0 |
| 60 | 87.99 | 2.16 | 48.8 | 9.87×10⁷ | 57.1 |
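The reduction columns in Tab.3 follow directly from the unpruned baseline (4.22 M parameters, 2.30×10⁸ FLOPs). A quick check of that arithmetic for the 50% row, with values copied from the table:

```python
def reduction(base, pruned):
    """Percentage reduction relative to the unpruned baseline."""
    return 100.0 * (base - pruned) / base

# 50% pruning-rate row of Tab.3
params_red = reduction(4.22e6, 2.34e6)  # parameter count reduction
flops_red = reduction(2.30e8, 1.38e8)   # FLOPs reduction
print(f"{params_red:.1f}% {flops_red:.1f}%")  # 44.5% 40.0%, matching the table
```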
Tab.4 Comparison of several pruning criteria on CIFAR-10 (50% pruned)

| Pruning method | Accuracy/% | Parameters/M | Parameter reduction/% | FLOPs | FLOPs reduction/% |
|---|---|---|---|---|---|
| Normal training | 88.28 | 4.22 | 0 | 2.30×10⁸ | 0 |
| Sparse training (λ=10⁻⁵) | 88.15 | 4.22 | 0 | 2.30×10⁸ | 0 |
| Sparsity formula[ | 88.12 | 2.44 | 42.2 | 1.18×10⁸ | 48.7 |
| γ[ | 88.37 | 2.34 | 44.5 | 1.39×10⁸ | 39.6 |
| L1 norm+γ[ | 88.36 | 2.33 | 44.8 | 1.40×10⁸ | 39.1 |
| Sparsity formula+γ (proposed in this paper) | 88.55 | 2.34 | 44.5 | 1.18×10⁸ | 40.0 |
Tab.5 Comparison of model channels before and after pruning

| Input size | Module | Channels in module | Channels in module (after pruning) | Output channels | SE module | Activation | Stride |
|---|---|---|---|---|---|---|---|
| 224×224×3 | conv2d | — | — | 16 | — | HS | 2 |
| 112×112×16 | bneck, 3×3 | 16 | 9 | 16 | — | RE | 1 |
| 112×112×16 | bneck, 3×3 | 64 | 49 | 24 | — | RE | 2 |
| 56×56×24 | bneck, 3×3 | 72 | 42 | 24 | — | RE | 1 |
| 56×56×24 | bneck, 5×5 | 72 | 72 | 40 | √ | RE | 2 |
| 28×28×40 | bneck, 5×5 | 120 | 102 | 40 | √ | RE | 1 |
| 28×28×40 | bneck, 5×5 | 120 | 89 | 40 | √ | RE | 1 |
| 28×28×40 | bneck, 3×3 | 240 | 223 | 80 | — | HS | 2 |
| 14×14×80 | bneck, 3×3 | 200 | 144 | 80 | — | HS | 1 |
| 14×14×80 | bneck, 3×3 | 184 | 139 | 80 | — | HS | 1 |
| 14×14×80 | bneck, 3×3 | 184 | 112 | 80 | — | HS | 1 |
| 14×14×80 | bneck, 3×3 | 480 | 209 | 112 | √ | HS | 1 |
| 14×14×112 | bneck, 3×3 | 672 | 38 | 112 | √ | HS | 1 |
| 14×14×112 | bneck, 5×5 | 672 | 540 | 160 | √ | HS | 2 |
| 7×7×160 | bneck, 5×5 | 960 | 484 | 160 | √ | HS | 1 |
| 7×7×160 | bneck, 5×5 | 960 | 255 | 160 | √ | HS | 1 |
| 7×7×160 | conv2d, 1×1 | — | — | 960 | — | HS | 1 |
| 7×7×960 | pool, 7×7 | — | — | — | — | — | 1 |
| 1×1×960 | conv2d, 1×1, NBN | — | — | 1280 | — | HS | 1 |
| 1×1×1280 | conv2d, 1×1, NBN | — | — | q | — | — | 1 |
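The per-block effect of pruning in Tab.5 can be summarized as a keep ratio over the bneck expansion channels. Summing the before/after counts copied from the table shows the expansion width is roughly halved overall, consistent with the 50% pruning rate:

```python
# (before, after) expansion-channel counts for the bneck blocks in Tab.5
blocks = [(16, 9), (64, 49), (72, 42), (72, 72), (120, 102), (120, 89),
          (240, 223), (200, 144), (184, 139), (184, 112), (480, 209),
          (672, 38), (672, 540), (960, 484), (960, 255)]

keep_ratios = [after / before for before, after in blocks]  # per-block survival
total_before = sum(b for b, _ in blocks)
total_after = sum(a for _, a in blocks)
print(f"overall expansion channels kept: {total_after / total_before:.1%}")  # 50.0%
```

Note the ratio is far from uniform: some blocks keep nearly all channels (72/72) while others are cut drastically (38/672), which is the point of a global, importance-based criterion rather than a fixed per-layer rate.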
Tab.6 Parameters and FLOPs at different pruning rates (CIFAR-100)

| Pruning rate/% | Accuracy/% | Parameters/M | Parameter reduction/% | FLOPs | FLOPs reduction/% |
|---|---|---|---|---|---|
| 0 | 61.47 | 4.33 | 0 | 2.30×10⁸ | 0 |
| 10 | 60.81 | 3.59 | 17.1 | 2.18×10⁸ | 5.2 |
| 20 | 61.25 | 3.29 | 24.0 | 2.07×10⁸ | 10.0 |
| 30 | 61.51 | 2.93 | 32.3 | 1.90×10⁸ | 17.4 |
| 40 | 60.92 | 2.62 | 39.5 | 1.70×10⁸ | 26.1 |
| 50 | 61.22 | 2.38 | 45.0 | 1.52×10⁸ | 33.9 |
| 60 | 61.16 | 2.16 | 50.1 | 1.18×10⁸ | 48.7 |
[1] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 2017, 60(6): 84-90. doi: 10.1145/3065386
[2] LI Yangyang, SHI Licheng, WAN Weibing, et al. A convolutional neural network-based method for 3D object detection[J]. Journal of Shanghai Jiao Tong University, 2018, 52(1): 7-12.
[3] KANG J, TARIQ S, OH H, et al. A survey of deep learning-based object detection methods and datasets for overhead imagery[J]. IEEE Access, 2022, 10: 20118-20134. doi: 10.1109/ACCESS.2022.3149052
[4] ZHANG Junning, SU Qunxing, WANG Cheng, et al. A domain adaptive semantic segmentation network based on improved transformation network[J]. Journal of Shanghai Jiao Tong University, 2021, 55(9): 1158-1168.
[5] LI X, YANG Y B, ZHAO Q J, et al. Spatial pyramid based graph reasoning for semantic segmentation[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020: 8947-8956.
[6] GAO Han, TIAN Yulong, XU Fengyuan, et al. Survey of deep learning model compression and acceleration[J]. Journal of Software, 2021, 32(1): 68-92.
[7] GENG Lili, NIU Baoning. Survey of deep neural networks model compression[J]. Journal of Frontiers of Computer Science & Technology, 2020, 14(9): 1441-1455. doi: 10.3778/j.issn.1673-9418.2003056
[8] WU J, WANG Y, WU Z, et al. Deep k-means: Retraining and parameter sharing with harder cluster assignments for compressing deep convolutions[C]//Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018: 5363-5372.
[9] AGGARWAL V, WANG W L, ERIKSSON B, et al. Wide compression: Tensor ring nets[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 9329-9338.
[10] CHEN H T, GUO T Y, XU C, et al. Learning student networks in the wild[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE, 2021: 6424-6433.
[11] HOWARD A G, ZHU M, CHEN B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications[EB/OL]. (2017-04-17) [2022-03-18]. https://arxiv.org/abs/1704.04861.
[12] SANDLER M, HOWARD A, ZHU M L, et al. MobileNetV2: Inverted residuals and linear bottlenecks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 4510-4520.
[13] HOWARD A, SANDLER M, CHEN B, et al. Searching for MobileNetV3[C]//2019 IEEE/CVF International Conference on Computer Vision. Seoul, Korea: IEEE, 2019: 1314-1324.
[14] CHOLLET F. Xception: Deep learning with depthwise separable convolutions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017: 1800-1807.
[15] KIM E, AHN C, OH S. NestedNet: Learning nested sparse structures in deep neural networks[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018: 8669-8678.
[16] LI Y S, CHEN Y P, DAI X Y, et al. MicroNet: Improving image recognition with extremely low FLOPs[C]//2021 IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021: 458-467.
[17] LECUN Y, DENKER J S, SOLLA S A. Optimal brain damage[C]//Advances in Neural Information Processing Systems 2. 1990: 598-605.
[18] HASSIBI B, STORK D G, WOLFF G J. Optimal brain surgeon and general network pruning[C]//IEEE International Conference on Neural Networks. San Francisco, USA: IEEE, 1993: 293-299.
[19] HAN S, POOL J, TRAN J, et al. Learning both weights and connections for efficient neural networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015: 1135-1143.
[20] CHEN W L, WILSON J T, TYREE S, et al. Compressing neural networks with the hashing trick[EB/OL]. (2015-04-19) [2022-03-18]. https://arxiv.org/abs/1504.04788.
[21] LI H, KADAV A, DURDANOVIC I, et al. Pruning filters for efficient ConvNets[EB/OL]. (2017-05-10) [2022-03-18]. https://arxiv.org/abs/1608.08710.
[22] CHEN Y H, EMER J, SZE V. Eyeriss: A spatial architecture for energy-efficient dataflow for convolutional neural networks[C]//2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture. Seoul, Korea: IEEE, 2016: 367-379.
[23] LIU Z, LI J, SHEN Z, et al. Learning efficient convolutional networks through network slimming[C]//2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2755-2763.
[24] WEI Yue, CHEN Shichao, ZHU Fenghua, et al. Pruning method for convolutional neural network models based on sparse regularization[J]. Computer Engineering, 2021, 47(10): 61-66. doi: 10.19678/j.issn.1000-3428.0059375
[25] LU Haiwei, XIA Haifeng, YUAN Xiaotong. Dynamic network pruning via filter attention mechanism and feature scaling factor[J]. Journal of Chinese Computer Systems, 2019, 40(9): 1832-1838.
[26] LIU C T, LIN T W, WU Y H, et al. Computation-performance optimization of convolutional neural networks with redundant filter removal[J]. IEEE Transactions on Circuits & Systems, 2019, 66(5): 1908-1921.
[27] IOFFE S, SZEGEDY C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning. Lille, France: JMLR: W&CP, 2015: 448-456.
[28] KULKARNI U, MEENA S M, GURLAHOSUR S V, et al. Quantization friendly MobileNet (QF-MobileNet) architecture for vision based applications on embedded platforms[J]. Neural Networks, 2021, 136: 28-39. doi: 10.1016/j.neunet.2020.12.022
[29] YE Huijuan, LIU Xiangyang. Research and application of convolutional neural network based on sparse convolution kernel[J]. Information Technology, 2017, 41(10): 5-9.
[30] WU S L, ZHANG F R, CHEN H D, et al. Semantic understanding based on multi-feature kernel sparse representation and decision rules for mangrove growth[J]. Information Processing & Management, 2022, 59(2): 102813. doi: 10.1016/j.ipm.2021.102813
[31] MERINO P. A difference-of-convex functions approach for sparse PDE optimal control problems with nonconvex costs[J]. Computational Optimization & Applications, 2019, 74(1): 225-258.
[32] GAO X R, BAI Y Q, LI Q. A sparse optimization problem with hybrid L2-Lp regularization for application of magnetic resonance brain images[J]. Journal of Combinatorial Optimization, 2021, 42(4): 760-784. doi: 10.1007/s10878-019-00479-x