Object Detection Algorithm Based on Improved Split-attention Network

doi:10.11896/jsjkx.210800214

Abstract: Most recent object detection algorithms based on convolutional neural networks make poor use of meaningful contextual information and tend to miss hard targets. To address these problems, this paper proposes an object detection algorithm based on an improved split-attention network. First, a split-attention mechanism is introduced, combining a multi-path structure with feature-map attention to improve feature representation. Then, in the convolution layers, poly-scale convolution replaces vanilla convolution to enhance the scale sensitivity of the network. Finally, the proposed algorithm is applied to Faster R-CNN. Experiments are carried out on the Pascal VOC and MS COCO datasets. Compared with the original algorithm, the proposed algorithm improves mAP by 1.6% and 2.4%, respectively, without introducing additional parameters or computational complexity. Its mAP is also higher than that of the other algorithms compared, which verifies its good performance.
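The split-attention step named in the abstract can be illustrated with a minimal sketch. It assumes the general ResNeSt-style formulation (fuse the splits, pool globally, score each split per channel, apply a softmax across splits, then take the weighted sum). It is not the authors' implementation: the function names are illustrative, and `score_fn` is a hypothetical stand-in for the learned dense layers that would normally produce the per-split logits.

```python
import math


def r_softmax(logits):
    """Softmax over the R splits for a single channel."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def split_attention(splits, score_fn):
    """Fuse R feature-map splits with channel-wise attention.

    splits   : list of R feature maps, each a nested list shaped [C][H][W]
    score_fn : hypothetical stand-in for the learned dense layers; it maps
               the pooled descriptor (C floats) to logits shaped [R][C]
    """
    num_splits = len(splits)
    channels = len(splits[0])
    height = len(splits[0][0])
    width = len(splits[0][0][0])

    # 1. Sum the splits element-wise, then global-average-pool each channel.
    pooled = []
    for c in range(channels):
        total = sum(v for s in splits for row in s[c] for v in row)
        pooled.append(total / (height * width))

    # 2. Score every split for every channel (assumed learned in practice).
    logits = score_fn(pooled)

    # 3. Softmax across splits per channel, then take the weighted sum.
    fused = []
    for c in range(channels):
        weights = r_softmax([logits[r][c] for r in range(num_splits)])
        fused.append([
            [sum(weights[r] * splits[r][c][i][j] for r in range(num_splits))
             for j in range(width)]
            for i in range(height)
        ])
    return fused


# Usage: two splits with constant values 1.0 and 3.0, and uniform logits.
split_a = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
split_b = [[[3.0, 3.0], [3.0, 3.0]] for _ in range(2)]
uniform = lambda pooled: [[0.0] * len(pooled) for _ in range(2)]
fused = split_attention([split_a, split_b], uniform)
```

With uniform logits the attention weights are equal, so the result is the plain average of the splits. A trained scoring function would instead shift weight toward the more informative split for each channel.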

Key words: Convolutional neural network, Contextual information, Object detection, Split-attention, Poly-scale convolution

CLC Number: TP391

PAN Yi, WANG Li-ping. Object Detection Algorithm Based on Improved Split-attention Network[J]. Computer Science, 2022, 49(10): 198-206.
