Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240100138-5. doi: 10.11896/jsjkx.240100138

• Image Processing & Multimedia Technology •

Road Obstacle Detection Method Based on Self-attention and Bidirectional Feature Fusion

LI Ting, ZHAO Erdun, YANG Jun   

  1. School of Computer Science, Central China Normal University, Wuhan 430079, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: LI Ting, born in 1997, postgraduate. Her main research interests include computer vision, object detection and autopilot.
    ZHAO Erdun, born in 1972, Ph.D., associate professor. His main research interests include computer vision, object detection and autopilot.

Abstract: Driver-assistance technology has become an important direction for the future development of the automotive industry. In image-based road obstacle detection, existing methods struggle with targets that vary widely in scale, small targets, and occluded targets, which often leads to false detections and missed detections. To address this problem, a road obstacle detection method based on self-attention and bidirectional feature fusion (CoXT-FCOS) is proposed. The method introduces CoXT, a grouped self-attention module, into the backbone to strengthen the network's ability to capture global information. To handle occlusion, it adds the cross-stage pyramid pooling module SPPCSPC. In the feature fusion stage, a path enhancement network is introduced to form the bidirectional feature fusion module ESPAFPN, which improves the network's perception of small targets. Experiments show that the CoXT-FCOS model reaches an mAP of 88% on the CODA dataset and detects road obstacles more accurately.

Key words: Obstacle detection, Autopilot, Fully convolutional one-stage object detection, Self-attention, Feature fusion

CLC Number: TP391
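
The abstract describes a bidirectional feature fusion module formed by adding a path enhancement network to the feature pyramid. The page does not give the internals of ESPAFPN, so the following is only a minimal sketch of the general idea (a top-down pass followed by a bottom-up pass), not the authors' implementation. It assumes single-channel feature maps stored as nested lists, nearest-neighbour upsampling, 2x2 max pooling in place of a stride-2 convolution, and plain addition in place of learned fusion layers. All function names are hypothetical.

```python
# Illustrative sketch only: single-channel feature maps as nested lists.
# The convolutions a real detector would apply at each step are omitted.

def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D map."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm):
    """2x2 max pooling with stride 2 (assumed stand-in for a stride-2 conv).

    Expects even height and width.
    """
    return [
        [max(fm[i][j], fm[i][j + 1], fm[i + 1][j], fm[i + 1][j + 1])
         for j in range(0, len(fm[0]), 2)]
        for i in range(0, len(fm), 2)
    ]


def add(a, b):
    """Element-wise sum of two maps with identical shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def bidirectional_fusion(levels):
    """Fuse a pyramid ordered fine to coarse; each level is half the previous size."""
    # Top-down pass: coarse, semantically strong maps flow into finer levels.
    top_down = [None] * len(levels)
    top_down[-1] = levels[-1]
    for i in range(len(levels) - 2, -1, -1):
        top_down[i] = add(levels[i], upsample2x(top_down[i + 1]))

    # Bottom-up pass: fine localisation detail flows back to coarser levels.
    fused = [top_down[0]]
    for i in range(1, len(levels)):
        fused.append(add(top_down[i], downsample2x(fused[-1])))
    return fused


def constant_map(size, value):
    """Square map filled with one value, used here as toy input."""
    return [[value] * size for _ in range(size)]


# Toy pyramid: 4x4, 2x2 and 1x1 maps standing in for backbone outputs.
c3 = constant_map(4, 1.0)
c4 = constant_map(2, 10.0)
c5 = constant_map(1, 100.0)
n3, n4, n5 = bidirectional_fusion([c3, c4, c5])
```

With these toy inputs every output level ends up containing contributions from all three input levels, which is the property a bidirectional design aims for: the finest map gains coarse context, and the coarsest map gains fine detail.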