Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230500176-7. doi: 10.11896/jsjkx.230500176

• Image Processing & Multimedia Technology •

Small Object Detection for Fish Based on SPD-Conv and NAM Attention Module

CHEN Yuzhang, WANG Shiqi, ZHOU Wen, ZHOU Wanting   

  1. School of Computer Science and Information Engineering, Hubei University, Wuhan 430062, China
  • Published: 2024-06-06
  • Corresponding author: WANG Shiqi (wangshiqi888@foxmail.com)
  • About author: CHEN Yuzhang (hubucyz@foxmail.com), born in 1984, Ph.D., associate professor. His main research interests include photoelectric detection and image processing.
    WANG Shiqi, born in 1999, postgraduate. Her main research interests include deep learning and neural networks.
  • Supported by:
    Industry-University Cooperation and Education Program of the Ministry of Education (202101142041).

Abstract: To address the low image resolution caused by degraded underwater imaging conditions and the low detection accuracy caused by small fish targets, an improved YOLOv7 detection algorithm combining the SPD-Conv structure and the NAM attention mechanism is proposed. First, the Space-to-Depth (SPD) structure is used to improve the head network, replacing its original strided convolutions; this retains more fine-grained information, makes feature learning more efficient, and improves detection on low-resolution images. Second, the Normalization-based Attention Module (NAM) is introduced into the network; it adopts the module integration scheme of CBAM and computes attention weights from the BN scaling factors, which suppresses insignificant features and improves the accuracy of small-target detection. Finally, to counter underwater imaging degradation, the images to be detected are preprocessed by deconvolution, which reduces the impact of degradation on detection. Experimental results show that on the WildFish dataset the overall accuracy of the model reaches 97.2%, which is 7.6% higher than that of YOLOv7, with precision and recall improved by 8.5% and 9.8%, respectively; compared with EfficientDet, SSD, YOLOv5 and YOLOv8, the accuracy of the proposed model is higher by 12.6%, 17.8%, 4% and 2.9%, respectively. On the Aquarium dataset the overall accuracy reaches 80.5%, which is 18.4%, 11.6%, 6.9%, 2.0% and 2.7% higher than that of EfficientDet, SSD, YOLOv5, YOLOv7 and YOLOv8, respectively. The model can therefore meet the needs of underwater fish identification.
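The space-to-depth step named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `space_to_depth` and `pointwise_conv`, the nested-list tensor layout `[C][H][W]`, the order in which sub-maps are stacked, and the use of a 1x1 kernel for the follow-up non-strided convolution are all assumptions made for illustration.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange a [C][H][W] nested list into [C * scale**2][H / scale][W / scale].

    No value is discarded: each (row offset, column offset) pair yields one
    sub-sampled copy of every channel, and the copies are stacked as channels.
    """
    height, width = len(feature_map[0]), len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    stacked = []
    for row_offset in range(scale):
        for col_offset in range(scale):
            for channel in feature_map:
                stacked.append(
                    [row[col_offset::scale] for row in channel[row_offset::scale]]
                )
    return stacked


def pointwise_conv(feature_map, weights):
    """Stride-1, 1x1 convolution: weights[o][c] mixes input channel c into output o."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    return [
        [
            [
                sum(w * channel[y][x] for w, channel in zip(out_weights, feature_map))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for out_weights in weights
    ]


# Toy input (hypothetical values): one channel, 4x4 pixels.
image = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
spd = space_to_depth(image)                  # 4 channels, each 2x2
mixed = pointwise_conv(spd, [[1, 1, 1, 1]])  # 1 channel, 2x2
```

In this toy case all 16 input values reach the 2x2 output, whereas a stride-2 convolution with a 1x1 kernel would read only the four values at even row and column positions. That difference is the sense in which the downsampling keeps fine-grained information.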

Key words: Space-to-Depth Conv (SPD-Conv), Normalization-based Attention Module (NAM), YOLOv7, Fish detection, Object detection
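The abstract states that NAM computes attention weights from the BN scaling factors. A minimal sketch of one way this can work for channel attention follows, assuming each channel's weight is its scaling factor divided by the sum over channels and that the gate is a sigmoid of the weighted BN output. The function names, the use of absolute values, and the single-image normalization statistics are illustrative assumptions, not details confirmed by this page.

```python
import math


def batch_norm(channel, gamma, beta, eps=1e-5):
    """Normalize one [H][W] channel with its own mean and variance, then scale and shift."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = gamma / math.sqrt(var + eps)
    return [[scale * (v - mean) + beta for v in row] for row in channel]


def nam_channel_attention(feature_map, gammas, betas):
    """Gate each channel of a [C][H][W] nested list by sigmoid(weight * BN(x)).

    weight_i = |gamma_i| / sum_j |gamma_j| (assumed form), so channels with a
    small BN scaling factor receive a nearly flat gate and are not emphasized.
    """
    total = sum(abs(g) for g in gammas)
    if total == 0:
        raise ValueError("at least one scaling factor must be non-zero")
    gated = []
    for channel, gamma, beta in zip(feature_map, gammas, betas):
        weight = abs(gamma) / total
        normed = batch_norm(channel, gamma, beta)
        gated.append(
            [
                [x / (1.0 + math.exp(-weight * n)) for x, n in zip(row, normed_row)]
                for row, normed_row in zip(channel, normed)
            ]
        )
    return gated


# Two identical toy channels (hypothetical values); only the BN scaling factors differ.
features = [[[1.0, 2.0], [3.0, 4.0]], [[1.0, 2.0], [3.0, 4.0]]]
out = nam_channel_attention(features, gammas=[1.0, 0.0], betas=[0.0, 0.0])
```

With these inputs the second channel, whose scaling factor is zero, is multiplied by a constant gate of 0.5 everywhere, while the first channel has its above-mean values boosted relative to its below-mean values. In a trained network the scaling factors would be learned BN parameters rather than hand-picked numbers.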

CLC Number: 

  • TP391.41