Computer Science, 2021, Vol. 48, Issue (11A): 334-339. doi: 10.11896/jsjkx.210100138

• Image Processing & Multimedia Technology •

SCTD1.0: Sonar Common Target Detection Dataset

ZHOU Yan1, CHEN Shao-chang1, WU Ke1, NING Ming-qiang1, CHEN Hong-kun2, ZHANG Peng1

  1 School of Electronic Engineering, Naval University of Engineering, Wuhan 430033, China
  2 92118 Troops of PLA, Zhoushan, Zhejiang 316000, China
  • Online: 2021-11-10 Published: 2021-11-12
  • Corresponding author: ZHANG Peng (pengzhang.ai@outlook.com)
  • Author e-mail: 986165364@qq.com
  • About author: ZHOU Yan, born in 1991, postgraduate. His main research interests include deep learning and computer vision.
    ZHANG Peng, born in 1996, Ph.D. His main research interests include rotation pattern recognition in convolutional neural networks and automated deep learning.
  • Supported by:
    National Natural Science Foundation of China (61671461).

Abstract: In recent years, convolutional neural networks (CNNs) have been widely applied to large-scale natural image datasets such as ImageNet and COCO. Applied research on sonar image detection and recognition, however, remains scarce: sonar image datasets for target detection and classification are lacking, and underwater target samples are often few and unbalanced. To address this problem, this paper draws on an extensive collection of sonar images to construct SCTD1.0, a fully public sonar common target detection dataset that can be used for research on sonar image detection and classification. The dataset currently contains three types of typical targets (underwater shipwrecks, wreckage of crashed aircraft, and victims), with 596 samples in total. On the basis of SCTD1.0, the paper uses transfer learning to test detection and classification benchmarks. For the detection task, a feature pyramid network is used to combine multi-scale features, and the performance of three detection frameworks (YOLOv3, Faster R-CNN, and Cascade R-CNN) is compared on this dataset. For the classification task, the performance of three networks (VGGNet, ResNet50, and DenseNet) is compared, and the classification accuracy reaches about 90%.

Key words: Convolutional neural network, Dataset, Detection and classification, Sonar image, Transfer learning
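
The abstract reports an overall classification accuracy and also notes that underwater target samples tend to be unbalanced. As a minimal sketch of how these two points interact, the Python below computes overall accuracy alongside per-class accuracy. The label strings, the function names, and the toy labels are all assumptions made for illustration; they are not taken from SCTD1.0 or from the paper's evaluation code.

```python
from collections import Counter

# Assumed label strings for the three target types named in the abstract.
CLASSES = ("shipwreck", "aircraft_wreckage", "victim")


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """For each class: correctly predicted samples / true samples of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Classes with no true samples are skipped to avoid division by zero.
    return {c: hits[c] / totals[c] for c in classes if totals[c] > 0}


# Hypothetical toy labels (NOT SCTD1.0 data): one class dominates the set.
y_true = ["shipwreck"] * 8 + ["aircraft_wreckage", "victim"]
y_pred = ["shipwreck"] * 8 + ["shipwreck", "victim"]

print(overall_accuracy(y_true, y_pred))
print(per_class_accuracy(y_true, y_pred))
```

In this toy case the overall figure looks high even though every sample of one rare class is misclassified, which is why per-class figures are worth reading next to an overall accuracy on an unbalanced dataset.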

CLC Number:

  • TP391