Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 231000034-6. doi: 10.11896/jsjkx.231000034

• Image Processing & Multimedia Technology •

Dunhuang Mural Element Detection Algorithm Based on Improved Yolov8

ZHOU Yanlin1,2, WU Kaijun1, MEI Yuan1, TIAN Bin1, YU Tianxiu2

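  1 Lanzhou Jiaotong University, Lanzhou 730070
  2 Dunhuang Academy, Dunhuang, Gansu 736200
  • Online: 2024-11-16 Published: 2024-11-13
  • Corresponding author: ZHOU Yanlin (zhouyanlin@dha.ac.cn)
  • Supported by:
    Natural Science Foundation of Gansu Province (23JRRA913)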

Dunhuang Mural Element Detection Algorithm Based on Improved Yolov8

ZHOU Yanlin1,2, WU Kaijun1, MEI Yuan1, TIAN Bin1, YU Tianxiu2   

  1 Lanzhou Jiaotong University, Lanzhou 730070, China
  2 Dunhuang Academy, Dunhuang, Gansu 736200, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: ZHOU Yanlin, born in 1989, bachelor's degree, librarian. His main research interests include cultural relics digitization, deep learning, and computer vision.
  • Supported by:
    Natural Science Foundation of Gansu Province (23JRRA913).

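Abstract: Dunhuang murals attract wide attention for their exceptional artistic, historical, and research value. Mural element detection plays an important role in developing cultural and creative products based on the murals. However, flaking, pigment fading, pest and disease damage, and large differences in element size make the detection of mural elements very difficult. To address this, this paper improves and extends the Yolov8 algorithm and applies it to mural element detection. Specifically, because some elements have indistinct features, an improved SPPCSPC module is designed to strengthen the model's feature perception and enlarge its receptive field; because elements vary greatly in size and style, the CoordAtt attention mechanism is introduced at the end of the C2f module to strengthen the network's attention to local and non-salient information. On the Dunhuang mural element detection task, the proposed algorithm achieves state-of-the-art mural element detection performance compared with five leading detection algorithms. Compared with the Yolov8 baseline, it improves mAP by 2.2%, and on the main_buddha category in particular it improves mAP by 12.2%. The proposed method effectively supports subsequent research on Dunhuang murals.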

Keywords: Dunhuang murals, Improved Yolov8, Object detection, Feature enhancement

Abstract: The Dunhuang murals have garnered significant attention for their artistic, historical, and research value. In the development of cultural and creative products based on the murals, detecting mural elements is crucial. However, flaking, pigment fading, pest damage, and large differences in element size make mural elements difficult to detect. To address this, this paper improves and extends the Yolov8 algorithm and applies it to the mural element detection task. Specifically, an improved SPPCSPC module is designed to strengthen the model's feature perception and enlarge its receptive field, which addresses elements whose features are indistinct. Additionally, the CoordAtt (coordinate attention) mechanism is introduced at the end of the C2f module to improve the network's ability to focus on local and non-salient information, which addresses the variability in size and style of the elements. On the Dunhuang mural element detection task, the proposed algorithm outperforms five other state-of-the-art detection algorithms in detection accuracy. Compared with the Yolov8 baseline, it achieves a 2.2% improvement in mAP, and in the main_buddha category in particular a 12.2% improvement in mAP. The method offers effective support for future research on Dunhuang murals.

Key words: Dunhuang murals, Improved Yolov8, Object detection, Feature enhancement
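
The abstract names the CoordAtt mechanism without spelling out how it works. As a rough illustration only, the pure-Python sketch below shows the general coordinate attention idea: average-pool the feature map along each spatial axis, turn the pooled descriptors into per-row and per-column gates, and rescale every activation by both gates. It is not the authors' implementation. The function names, the weight matrices `w_reduce`, `w_h`, and `w_w`, and the use of a plain ReLU in place of batch normalization and the original nonlinearity are all assumptions made to keep the example short.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(vec):
    return [max(0.0, t) for t in vec]


def conv1x1(weight, vec):
    """Apply a 1x1 convolution at one position: weight is [out][in], vec is [in]."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def coord_attention(x, w_reduce, w_h, w_w):
    """Simplified coordinate attention over a feature map x of shape [C][H][W].

    w_reduce: [M][C] shared 1x1 conv that squeezes channels C -> M
    w_h, w_w: [C][M] 1x1 convs producing the height and width gates
    All three weight matrices are hypothetical placeholders.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])

    # Directional average pooling: one C-vector per row and one per column.
    pool_h = [[sum(x[c][i]) / W for c in range(C)] for i in range(H)]
    pool_w = [[sum(x[c][i][j] for i in range(H)) / H for c in range(C)]
              for j in range(W)]

    # Shared channel reduction (ReLU stands in for BN + nonlinearity).
    mid_h = [relu(conv1x1(w_reduce, p)) for p in pool_h]
    mid_w = [relu(conv1x1(w_reduce, p)) for p in pool_w]

    # Per-row and per-column gates in (0, 1), one value per channel.
    gate_h = [[sigmoid(t) for t in conv1x1(w_h, m)] for m in mid_h]
    gate_w = [[sigmoid(t) for t in conv1x1(w_w, m)] for m in mid_w]

    # Rescale each activation by the gate of its row and of its column.
    return [[[x[c][i][j] * gate_h[i][c] * gate_w[j][c] for j in range(W)]
             for i in range(H)]
            for c in range(C)]


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map with made-up weights.
    feature_map = [[[1.0, 2.0], [3.0, 4.0]],
                   [[0.5, 0.5], [0.5, 0.5]]]
    out = coord_attention(feature_map,
                          w_reduce=[[1.0, 1.0]],
                          w_h=[[1.0], [-1.0]],
                          w_w=[[0.5], [0.5]])
    print(out)
```

Because the gates depend on the row and column index separately, the rescaling keeps positional information along both axes, which is the property the abstract relies on when it says the mechanism helps the network attend to local and non-salient regions.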

CLC Number:

  • TP399