Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230700111-7. doi: 10.11896/jsjkx.230700111

• Big Data & Data Science •

Study on Industrial Defect Augmentation Data Filtering Based on OOD Scores

YIN Xudong, CHEN Junyang, ZHOU Bo

  1. School of Computer and Information, Hefei University of Technology, Xuancheng, Anhui 242000, China
  • Published: 2024-06-06
  • Corresponding author: ZHOU Bo (zhoubo810707@163.com)
  • About author: YIN Xudong (2020217676@mail.hfut.edu.cn), born in 2002, undergraduate. His main research interests include deep learning, data enhancement and data visualization.
    ZHOU Bo, born in 1981, Ph.D, associate professor. His main research interests include deep learning, image processing, and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61602146), National Basic Research Program of China (2017YFB1402200), Anhui Provincial Science and Technology Research Plan (1604d0802009) and National College Students' Innovative Entrepreneurial Training Plan Program (202210359103).

Abstract: In deep learning-based industrial defect detection, data augmentation can partly alleviate the scarcity of defect data. However, how to select effective samples from a large pool of augmented data, so as to improve the performance of industrial detection models, has not yet been studied. To address this problem, this study investigates the filtering of industrial defect augmentation data based on out-of-distribution (OOD) detection scores. First, a pix2pix network is used to generate augmented industrial defect data. Next, a deep ensemble-based scoring method computes an OOD score for each augmented sample, and the augmented data are grouped by score. The distribution of each group is then examined through dimensionality-reduction projection views. Finally, an object detection algorithm is applied to each group of augmented data, and the accuracy gain of the detection model is used to explore how the degree of distribution shift affects the quality of the augmented data. Experimental results show that augmented industrial defect data with higher OOD scores differ more from the distribution of the training data, and that using this subset to expand the training set improves the generalization of the model and raises the detection accuracy of the object detection algorithm more effectively.

Key words: Data augmentation, Defect detection, Out-of-distribution detection, Data visualization, Deep learning

CLC Number:

  • TP391
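
The scoring and grouping steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, so this example assumes that the OOD score of a sample is the entropy of the softmax output averaged over the ensemble members, which is one common choice for deep ensembles, and that groups are equal-frequency bins of the score. The function names `ensemble_ood_scores` and `group_by_score`, the three-group split, and the synthetic inputs are all hypothetical and are not taken from the paper.

```python
import numpy as np


def ensemble_ood_scores(member_probs: np.ndarray) -> np.ndarray:
    """Entropy of the ensemble-averaged class distribution (assumed OOD score).

    member_probs has shape (n_members, n_samples, n_classes); each row along
    the last axis is one ensemble member's softmax output for one sample.
    Returns an array of shape (n_samples,). Higher means more uncertain.
    """
    mean_probs = member_probs.mean(axis=0)
    # Clip before the log so that zero probabilities contribute 0, not NaN.
    safe = np.clip(mean_probs, 1e-12, 1.0)
    return -(mean_probs * np.log(safe)).sum(axis=1)


def group_by_score(scores: np.ndarray, n_groups: int = 3) -> np.ndarray:
    """Assign each sample to an equal-frequency score bin (0 = lowest scores)."""
    # Interior quantile edges, e.g. the 1/3 and 2/3 quantiles for 3 groups.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    return np.searchsorted(edges, scores, side="right")


if __name__ == "__main__":
    # Synthetic stand-in for real ensemble outputs: 5 members, 100 augmented
    # samples, 4 classes. Real inputs would come from trained classifiers.
    rng = np.random.default_rng(0)
    fake_probs = rng.dirichlet(np.ones(4), size=(5, 100))
    scores = ensemble_ood_scores(fake_probs)
    groups = group_by_score(scores, n_groups=3)
    print("samples per group:", np.bincount(groups, minlength=3))
```

Under these assumptions, the samples in the highest-numbered group would be the candidates that the abstract reports as most useful for expanding the training set. Whether that holds for a given dataset would need to be checked with the detection-accuracy comparison the authors describe.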