计算机科学 (Computer Science), 2023, Vol. 50, Issue 11A: 220900034-10. doi: 10.11896/jsjkx.220900034
周士金, 邢红杰
ZHOU Shijin, XING Hongjie
Abstract: Knowledge-distillation-based anomaly detection methods typically take a pre-trained network as the teacher network and use a network with the same architecture and size as the student network; for a test sample, the discrepancy between the teacher and student networks is used to decide whether it is normal or anomalous. However, because the teacher and student networks share the same architecture and size, on the one hand the discrepancy such methods produce on anomalous data tends to be too small, and on the other hand the teacher's pre-training dataset is far larger than the student's training set, which leaves the student network with a large amount of redundant information. To address these problems, the Efficient Channel Attention (ECA) module is introduced into knowledge-distillation-based anomaly detection. Exploiting ECA's cross-channel interaction strategy, a student network that is structurally simpler and smaller than the teacher network is designed, which both effectively captures the features of normal data while removing redundant information and enlarges the discrepancy between the teacher and student networks, thereby improving anomaly detection performance. Experimental results on six image datasets show that, compared with five related methods, the proposed method achieves better detection performance.
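To make the two mechanisms described in the abstract concrete, the following is a minimal PyTorch-style sketch, not the authors' released implementation: an ECA block that re-weights channels via local cross-channel interaction, a deliberately small student network built from such blocks, and an anomaly score computed from the teacher-student feature discrepancy. The layer sizes, the three-block student, and the per-layer cosine-distance score are illustrative assumptions; the score additionally assumes that teacher features have already been projected to the student's feature shapes (e.g., by 1x1 convolutions).

```python
# Minimal sketch (illustrative assumptions, not the paper's released code):
# ECA channel attention + a small student network + a teacher-student
# feature-discrepancy anomaly score.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class ECA(nn.Module):
    """Efficient Channel Attention: a 1-D conv over pooled channel descriptors."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Kernel size adapted to the number of channels, as in ECA-Net.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                                 # x: (N, C, H, W)
        y = F.adaptive_avg_pool2d(x, 1)                   # (N, C, 1, 1) descriptors
        y = self.conv(y.squeeze(-1).transpose(1, 2))      # local cross-channel interaction
        y = torch.sigmoid(y.transpose(1, 2).unsqueeze(-1))
        return x * y                                      # re-weight channels


class SmallStudent(nn.Module):
    """A student deliberately simpler and smaller than the teacher, with ECA blocks
    (hypothetical channel widths)."""
    def __init__(self):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.BatchNorm2d(32), nn.ReLU(), ECA(32)),
            nn.Sequential(nn.Conv2d(32, 64, 3, 2, 1), nn.BatchNorm2d(64), nn.ReLU(), ECA(64)),
            nn.Sequential(nn.Conv2d(64, 128, 3, 2, 1), nn.BatchNorm2d(128), nn.ReLU(), ECA(128)),
        ])

    def forward(self, x):
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)
        return feats                                      # per-layer features for matching


def anomaly_score(teacher_feats, student_feats):
    """Per-sample sum of layer-wise cosine distances between teacher and student features.
    Assumes the teacher features were already projected to the student's shapes.
    Larger scores indicate a larger discrepancy, i.e., more likely anomalous."""
    score = 0.0
    for t, s in zip(teacher_feats, student_feats):
        t = F.normalize(t.flatten(1), dim=1)
        s = F.normalize(s.flatten(1), dim=1)
        score = score + (1.0 - (t * s).sum(dim=1))
    return score
```

At test time, an input would be passed through both networks and thresholding `anomaly_score` would separate normal from anomalous samples; the threshold and the feature-alignment layers are design choices left open in this sketch.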