Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 220900034-10. doi: 10.11896/jsjkx.220900034

• Big Data & Data Science •

Novelty Detection Method Based on Knowledge Distillation and Efficient Channel Attention

ZHOU Shijin, XING Hongjie   

  1. Hebei Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002, China
  • Published:2023-11-09
  • About author: ZHOU Shijin, born in 1997, postgraduate. His main research interests include novelty detection and generative adversarial networks.
    XING Hongjie, born in 1976, Ph.D., professor, master's supervisor. His main research interests include kernel methods, neural networks, novelty detection, and ensemble learning.
  • Supported by:
    National Natural Science Foundation of China (61672205), Natural Science Foundation of Hebei Province (F2017201020), High-Level Talents Research Start-Up Project of Hebei University (521100222002), Affiliated Hospital Foundation Project of Hebei University (2019Q003), and Open Foundation of Engineering Research Center of Intelligent Computing for Complex Energy Systems (ESIC202101).

Abstract: Knowledge distillation based novelty detection methods usually take a pre-trained network as the teacher network and use a student network with the same model structure and size as the teacher. For test data, the difference between the teacher network and the student network is used to classify each sample as normal or novel. However, giving the two networks the same structure and size has two drawbacks. On the one hand, the difference between the two networks on novel data may be small. On the other hand, because the data set used to pre-train the teacher network is much larger than the training set of the student network, the student network may acquire a large amount of redundant information. To address these problems, the efficient channel attention (ECA) module is introduced into the knowledge distillation based novelty detection method. Using the cross-channel interaction strategy, a student network with a simpler structure and a smaller size than the teacher network is designed. As a result, the features of the normal data can be obtained efficiently, redundant information can be removed, the difference between the teacher network and the student network can be enlarged, and the novelty detection performance can be improved. Experimental results on 6 image data sets show that the proposed method achieves better detection performance than 5 related methods.

Key words: Novelty detection, Knowledge distillation, Attention mechanism, Teacher network, Student network

CLC Number: 

  • TP391.4
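
The two ingredients named in the abstract, efficient channel attention and a teacher-student difference used as a novelty score, can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: feature maps are plain nested lists, the 1D convolution weights `kernel` are supplied by the caller rather than learned, and `novelty_score` is a simple mean squared difference, which may differ from the discrepancy measure the authors actually use. The function names `eca` and `novelty_score` are hypothetical.

```python
import math


def eca(feature_map, kernel):
    """Apply an ECA-style channel reweighting (illustrative sketch).

    feature_map: list of C channels, each an H x W nested list of floats.
    kernel: odd-length list of 1D convolution weights (assumed given here;
            in a trained network these weights would be learned).
    Returns the reweighted feature map and the per-channel weights.
    """
    # Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Local cross-channel interaction: 1D convolution along the channel
    # axis with zero padding, so each channel only sees its neighbours.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(w * padded[c + j] for j, w in enumerate(kernel))
        for c in range(len(pooled))
    ]
    # Sigmoid gate gives a weight in (0, 1) for every channel.
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # Rescale every value of a channel by that channel's weight.
    out = [
        [[w * v for v in row] for row in ch]
        for w, ch in zip(weights, feature_map)
    ]
    return out, weights


def novelty_score(teacher_feat, student_feat):
    """Mean squared difference between two equally shaped feature maps.

    Assumption: a larger value means the student reproduces the teacher
    less well, which is read as evidence that the input is novel.
    """
    total, count = 0.0, 0
    for t_ch, s_ch in zip(teacher_feat, student_feat):
        for t_row, s_row in zip(t_ch, s_ch):
            for t, s in zip(t_row, s_row):
                total += (t - s) ** 2
                count += 1
    return total / count


if __name__ == "__main__":
    # Toy 3-channel, 2 x 2 feature map standing in for a teacher output.
    teacher = [
        [[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]],
        [[3.0, 3.0], [3.0, 3.0]],
    ]
    # Toy "student" output: the same map passed through the ECA gate.
    student, channel_weights = eca(teacher, [1 / 3, 1 / 3, 1 / 3])
    print(channel_weights)
    print(novelty_score(teacher, student))
```

In this toy setting a test sample would be flagged as novel when its score exceeds a threshold chosen on normal data; the threshold and the layers at which features are compared are left open here because the abstract does not specify them.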