Computer Science (计算机科学), 2022, Vol. 49, Issue 7: 164-169. doi: 10.11896/jsjkx.210600044

• Artificial Intelligence •

  • Corresponding author: HUANG Sheng-jun (huangsj@nuaa.edu.cn)
  • First author: ZHOU Hui (zhouhui@nuaa.edu.cn)

Robust Deep Neural Network Learning Based on Active Sampling

ZHOU Hui1,2, SHI Hao-chen1,2, TU Yao-feng1,3, HUANG Sheng-jun1,2   

  1 College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
    2 MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China
    3 State Key Laboratory of Mobile Network and Mobile Multimedia Technology, Shenzhen, Guangdong 518057, China
  • Received: 2021-06-04 Revised: 2021-10-19 Online: 2022-07-15 Published: 2022-07-12
  • About author: ZHOU Hui, born in 1997, master. Her main research interests include machine learning.
    HUANG Sheng-jun, born in 1987, professor. His main research interests include machine learning and data mining.
  • Supported by:
    Technological Innovation 2030-“New Generation Artificial Intelligence” Major Project (2020AAA0107000) and the National Natural Science Foundation of China (62076128).


Abstract: Recently, deep learning models have been widely used in various real-world tasks, and improving the robustness of deep neural networks has become an important research direction in machine learning. Recent work shows that training a deep model with noise perturbations can significantly improve its robustness. However, such training requires a large set of precisely labeled examples, which are often expensive and difficult to collect in real-world scenarios. Active learning (AL) is a primary approach for reducing the labeling cost: it progressively selects the most useful samples and queries their labels, with the goal of training an effective model with fewer queries. This paper proposes an active-sampling-based robust neural network learning framework, which aims to improve model robustness at low labeling cost. In this framework, an inconsistency-based sampling strategy measures the potential utility of each unlabeled example for improving model robustness by the disagreement among the model's predictions on a series of perturbed copies of the example. The examples with the largest inconsistency are then selected, labeled, and used to train the deep model with noise perturbations. Experimental results on benchmark image classification datasets show that the inconsistency-based active sampling strategy can effectively improve the robustness of deep neural network models at lower labeling cost.
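The inconsistency-based selection step described in the abstract can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the `predict_fn` interface, Gaussian input noise, and scoring by mean absolute deviation from the clean prediction are all illustrative choices standing in for the paper's actual perturbation generator and disagreement measure.

```python
import numpy as np

def inconsistency_scores(predict_fn, unlabeled, n_perturb=5, noise_std=0.1, seed=0):
    """Score each unlabeled sample by prediction disagreement across noisy copies.

    predict_fn: maps a batch of inputs (N, d) to class probabilities (N, C).
    A higher score means the model is less stable under perturbation, so the
    sample is assumed to be more useful for robustness-oriented training.
    """
    rng = np.random.default_rng(seed)
    base = predict_fn(unlabeled)                 # predictions on clean inputs
    score = np.zeros(len(unlabeled))
    for _ in range(n_perturb):
        noisy = unlabeled + rng.normal(0.0, noise_std, size=unlabeled.shape)
        pred = predict_fn(noisy)
        # accumulate absolute deviation from the clean prediction
        score += np.abs(pred - base).sum(axis=1)
    return score / n_perturb

def select_for_labeling(predict_fn, unlabeled, budget, **kwargs):
    """Return indices of the `budget` most inconsistent unlabeled samples."""
    scores = inconsistency_scores(predict_fn, unlabeled, **kwargs)
    return np.argsort(scores)[::-1][:budget]
```

In a full active learning round, the selected indices would be sent to an annotator, and the newly labeled samples added to the training set for the next iteration of noise-perturbed training.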

Key words: Active learning, Deep learning, Inconsistency, Model robustness, Noise perturbations

CLC Number: TP181