Computer Science ›› 2025, Vol. 52 ›› Issue (1): 289-297. DOI: 10.11896/jsjkx.231100075

• Artificial Intelligence •

Active Learning Based on Maximum Influence Set

LI Yahe, XIE Zhipeng   

  1. School of Computer Science, Fudan University, Shanghai 200438, China
  • Received: 2023-11-13 Revised: 2023-12-20 Online: 2025-01-15 Published: 2025-01-09
  • About author: LI Yahe, born in 2000, postgraduate. His main research interests include machine learning and natural language processing.
    XIE Zhipeng, born in 1976, Ph.D., associate professor, Ph.D. supervisor, is a member of CCF (No.50903M). His main research interests include data mining, machine learning and natural language processing.

Abstract: With the continuous progress of deep learning, it has been widely applied in numerous fields. However, training deep models requires a large amount of labeled data, at a high cost in time and resources. How to maximize model performance with the least amount of labeled data has therefore become an important research topic. Active learning addresses this issue by selecting the most valuable samples for annotation and using them for model training. Traditional active learning approaches usually concentrate on uncertainty or diversity, aiming to query the most difficult or the most representative samples. However, these methods typically consider only one-sided effects and overlook the interaction between labeled and unlabeled data in active learning scenarios. Another family of active learning methods employs auxiliary networks for sample selection, but these usually incur higher computational complexity. This paper proposes a novel active learning approach that optimizes the model's total performance gain by taking sample-to-sample interactions into account, jointly measuring the local uncertainty of candidate samples and their influence on other samples. The method first estimates the mutual influence between samples from the distances between their hidden-layer representations, then estimates the potential gain each candidate sample can bring from its influence and the uncertainty of the unlabeled samples, and iteratively selects the sample with the highest global gain for annotation. The proposed method is compared with other active learning strategies on a series of tasks across several domains. Experimental results demonstrate that it outperforms all competitors on all tasks. Further quantitative analysis shows that it balances uncertainty and diversity well, and explores which factors should be emphasized at different stages of active learning.
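The paper's implementation is not reproduced here; as a rough illustration of the selection loop the abstract describes, the following sketch assumes an exponential distance-decay kernel for inter-sample influence and predictive entropy as the local uncertainty score. The function name select_batch, the bandwidth parameter, and the uncertainty-discounting step are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def entropy(probs):
    """Predictive entropy of softmax outputs, used as the uncertainty score."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

def select_batch(hidden, probs, candidates, budget, bandwidth=1.0):
    """Greedily pick `budget` samples with the highest global gain.

    hidden:     (N, D) hidden-layer representations of the unlabeled pool
    probs:      (N, C) predicted class probabilities for the same samples
    candidates: indices eligible for annotation
    """
    uncertainty = entropy(probs)                   # per-sample uncertainty
    remaining = list(candidates)
    selected = []
    for _ in range(min(budget, len(remaining))):
        best_i, best_gain = None, -np.inf
        for i in remaining:
            # influence of candidate i on every sample decays with the
            # distance between hidden representations (assumed kernel)
            dist = np.linalg.norm(hidden - hidden[i], axis=1)
            influence = np.exp(-dist / bandwidth)
            gain = float(influence @ uncertainty)  # global potential gain
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
        remaining.remove(best_i)
        # discount the uncertainty already "covered" by the chosen sample,
        # so later picks favor regions it does not influence
        dist = np.linalg.norm(hidden - hidden[best_i], axis=1)
        uncertainty = uncertainty * (1.0 - np.exp(-dist / bandwidth))
    return selected
```

In this sketch the discounting step is what couples uncertainty with diversity: once a sample is chosen, nearby samples contribute less to subsequent gains, steering later picks toward uncovered regions.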

Key words: Active learning, Deep learning, Uncertainty

CLC Number: TP391
[1]LEWIS D D,CATLETT J.Heterogeneous uncertainty sampling for supervised learning [C]//Machine Learning Proceedings 1994.Elsevier,1994:148-156.
[2]COHN D A,GHAHRAMANI Z,JORDAN M I.Active learning with statistical models [J].Journal of Artificial Intelligence Research,1996,4:129-145.
[3]LEWIS D D.A sequential algorithm for training text classifiers:Corrigendum and additional data[C]//ACM SIGIR Forum.1995:13-19.
[4]GAL Y,ISLAM R,GHAHRAMANI Z.Deep Bayesian active learning with image data[C]//Proceedings of the 34th International Conference on Machine Learning.2017:1183-1192.
[5]NGUYEN H T,SMEULDERS A.Active learning using pre-clustering[C]//Proceedings of the Twenty-first International Conference on Machine Learning.2004.
[6]SENER O,SAVARESE S.Active learning for convolutional neural networks:A core-set approach[C]//International Conference on Learning Representations.2018:1-13.
[7]ASH J T,ZHANG C,KRISHNAMURTHY A,et al.Deep batch active learning by diverse,uncertain gradient lower bounds[C]//International Conference on Learning Representations.2020:1-26.
[8]GISSIN D,SHALEV-SHWARTZ S.Discriminative active learning [J].arXiv:1907.06347,2019.
[9]YOO D,KWEON I S.Learning loss for active learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2019:93-102.
[10]ROTH D,SMALL K.Margin-based active learning for structured output spaces[C]//European Conference on Machine Learning.2006:413-424.
[11]MARGATINA K,VERNIKOS G,BARRAULT L,et al.Active learning by acquiring contrastive examples[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.2021:650-663.
[12]HOULSBY N,HUSZÁR F,GHAHRAMANI Z,et al.Bayesian active learning for classification and preference learning [J].arXiv:1112.5745,2011.
[13]GAL Y,GHAHRAMANI Z.Dropout as a Bayesian approximation:Representing model uncertainty in deep learning[C]//Proceedings of the 33rd International Conference on Machine Learning.2016:1050-1059.
[14]SIDDHANT A,LIPTON Z C.Deep Bayesian active learning for natural language processing:Results of a large-scale empirical study[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.2018:2904-2909.
[15]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2019:4171-4186.
[16]EIN-DOR L,HALFON A,GERA A,et al.Active Learning for BERT:An Empirical Study[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.2020:7949-7962.
[17]SHELMANOV A,PUZYREV D,KUPRIYANOVA L,et al.Active learning for sequence tagging with deep pre-trained models and Bayesian uncertainty estimates[C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics.2021:1698-1712.
[18]MARGATINA K,BARRAULT L,ALETRAS N.On the Importance of Effectively Adapting Pretrained Language Models for Active Learning[C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics(Volume 2:Short Papers).2022:825-836.
[19]LINDENBAUM M,MARKOVITCH S,RUSAKOV D.Selective sampling for nearest neighbor classifiers [J].Machine Learning,2004,54(2):125-152.
[20]WAN F,YUAN T,FU M,et al.Nearest Neighbor Classifier Embedded Network for Active Learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2021:10041-10048.
[21]ARTHUR D,VASSILVITSKII S.K-means++:The advantages of careful seeding[C]//Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms.2007:1027-1035.
[22]YUAN D,CHANG X,LIU Q,et al.Active Learning for Deep Visual Tracking [J].IEEE Transactions on Neural Networks and Learning Systems,2023,35(10):13284-13296.
[23]FREYTAG A,RODNER E,DENZLER J.Selecting influential examples:Active learning with expected model output changes[C]//European Conference on Computer Vision.2014:562-577.
[24]KÄDING C,RODNER E,FREYTAG A,et al.Active and continuous exploration with deep neural networks and expected model output changes [J].arXiv:1612.06129,2016.
[25]ROY N,MCCALLUM A.Toward optimal active learning through sampling estimation of error reduction[C]//Proceedings of the Eighteenth International Conference on Machine Learning.2001:441-448.
[26]MAC AODHA O,CAMPBELL N D F,KAUTZ J,et al.Hierarchical subquery evaluation for active learning on a graph[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2014:564-571.
[27]SETTLES B,CRAVEN M,RAY S.Multiple-instance active learning[C]//Advances in Neural Information Processing Systems 20,Proceedings of the Twenty-First Annual Conference on Neural Information Processing Systems.2007:1289-1296.
[28]FANG M,LI Y,COHN T.Learning how to active learn:A deep reinforcement learning approach[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing.2017:595-605.
[29]LIU M,BUNTINE W,HAFFARI G.Learning how to actively learn:A deep imitation learning approach[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2018:1874-1883.
[30]GOODFELLOW I,POUGET-ABADIE J,MIRZA M,et al.Generative adversarial nets[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems(Volume 2).2014:2672-2680.
[31]SINHA S,EBRAHIMI S,DARRELL T.Variational adversarial active learning[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision.2019:5972-5981.
[32]KIM K,PARK D,KIM K I,et al.Task-aware variational adversarial active learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2021:8166-8175.
[33]ZHANG B,LI L,YANG S,et al.State-relabeling adversarial active learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:8756-8765.
[34]CHO J W,KIM D J,JUNG Y,et al.MCDAL:Maximum classifier discrepancy for active learning [J].IEEE Transactions on Neural Networks and Learning Systems,2023,34(11):8753-8763.
[35]GENG L,LIU N,QIN J.Multi-classifier adversarial optimization for active learning[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2023:7687-7695.
[36]ZHOU H,SHI H C,TU Y F,et al.Robust Deep Neural Network Learning Based on Active Sampling [J].Computer Science,2022,49(7):164-169.
[37]DING H,ZOU P,ZHAO J,et al.Active Learning-based Text Entity and Relation Joint Extraction Method [J].Computer Science,2023,50(10):126-134.
[38]DHIMAN G,KUMAR A V,NIRMALAN R,et al.Multi-modal active learning with deep reinforcement learning for target feature extraction in multi-media image processing applications [J].Multimedia Tools and Applications,2023,82(4):5343-5367.
[39]LECUN Y,BOTTOU L,BENGIO Y,et al.Gradient-based learning applied to document recognition [J].Proceedings of the IEEE,1998,86(11):2278-2324.
[40]VOORHEES E M,TICE D M.Building a question answering test collection[C]//Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.2000:200-207.
[41]SCHUSTER M,PALIWAL K.Bidirectional recurrent neural networks [J].IEEE Transactions on Signal Processing,1997,45(11):2673-2681.
[42]PENNINGTON J,SOCHER R,MANNING C D.GloVe:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.2014:1532-1543.
[43]KINGMA D P,BA J.Adam:A method for stochastic optimization[C]//International Conference on Learning Representations.2015:1-15.
[44]YUAN M,LIN H T,BOYD-GRABER J.Cold-start active learning through self-supervised language modeling[C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing.2020:7935-7948.
[45]ZHDANOV F.Diverse mini-batch active learning [J].arXiv:1901.05954,2019.
[46]ZHU J,WANG H,YAO T,et al.Active learning with sampling by uncertainty and density for word sense disambiguation and text classification[C]//Proceedings of the 22nd International Conference on Computational Linguistics(Coling 2008).2008:1137-1144.