Computer Science ›› 2020, Vol. 47 ›› Issue (9): 190-197.doi: 10.11896/jsjkx.200700077

• Artificial Intelligence •

Active Label Distribution Learning Based on Marginal Probability Distribution Matching

DONG Xin-yue, FAN Rui-dong, HOU Chen-ping   

  1. College of Liberal Arts and Sciences,National University of Defense Technology,Changsha 410008,China
  • Received:2020-05-05 Published:2020-09-10
  • About author:DONG Xin-yue,born in 1996,postgraduate.Her main research interests include statistical data analysis and machine learning.
    HOU Chen-ping,born in 1982,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation.His main research interests include machine learning,statistical data analysis,pattern recognition and computer vision.
  • Supported by:
    National Natural Science Foundation of China (61922087,61906201) and Natural Science Foundation for Distinguished Young Scholars of Hunan Province (2019JJ20020).

Abstract: Label distribution learning (LDL) is a learning paradigm for instances annotated with label distributions, and in recent years it has been successfully applied to real-world tasks such as facial age estimation, head pose estimation, and emotion recognition. Training an LDL model with good predictive performance requires sufficient data labeled with label distributions. However, label distribution learning often faces the dilemma that labeled data is scarce, and annotating enough label-distribution data incurs a high cost. The Active Label Distribution Learning based on Marginal probability Distribution matching (ALDL-MMD) algorithm is designed to address this high annotation cost by reducing the amount of labeled data required to train the model. ALDL-MMD trains a linear regression model and, while minimizing its training error, learns a sparse vector indicating which instances in the unlabeled data set are selected for annotation, so that after instance selection the data distributions of the training set and the unlabeled set are as similar as possible. The selection vector is relaxed to ease computation. An effective method for optimizing the ALDL-MMD objective function is given, together with a proof of its convergence. Experimental results on multiple label distribution data sets show that ALDL-MMD is superior to existing active instance selection methods on two evaluation measures of label-distribution prediction accuracy, the Canberra metric (a distance) and Intersection (a similarity), which reflects its effectiveness in reducing annotation cost.
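ALDL-MMD itself solves a relaxed convex program over a selection vector; as a simplified illustration of the marginal-distribution-matching idea only (not the paper's solver), the greedy sketch below selects unlabeled instances so that the MMD between the augmented training set and the unlabeled pool is minimized. The RBF kernel, the greedy strategy, and all function names here are illustrative assumptions.

```python
import numpy as np

def mmd_sq(A, B, gamma=1.0):
    """Squared maximum mean discrepancy between sample sets A and B,
    using an RBF kernel (illustrative choice)."""
    def k(X, Y):
        d = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d)
    return k(A, A).mean() - 2 * k(A, B).mean() + k(B, B).mean()

def greedy_mmd_select(X_labeled, X_pool, batch_size, gamma=1.0):
    """Greedy stand-in for the relaxed selection vector: repeatedly add the
    pool instance whose inclusion makes the distribution of
    labeled + selected instances best match the pool distribution."""
    selected, remaining = [], list(range(len(X_pool)))
    for _ in range(batch_size):
        best_i, best_val = None, np.inf
        for i in remaining:
            cand = np.vstack([X_labeled, X_pool[selected + [i]]])
            val = mmd_sq(cand, X_pool, gamma)
            if val < best_val:
                best_i, best_val = i, val
        selected.append(best_i)
        remaining.remove(best_i)
    return selected  # indices of pool instances to send for annotation
```

The actual algorithm replaces this greedy loop with a single relaxed optimization jointly constrained by the linear regression training error.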

Key words: Active learning, Label distribution learning, Linear model, Marginal probability distribution matching, Maximum mean discrepancy
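The two evaluation measures named in the abstract have simple closed forms for a pair of label distributions p (predicted) and q (ground truth); a minimal sketch, with hypothetical function names:

```python
import numpy as np

def canberra(p, q):
    """Canberra metric between two label distributions (lower is better)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    denom = np.abs(p) + np.abs(q)
    mask = denom > 0  # skip labels whose description degree is zero in both
    return float(np.sum(np.abs(p - q)[mask] / denom[mask]))

def intersection(p, q):
    """Intersection similarity between two label distributions
    (higher is better; equals 1 when p == q and both sum to 1)."""
    return float(np.sum(np.minimum(p, q)))
```

For identical distributions the Canberra metric is 0 and the Intersection is 1; for disjoint distributions the Intersection is 0.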

CLC Number: TP391