Computer Science ›› 2017, Vol. 44 ›› Issue (1): 37-41. doi: 10.11896/j.issn.1002-137X.2017.01.007

• 2016 Sixth China Conference on Data Mining •

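Online Sequential Active Learning Approach

ZHAI Jun-hai, ZANG Li-guang and ZHANG Su-fang

  1. Key Laboratory of Machine Learning and Computational Intelligence of Hebei Province, College of Mathematics and Information Science, Hebei University, Baoding 071002; College of Computer Science and Technology, Hebei University, Baoding 071002; Hebei Branch, China Meteorological Administration Training Centre, Baoding 071000
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (71371063), the Natural Science Foundation of Hebei Province (F2013201220), the Key Scientific and Technological Research Project of Higher Education Institutions of Hebei Province (ZD20131028), and the Scientific and Technological Research Project of Higher Education Institutions of Hebei Province (QN20131153).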

Abstract: Large amounts of unlabeled data exist in the real world, such as medical images and web pages, and the situation is even more pronounced in the era of big data. Labeling such data is expensive. Active learning is an effective way to address this problem and has been an active research topic in machine learning and data mining in recent years. This paper proposes an active learning algorithm based on the online sequential extreme learning machine (OS-ELM). Because OS-ELM learns incrementally, the proposed algorithm can significantly improve the efficiency of the learning system. In addition, the algorithm uses instance entropy as a heuristic to measure the importance of unlabeled instances, and uses a K-nearest neighbor classifier as the oracle that labels the selected instances. Experimental results show that the proposed algorithm learns quickly and labels accurately.

Key words: Active learning, Extreme learning machine, Online sequential learning, Instance entropy, K-nearest neighbors
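
The selection and labeling steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: instance entropy is taken to be the Shannon entropy of a class-probability vector, raw classifier outputs are turned into probabilities with a softmax, and the oracle is a plain Euclidean K-nearest-neighbor vote. The names `score_fn` and `update_fn` are hypothetical placeholders for the incremental learner (for example an OS-ELM chunk update), which is not implemented here.

```python
import math
from collections import Counter


def softmax(scores):
    """Map raw classifier outputs to probabilities (an assumption of this sketch)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def instance_entropy(probs):
    """Shannon entropy (base 2) of one instance's class-probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def select_most_uncertain(prob_rows, q):
    """Return the indices of the q instances with the highest entropy."""
    order = sorted(range(len(prob_rows)),
                   key=lambda i: instance_entropy(prob_rows[i]),
                   reverse=True)
    return order[:q]


def knn_label(x, labelled_X, labelled_y, k=3):
    """Majority label among the k nearest labelled instances (Euclidean distance)."""
    nearest = sorted(range(len(labelled_X)),
                     key=lambda i: math.dist(x, labelled_X[i]))[:k]
    votes = Counter(labelled_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def active_learning_round(pool, labelled_X, labelled_y,
                          score_fn, update_fn, q=1, k=3):
    """One round: score the pool, pick the q most uncertain instances,
    label them with the kNN oracle, and pass the new chunk to the learner."""
    probs = [softmax(score_fn(x)) for x in pool]
    chosen = select_most_uncertain(probs, q)
    chunk_X = [pool[i] for i in chosen]
    chunk_y = [knn_label(x, labelled_X, labelled_y, k) for x in chunk_X]
    update_fn(chunk_X, chunk_y)  # hypothetical incremental update hook
    chosen_set = set(chosen)
    remaining = [x for i, x in enumerate(pool) if i not in chosen_set]
    return remaining, labelled_X + chunk_X, labelled_y + chunk_y
```

In this sketch only the newly labeled chunk is handed to `update_fn`, which mirrors the incremental-learning idea the abstract credits for the efficiency gain: the learner is updated with new data rather than retrained on the full labeled set each round.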

