Computer Science ›› 2019, Vol. 46 ›› Issue (6A): 56-59.

• Intelligent Computing

Automatic Extraction of Diversity Keyphrase by Utilizing Integer Linear Programming

LI Shan-shan, CHEN Li, TANG Yu-ting, WANG Yi-lin, YU Zhong-hua   

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
  • Online: 2019-06-14 Published: 2019-07-02

Abstract: Keyphrases are concise summaries of a text that represent its main topics and core ideas. Automatic keyphrase extraction is an important task in natural language processing and information retrieval. To address the semantic over-generation of candidate phrases in unsupervised methods, this paper proposes an algorithm for automatic keyphrase extraction that uses integer linear programming (ILP) and the similarity between candidate phrases: candidate phrases with high semantic similarity are penalized while the objective function is maximized, so that diversified keyphrases are obtained. The TextRank and TFIDF algorithms are applied to generate candidate phrases on two different corpora, and the proposed optimization algorithm is then used to optimize the weight scores of the candidate phrases. Finally, the results of the proposed optimization algorithm are compared with those of the baseline methods. The experimental results show that the proposed method can effectively solve the semantic over-generation problem by penalizing candidate phrases with high semantic similarity. Moreover, the optimization algorithm obtains more diverse keyphrases, and its precision (P), recall (R) and F values outperform those of the baseline methods.
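
The idea of penalizing similar candidates while maximizing an objective can be illustrated with a minimal sketch. The abstract does not give the paper's exact objective or constraints, so everything below is an assumption made for illustration: the function name `select_diverse`, the objective (sum of candidate scores minus a penalty times the pairwise similarity of the selected phrases), the limit `k` on the number of selected phrases, and the toy scores and similarities. The tiny 0-1 program is solved by exhaustive enumeration rather than by an ILP solver, which is only feasible for a handful of candidates.

```python
from itertools import combinations


def select_diverse(scores, sim, k, penalty=1.0):
    """Pick at most k phrases maximizing score minus similarity penalty.

    Hypothetical objective (not taken from the paper):
        sum of scores of selected phrases
        - penalty * sum of similarities over selected pairs.
    Every subset of size 1..k is enumerated, which solves this small
    0-1 program exactly but scales exponentially with the candidates.
    """
    phrases = list(scores)
    best_value, best_set = float("-inf"), ()
    for size in range(1, k + 1):
        for subset in combinations(phrases, size):
            relevance = sum(scores[p] for p in subset)
            redundancy = sum(
                sim.get(frozenset(pair), 0.0)
                for pair in combinations(subset, 2)
            )
            value = relevance - penalty * redundancy
            if value > best_value:
                best_value, best_set = value, subset
    return list(best_set), best_value


# Toy candidate scores, e.g. as a TextRank or TFIDF ranker might assign.
scores = {
    "keyphrase extraction": 0.90,
    "automatic keyphrase extraction": 0.85,
    "integer linear programming": 0.80,
    "semantic similarity": 0.60,
}
# Toy pairwise similarities; pairs not listed are treated as 0.
sim = {
    frozenset(("keyphrase extraction", "automatic keyphrase extraction")): 0.95,
    frozenset(("integer linear programming", "semantic similarity")): 0.10,
}

selected, value = select_diverse(scores, sim, k=3)
print(selected, round(value, 2))
```

With these toy values, a plain top-3 ranking would keep both "keyphrase extraction" and "automatic keyphrase extraction", whereas the penalized selection drops the near-duplicate in favor of "semantic similarity". For realistic candidate sets, the same objective would be handed to an ILP solver instead of being enumerated.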

Key words: Automatic keyphrase extraction, Diversity, Integer linear programming, Semantic over-generation

CLC Number: 

  • TP309