Computer Science ›› 2019, Vol. 46 ›› Issue (6A): 56-59.

• Intelligent Computing •

Automatic Extraction of Diversity Keyphrase by Utilizing Integer Linear Programming

LI Shan-shan, CHEN Li, TANG Yu-ting, WANG Yi-lin, YU Zhong-hua   

  1. College of Computer Science, Sichuan University, Chengdu 610065, China
  • Online: 2019-06-14 Published: 2019-07-02

Abstract: Keyphrases are concise summaries of a text that represent its main topics and core ideas, and automatic keyphrase extraction is an important task in natural language processing and information retrieval. To address the semantic over-generation problem that affects candidate phrases in unsupervised methods, this paper proposes a keyphrase extraction algorithm based on integer linear programming (ILP) and the similarity between candidate phrases: candidate phrases with high semantic similarity are penalized when maximizing the objective function, so that the selected keyphrases are diverse. TextRank and TF-IDF are used to generate candidate phrases on two different corpora, and the proposed optimization algorithm then optimizes the weight scores of these candidates. Compared with the baseline methods, the experimental results show that penalizing candidate phrases with high semantic similarity effectively addresses the semantic over-generation problem. The optimization algorithm also yields more diverse keyphrases, and its precision (P), recall (R), and F-measure (F) values exceed those of the baseline methods.

Key words: Automatic keyphrase extraction, Integer linear programming, Semantic over-generation, Diversity

CLC Number: 

  • TP309
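
The selection idea summarized in the abstract (maximize candidate scores while penalizing semantically similar pairs) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the objective form (sum of scores minus a weighted sum of pairwise similarities), the fixed selection size k, the `penalty` parameter, the function name `select_diverse`, and the toy scores and similarity values. Exhaustive search over subsets stands in for an ILP solver, which is only practical for very small candidate sets.

```python
from itertools import combinations


def select_diverse(scores, sim, k, penalty=1.0):
    """Pick k candidates maximizing total score minus a pairwise-similarity penalty.

    scores:  dict mapping candidate phrase -> weight (e.g. from TextRank or TF-IDF)
    sim:     dict mapping frozenset({a, b}) -> similarity; missing pairs count as 0
    penalty: hypothetical trade-off weight between score and diversity
    """
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of candidates")
    best_set, best_val = None, float("-inf")
    # Enumerate every size-k subset; this is the 0-1 selection an ILP solver would search.
    for subset in combinations(sorted(scores), k):
        gain = sum(scores[p] for p in subset)
        cost = sum(sim.get(frozenset(pair), 0.0) for pair in combinations(subset, 2))
        val = gain - penalty * cost
        if val > best_val:
            best_set, best_val = subset, val
    return list(best_set), best_val


# Toy data (made up): two near-duplicate candidates have the highest scores.
scores = {
    "keyphrase extraction": 0.9,
    "automatic keyphrase extraction": 0.85,
    "integer linear programming": 0.7,
    "semantic similarity": 0.5,
}
sim = {
    frozenset(("keyphrase extraction", "automatic keyphrase extraction")): 0.9,
}

selected, value = select_diverse(scores, sim, k=2)
print(selected, value)
```

With `penalty=0.0` the two near-duplicate phrases are selected together, which is the over-generation behavior the abstract describes; with the similarity penalty enabled, one of them is replaced by a dissimilar candidate.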