Computer Science ›› 2023, Vol. 50 ›› Issue (4): 181-187. doi: 10.11896/jsjkx.220700164

• Artificial Intelligence •

Incorporating Multi-granularity Extractive Features for Keyphrase Generation

ZHEN Tiange, SONG Mingyang, JING Liping   

  1. School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
  2. Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China
  • Received: 2022-07-18  Revised: 2022-11-21  Online: 2023-04-15  Published: 2023-04-06
  • About author: ZHEN Tiange, born in 1997, bachelor's degree. Her main research interests include machine learning and natural language processing.
    JING Liping, born in 1978, Ph.D., professor, Ph.D. supervisor, is a senior member of China Computer Federation. Her main research interests include machine learning and its applications.
  • Supported by:
    National Natural Science Foundation of China (61822601, 61773050, 61632004), Natural Science Foundation of Beijing, China (Z180006) and Program of Beijing Municipal Science & Technology Commission (Z181100008918012).

Abstract: Keyphrases are a set of phrases that summarize the core theme and key content of a given text. As information overload grows increasingly serious, it is crucial to predict phrases that capture the central ideas of large volumes of textual information. Keyphrase prediction, as one of the basic tasks of natural language processing, has therefore received increasing attention from researchers. Its methods fall into two main categories: keyphrase extraction and keyphrase generation. Keyphrase extraction quickly and accurately extracts salient phrases that appear in the given text. Unlike keyphrase extraction, keyphrase generation predicts both phrases that appear in the given text and phrases that do not. Each approach has its own advantages and disadvantages. However, most existing work on keyphrase generation ignores the potential benefits that extractive features may bring to keyphrase generation models. Extractive features indicate important fragments of the original text and play an important role in helping the model learn a deep semantic representation of the original text. Therefore, combining the advantages of extractive and generative approaches, this paper proposes a new keyphrase generation model incorporating multi-granularity extractive features (MGE-Net). Compared with recent keyphrase generation models on a series of publicly available datasets, the proposed model achieves significant performance improvements on most evaluation metrics.
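To make the extraction-plus-generation idea concrete, the following is a minimal PyTorch sketch of a multi-task setup: a shared encoder, a token-level extraction head that tags keyphrase tokens in the source text, and a sequence-to-sequence decoder that generates keyphrases, trained with a joint loss. This is an illustrative assumption, not the authors' MGE-Net implementation; all names and hyperparameters (MultiTaskKeyphraseModel, hidden_dim, the 0.5 auxiliary loss weight, etc.) are hypothetical.

```python
# Illustrative multi-task keyphrase model (NOT the paper's MGE-Net):
# a shared encoder, an auxiliary extraction head, and a Seq2Seq decoder.
import torch
import torch.nn as nn


class MultiTaskKeyphraseModel(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # Shared bidirectional GRU encoder over the source document.
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # Extraction head: tags each source token as inside/outside a keyphrase.
        self.extract_head = nn.Linear(2 * hidden_dim, 2)
        # Generation decoder: autoregressive GRU over target keyphrase tokens.
        self.decoder = nn.GRU(emb_dim, 2 * hidden_dim, batch_first=True)
        self.gen_head = nn.Linear(2 * hidden_dim, vocab_size)

    def forward(self, src_ids, tgt_ids):
        src_emb = self.embed(src_ids)                          # (B, S, E)
        enc_out, enc_h = self.encoder(src_emb)                 # (B, S, 2H), (2, B, H)
        extract_logits = self.extract_head(enc_out)            # (B, S, 2)
        # Concatenate both directions' final states to initialize the decoder.
        dec_h0 = torch.cat([enc_h[0], enc_h[1]], dim=-1).unsqueeze(0)  # (1, B, 2H)
        dec_out, _ = self.decoder(self.embed(tgt_ids), dec_h0)         # (B, T, 2H)
        gen_logits = self.gen_head(dec_out)                    # (B, T, V)
        return extract_logits, gen_logits


# Toy joint training step: generation loss plus an auxiliary extraction loss.
model = MultiTaskKeyphraseModel(vocab_size=10000)
src = torch.randint(1, 10000, (4, 50))      # toy source documents
tgt = torch.randint(1, 10000, (4, 8))       # toy target keyphrase sequences
ext_labels = torch.randint(0, 2, (4, 50))   # toy per-token extraction labels
extract_logits, gen_logits = model(src, tgt[:, :-1])
loss = (nn.functional.cross_entropy(gen_logits.reshape(-1, 10000),
                                    tgt[:, 1:].reshape(-1))
        + 0.5 * nn.functional.cross_entropy(extract_logits.reshape(-1, 2),
                                            ext_labels.reshape(-1)))
```

In such a design, the extraction head supplies auxiliary supervision that pushes the shared encoder to highlight salient source fragments, which is the general role the abstract ascribes to extractive features; how MGE-Net actually injects multi-granularity extractive features is described in the full paper.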

Key words: Natural language processing, Sequence-to-Sequence, Keyphrase generation, Extractive features, Multi-task learning

CLC Number: TP391