Computer Science, 2020, Vol. 47, Issue (11A): 73-77. doi: 10.11896/jsjkx.200300121

• Artificial Intelligence •

Keyword Extraction Based on Multi-feature Fusion

DUAN Jian-yong, YOU Shi-xin, ZHANG Mei, WANG Hao   

  1. School of Information, North China University of Technology, Beijing 100144, China
  • Online: 2020-11-15 Published: 2020-11-17
  • About author: DUAN Jian-yong, born in 1978, Ph.D., professor, is a member of China Computer Federation. His main research interests include natural language processing.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61972003,61672040).

Abstract: With the development of the Internet, the volume of web page data, new media text and other data keeps growing, and full-text information retrieval is no longer efficient enough to support retrieval over massive data. Keyword extraction technology is therefore widely used in search engines (such as Baidu search) and new media services (such as news retrieval). The fusion model uses a BiLSTM-CRF structure and fuses multiple manual features, which lets it complete the keyword extraction task more effectively. On top of word embedding features, the fusion model incorporates part of speech, word frequency, word length and word position. This multidimensional feature information helps the model extract deep keyword feature information more comprehensively. The fusion model combines the strengths of deep learning, such as wide coverage and high learning ability, with the accurate expressiveness of manual features, which further improves its feature mining ability and shortens training time. In addition, a labeling method called LMRSN is adopted in the model to extract key phrases more effectively. Experimental results show that the fusion model achieves an F1 score of 62.08, and its performance is much better than that of the traditional model.
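
The feature fusion and labeling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the part-of-speech tag set, the feature layout, the toy embeddings and all function names are hypothetical, and the reading of LMRSN (L = first word of a key phrase, M = inner word, R = last word, S = single-word keyword, N = non-keyword) is an assumption, since the abstract does not expand the acronym. The fused vectors would then serve as per-token input to a BiLSTM-CRF tagger, which is omitted here.

```python
from collections import Counter

# Hypothetical POS tag inventory; the paper's actual tag set is not given.
POS_TAGS = ["NOUN", "VERB", "ADJ", "OTHER"]


def manual_features(tokens, pos_tags):
    """Return one hand-crafted feature vector per token.

    Layout (an assumption): one-hot part of speech, relative word
    frequency, word length in characters, relative position in the text.
    """
    counts = Counter(tokens)
    n = len(tokens)
    rows = []
    for i, (tok, pos) in enumerate(zip(tokens, pos_tags)):
        one_hot = [1.0 if pos == tag else 0.0 for tag in POS_TAGS]
        freq = counts[tok] / n
        length = float(len(tok))
        position = i / (n - 1) if n > 1 else 0.0
        rows.append(one_hot + [freq, length, position])
    return rows


def fuse(embeddings, features):
    """Concatenate each word embedding with its manual feature vector."""
    return [emb + feat for emb, feat in zip(embeddings, features)]


def lmrsn_tags(n_tokens, keyphrase_spans):
    """Tag tokens under the assumed reading of LMRSN.

    Spans are (start, end) token indices with end exclusive.
    """
    tags = ["N"] * n_tokens
    for start, end in keyphrase_spans:
        if end - start == 1:
            tags[start] = "S"
        else:
            tags[start] = "L"
            tags[end - 1] = "R"
            for k in range(start + 1, end - 1):
                tags[k] = "M"
    return tags


tokens = ["keyword", "extraction", "helps", "search"]
pos = ["NOUN", "NOUN", "VERB", "NOUN"]
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]  # toy vectors
fused = fuse(embeddings, manual_features(tokens, pos))
tags = lmrsn_tags(len(tokens), [(0, 2), (3, 4)])
```

Simple concatenation is used here because it keeps the manual features visible to the sequence model alongside the embedding; the abstract does not say how the paper combines them.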

Key words: Deep learning, Feature fusion, Information retrieval, Keyword extraction, Long short-term memory network

CLC Number: TP391.1