Chinese Short Text Keyphrase Extraction Model Based on Attention

doi:10.11896/jsjkx.181202261

Abstract

Abstract: Keyphrase extraction is an active research topic in natural language processing. Among current keyphrase extraction algorithms, deep learning methods seldom take the characteristics of Chinese into account and make insufficient use of character-level information, so keyphrase extraction from Chinese short texts still has considerable room for improvement. To improve keyphrase extraction from short texts, this paper targets automatic keyphrase extraction from paper abstracts and proposes BAST (Bidirectional Long Short-Term Memory and Attention Mechanism Based on Sequence Tagging), a sequence tagging model that combines a bidirectional long short-term memory network (BiLSTM) with an attention mechanism. First, word embeddings at word granularity and character embeddings at character granularity are each used to represent the input text. Next, the BAST model is trained: BiLSTM and the attention mechanism extract text features, and the label of each word is predicted by classification. Finally, the character-embedding model is used to correct the keyphrase extraction results of the word-embedding model. Experiments on 8159 paper abstracts show that the BAST model reaches an F1 score of 66.93%, which is 2.08% higher than the BiLSTM-CRF (Bidirectional Long Short-Term Memory and Conditional Random Field) algorithm and also improves on other traditional keyphrase extraction algorithms. The novelty of the model lies in combining the extraction results of the character-embedding and word-embedding models, which makes full use of the characteristics of Chinese text and extracts keyphrases from short texts effectively.

Key words: Attention mechanism, Character embedding, Keyphrase extraction, LSTM, Word embedding
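
The abstract frames keyphrase extraction as sequence tagging and reports results as an F1 score. The sketch below illustrates that framing only: it turns per-token labels into keyphrases and scores them against a reference set. The B/I/O tag scheme, the exact-match scoring, and the names decode_keyphrases and f1_score are assumptions made for illustration; the abstract does not state which tag set or matching criterion the authors used, and the BiLSTM and attention layers that would produce the tags are not shown.

```python
from typing import List, Set


def decode_keyphrases(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect keyphrases from a tagged token sequence.

    Assumes a B/I/O scheme (hypothetical here): "B" opens a keyphrase,
    "I" continues it, and "O" marks a token outside any keyphrase.
    """
    phrases: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                phrases.append("".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open phrase, closes any open phrase.
            if current:
                phrases.append("".join(current))
            current = []
    if current:
        phrases.append("".join(current))
    return phrases


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Exact-match F1 between predicted and reference keyphrase sets."""
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy input: a word-segmented Chinese phrase with made-up tags.
tokens = ["基于", "注意力", "机制", "的", "关键词", "抽取", "模型"]
tags = ["O", "B", "I", "O", "B", "I", "O"]

predicted = decode_keyphrases(tokens, tags)
gold = {"注意力机制", "关键词抽取", "字向量"}
score = f1_score(set(predicted), gold)
```

In this toy case both predicted phrases appear in the reference set of three, so precision is 1, recall is 2/3, and F1 is 0.8. Tokens are joined without spaces because the example is Chinese text; a character-level model would feed single characters through the same decoding step.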

CLC Number: TP391

YANG Dan-hao, WU Yue-xin, FAN Chun-xiao. Chinese Short Text Keyphrase Extraction Model Based on Attention[J]. Computer Science, 2020, 47(1): 193-198. https://doi.org/10.11896/jsjkx.181202261


