Computer Science ›› 2020, Vol. 47 ›› Issue (5): 198-203. doi: 10.11896/jsjkx.190300154

• Artificial Intelligence •

  • Corresponding author: CHEN Qian (chenqian857@163.com)
  • First author's e-mail: guoxinjsj@163.com

Candidate Sentences Extraction for Machine Reading Comprehension

GUO Xin1, ZHANG Geng1, CHEN Qian1,2, WANG Su-ge1,2   

  1. School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
  2. Key Laboratory of Computational Intelligence and Chinese Information Processing, Ministry of Education, Taiyuan 030006, China
  • Received:2019-03-28 Online:2020-05-15 Published:2020-05-19
  • About author: GUO Xin, Ph.D., lecturer. Her main research interests include feature learning and natural language processing.
    CHEN Qian, associate professor. His main research interests include topic detection and natural language processing.
  • Supported by:
    This work was supported by the Natural Science Foundation of Shanxi Province (201701D221101, 201901D111032), the National Natural Science Foundation of China (61502288, 61403238, 61673248) and the Key R&D Program of Shanxi Province (201803D421024).



Abstract: Enabling machines to understand human natural language is the ultimate goal of artificial intelligence in the cognitive field. Machine reading comprehension, following speech recognition and semantic understanding, poses a major challenge in natural language processing: it requires a computer to possess certain background knowledge, comprehensively understand a given text, and correctly answer questions according to that text. With the rapid development of deep learning, machine reading comprehension has become a hotspot of artificial intelligence research. It involves core technologies such as machine learning, information retrieval and semantic computing, and has broad application prospects in chatbots, question answering systems and intelligent education. This paper focuses on the micro-reading mode: candidate sentences containing the answer are extracted from a given text according to the question or options, narrowing the scope of inference and providing technical support for machine reading comprehension. Traditional feature-based methods consume a great deal of manpower. This paper instead casts candidate sentence extraction as a semantic relevance calculation problem and proposes a candidate sentence ranking model, Att-BiGRU/BiLSTM. First, bidirectional long short-term memory and gated recurrent units encode the semantics expressed in a sentence. Then, an Atten structure models semantic relevance by combining dissimilarity and similarity. Finally, the Adam algorithm is used to learn the model parameters. Experimental results on the SemEval-SICK dataset show that the model exceeds the BiGRU baseline by nearly 0.67 in Pearson correlation and by 16.83% in MSE on the test set, and converges faster, demonstrating that the bidirectional and Atten structures can greatly improve the accuracy of candidate sentence extraction.
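The similarity/dissimilarity combination and the evaluation metrics described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the mean-pooling "encoder" merely stands in for the BiGRU/BiLSTM, and all function names are hypothetical.

```python
import numpy as np

def encode_sentence(word_vectors):
    # Stand-in for the paper's BiGRU/BiLSTM encoder: mean-pool the
    # word vectors of a sentence into one fixed-size vector.
    return np.asarray(word_vectors, dtype=float).mean(axis=0)

def relatedness_features(h1, h2):
    # The Atten-style combination described in the abstract:
    # the element-wise product captures similarity, the absolute
    # difference captures dissimilarity; both are concatenated
    # before scoring semantic relevance.
    return np.concatenate([h1 * h2, np.abs(h1 - h2)])

def pearson(x, y):
    # Pearson correlation, the first evaluation metric on SICK.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

def mse(pred, gold):
    # Mean squared error, the second evaluation metric.
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    return float(np.mean((pred - gold) ** 2))
```

In the full model, the concatenated features for a question-sentence pair would feed a small regression layer whose parameters are learned with Adam against gold relatedness scores, and candidate sentences are ranked by the predicted score.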

Key words: Candidate sentence extraction, Gated recurrent unit, Long short-term memory, Semantic relevance calculation

CLC Number: TP391