Computer Science ›› 2020, Vol. 47 ›› Issue (5): 198-203. doi: 10.11896/jsjkx.190300154
GUO Xin1, ZHANG Geng1, CHEN Qian1,2, WANG Su-ge1,2
Abstract: Enabling machines to understand human natural language is the ultimate goal of artificial intelligence in the cognitive domain. Machine reading comprehension is a major challenge in natural language processing, following speech recognition and semantic understanding: it requires a computer to possess a degree of background knowledge, fully understand a given text, and answer questions based on its content. With the rapid development of deep learning, reading comprehension has become a hot research direction in artificial intelligence; it involves core technologies such as machine learning, information retrieval, and semantic computing, and has broad application prospects in chatbots, question-answering systems, intelligent education, and other fields. This paper focuses on the micro-reading mode: according to the question or its options, candidate sentences containing the answer are extracted from the given text, narrowing the scope of reasoning and providing technical support for machine reading comprehension. Traditional feature-based methods are labor-intensive, so this paper casts answer-candidate-sentence extraction as a semantic relatedness computation problem and proposes a candidate-sentence ranking method, the Att-BiGRU/BiLSTM model. First, bidirectional long short-term memory (BiLSTM) and gated recurrent units (GRU) are used to encode the semantic information expressed in a sentence; second, an attention (Atten) structure is designed to model semantic relatedness by combining dissimilarity and similarity; finally, the Adam algorithm is used to learn the model parameters. Experimental results on the SemEval-SICK dataset show that the model exceeds the BiGRU baseline by nearly 0.67 on the Pearson metric on the test set and by 16.83% on the MSE metric, with faster convergence, indicating that the bidirectional and Atten structures can greatly improve the accuracy of candidate sentence extraction.
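The abstract does not give the exact formulation of the Atten structure. A common way to combine similarity and dissimilarity signals for sentence relatedness is to take the elementwise product and the elementwise absolute difference of the two sentence encodings and feed them to a scoring layer. The following is a minimal sketch under that assumption; `h1` and `h2` stand for pre-computed BiGRU/BiLSTM sentence vectors, and `w`, `b` are hypothetical learned weights:

```python
import math

def relatedness_features(h1, h2):
    """Combine two sentence encodings into relatedness features.

    The elementwise product captures similarity between the encodings;
    the elementwise absolute difference captures dissimilarity.
    """
    h_mul = [a * b for a, b in zip(h1, h2)]
    h_abs = [abs(a - b) for a, b in zip(h1, h2)]
    return h_mul + h_abs  # concatenated feature vector

def relatedness_score(h1, h2, w, b):
    """Score semantic relatedness with a single linear layer + sigmoid."""
    feats = relatedness_features(h1, h2)
    z = sum(wi * f for wi, f in zip(w, feats)) + b
    return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)
```

In a full model the scoring layer's weights would be trained jointly with the encoders (here, with Adam); this sketch only illustrates how similarity and dissimilarity can enter a single relatedness score.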