Computer Science ›› 2020, Vol. 47 ›› Issue (5): 198-203.doi: 10.11896/jsjkx.190300154

• Artificial Intelligence •

Candidate Sentences Extraction for Machine Reading Comprehension

GUO Xin1, ZHANG Geng1, CHEN Qian1,2, WANG Su-ge1,2   

  1 School of Computer & Information Technology,Shanxi University,Taiyuan 030006,China
    2 Key Laboratory of Computational Intelligence and Chinese Information Processing,Ministry of Education,Taiyuan 030006,China
  • Received:2019-03-28 Online:2020-05-15 Published:2020-05-19
  • About author:GUO Xin,Ph.D,lecturer.Her main research interests include feature learning and natural language processing.
    CHEN Qian,associate professor.His main research interests include topic detection and natural language processing.
  • Supported by:
    This work was supported by the Natural Science Foundation of Shanxi Province(201701D221101,201901D111032),National Natural Science Foundation of China(61502288,61403238,61673248) and Key R&D Program of Shanxi Province(201803D421024).

Abstract: An ultimate goal of artificial intelligence is to let machines understand human natural language at the cognitive level.Machine reading comprehension poses a great challenge in natural language processing:it requires a computer to possess certain common knowledge,comprehensively understand a text material,and correctly answer the corresponding questions according to that material.With the rapid development of deep learning,machine reading comprehension has become a hotspot research direction in artificial intelligence.It involves core technologies such as machine learning,information retrieval and semantic computing,and has been widely applied in chatbots,question answering systems and intelligent education.This paper focuses on the micro-reading mode,in which candidate sentences containing answers are extracted from a given text,providing technical support for machine reading comprehension.Traditional feature-based methods consume a great deal of manpower.This paper regards candidate sentence extraction as a semantic relevance calculation problem and proposes an Att-BiGRU/LSTM model.First,LSTM and GRU are used to encode the semantics expressed in a sentence.Then,the dissimilarity and similarity between sentences are captured with an Atten structure for semantic correlation.Last,the Adam optimizer is used to learn the model parameters.Experimental results show that the Att-BiGRU model exceeds the baseline method by nearly 0.67 in terms of Pearson correlation and by 16.8% in terms of MSE on the SemEval-SICK test dataset,which proves that combining the bidirectional structure with the Atten structure can greatly improve both the accuracy of candidate sentence extraction and the convergence rate.
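The pipeline the abstract describes — encode each sentence with a recurrent unit, align the two encodings with an attention structure, and score semantic relatedness — can be illustrated with a minimal sketch. This is an assumption-laden toy in numpy, not the authors' Att-BiGRU/LSTM implementation: it uses a single-direction GRU, random "word embeddings", dot-product attention, and a cosine score, with all sizes and names chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H = 8, 16  # toy embedding and hidden sizes (assumed)

# Toy GRU parameters, shared by both sentences.
Wz, Uz = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1
Wr, Ur = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1
Wh, Uh = rng.standard_normal((H, D)) * 0.1, rng.standard_normal((H, H)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_encode(xs):
    """Run a GRU over a sequence of word vectors; return all hidden states."""
    h = np.zeros(H)
    states = []
    for x in xs:
        z = sigmoid(Wz @ x + Uz @ h)              # update gate
        r = sigmoid(Wr @ x + Ur @ h)              # reset gate
        h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
        h = (1 - z) * h + z * h_tilde
        states.append(h)
    return np.stack(states)

def attend(query_states, key_states):
    """Soft-align query states against key states (dot-product attention)."""
    scores = query_states @ key_states.T
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ key_states  # attended view of the keys, per query step

def relatedness(sent_a, sent_b):
    """Cosine similarity of the mutually attended sentence representations."""
    ha, hb = gru_encode(sent_a), gru_encode(sent_b)
    a_ctx = attend(ha, hb).mean(axis=0)  # what sentence A "sees" in B
    b_ctx = attend(hb, ha).mean(axis=0)  # what sentence B "sees" in A
    return float(a_ctx @ b_ctx / (np.linalg.norm(a_ctx) * np.linalg.norm(b_ctx)))

# Toy "sentences" as sequences of random word vectors (assumed embeddings).
s1 = rng.standard_normal((5, D))
s2 = rng.standard_normal((7, D))
score = relatedness(s1, s2)
print(round(score, 3))
```

In the paper's setting the score would be trained (e.g. against SICK relatedness labels, with Adam updating the recurrent and attention parameters) rather than computed from fixed random weights as here.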

Key words: Candidate sentence extraction, Gated recurrent unit, Long short-term memory, Semantic correlation calculation

CLC Number: TP391