Computer Science, 2024, Vol. 51, Issue (6): 325-330. doi: 10.11896/jsjkx.230300175

• Artificial Intelligence •

Attentional Interaction-based Deep Learning Model for Chinese Question Answering

JIANG Rui, YANG Kaihui, WANG Xiaoming, LI Dapeng, XU Youyun

  1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Received: 2023-03-22 Revised: 2023-08-28 Online: 2024-06-15 Published: 2024-06-05
  • Corresponding author: JIANG Rui (j_ray@njupt.edu.cn)
  • About author: JIANG Rui, born in 1985, Ph.D., associate professor. His main research interests include artificial intelligence and wireless communication.
  • Supported by:
    National Natural Science Foundation of China (62271266).

Abstract: With the rapid development of the Internet and big data, artificial intelligence, represented by deep neural networks (DNNs), has entered a golden period of development. As an important branch of artificial intelligence, automatic question answering has attracted growing attention from researchers. Existing network models can extract the semantic features of a question or an answer, but they ignore the semantic relation between the question and the answer, and they cannot capture, as a whole, the latent relations among all the characters within a question or an answer. To address these problems, two forms of attention interaction module are proposed, namely the cross-attention interaction module (cross-embedding) and the self-attention interaction module (self-embedding), and a deep learning model built on them is designed to verify their effectiveness. First, each character in the question and the answer is mapped to a fixed-length vector, yielding one character embedding matrix for the question and one for the answer. These matrices are then fed into the attention interaction module to obtain character embedding matrices that take all characters of the question and the answer into account. The result is added to the original character embedding matrices and fed into the deep neural network module, which extracts the semantic features of the question and the answer. Finally, vector representations of the question and the answer are obtained and the similarity between them is computed. Experimental results show that the Top-1 accuracy of the proposed model is up to 3.55% higher than that of mainstream deep learning models, which demonstrates the effectiveness of the proposed attention interaction modules in addressing the problems above.

Key words: Artificial intelligence, Question answering, Deep learning, Attention, Character embedding
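
The pipeline summarized in the abstract (character embedding, attention interaction, addition to the original embeddings, feature extraction, similarity) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the function names `embed`, `attend`, `interact`, `pool` and `score` are hypothetical, the embedding is a seeded random lookup instead of a trained table, the interaction is plain scaled dot-product attention, the cross step is applied before the self step, mean pooling stands in for the deep neural network module, and cosine similarity is used as the similarity measure.

```python
import math
import random
from typing import List

Matrix = List[List[float]]  # one row per character, one column per embedding dimension


def embed(text: str, dim: int = 8) -> Matrix:
    """Map each character to a fixed-length vector (hypothetical untrained lookup)."""
    rows = []
    for ch in text:
        rng = random.Random(ord(ch))  # seeded by code point: same character, same vector
        rows.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return rows


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries: Matrix, keys: Matrix) -> Matrix:
    """Scaled dot-product attention (assumed form); the keys also act as values."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        out.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))])
    return out


def interact(target: Matrix, source: Matrix) -> Matrix:
    """Attend from `target` to `source`, then add the result to the original `target`."""
    attended = attend(target, source)
    return [[t + a for t, a in zip(t_row, a_row)] for t_row, a_row in zip(target, attended)]


def pool(matrix: Matrix) -> List[float]:
    """Mean pooling over characters, a stand-in for the deep neural network module."""
    return [sum(col) / len(matrix) for col in zip(*matrix)]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score(question: str, answer: str) -> float:
    """Similarity of a question-answer pair after cross and self interaction."""
    q, a = embed(question), embed(answer)
    q_cross = interact(q, a)              # question characters attend to the answer
    a_cross = interact(a, q)              # answer characters attend to the question
    q_self = interact(q_cross, q_cross)   # characters attend within the question
    a_self = interact(a_cross, a_cross)   # characters attend within the answer
    return cosine(pool(q_self), pool(a_self))


if __name__ == "__main__":
    question = "头痛怎么办"
    candidates = ["建议多休息并多喝水", "今天天气很好"]
    # Top-1 selection: the highest-scoring candidate is taken as the answer.
    best = max(candidates, key=lambda c: score(question, c))
    print(best)
```

Because the embeddings here are untrained, the ranking carries no meaning; the sketch only shows how the data flows. Top-1 accuracy, as reported in the abstract, would be the fraction of questions for which the highest-scoring candidate is the correct answer.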

CLC Number:

  • TP181