Computer Science (计算机科学), 2024, Vol. 51, Issue 6: 325-330. doi: 10.11896/jsjkx.230300175
蒋锐, 杨凯辉, 王小明, 李大鹏, 徐友云
JIANG Rui, YANG Kaihui, WANG Xiaoming, LI Dapeng, XU Youyun
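Abstract: With the rapid development of the Internet and big data, artificial intelligence technologies represented by deep neural networks (DNNs) have entered a golden period of development, and automatic question answering, an important branch of artificial intelligence, has attracted growing attention from researchers. Existing network models can extract the semantic features of a question or an answer, but they ignore the semantic connection between the question and the answer, and they cannot capture, as a whole, the latent relationships among all the characters within a question or an answer. To address this, two forms of attention interaction module are proposed, namely a mutual-attention interaction module and a self-attention interaction module, and a set of deep learning models built on these modules is designed to demonstrate their effectiveness. First, each character in the question and the answer is mapped to a fixed-length vector, yielding character embedding matrices for the question and the answer. Next, the character embedding matrices are fed into the attention interaction module to obtain character embedding matrices that take all characters of the question and the answer into account; these are added to the original character embedding matrices and passed to the deep neural network module, which extracts the semantic features of the question and the answer. Finally, vector representations of the question and the answer are obtained and the similarity between them is computed. Experimental results show that the Top-1 accuracy of the proposed model improves on mainstream deep learning models by up to 3.55%, demonstrating that the proposed attention interaction modules are effective in mitigating the problems above.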
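The pipeline outlined in the abstract (character embedding, attention interaction, residual addition, feature extraction, similarity) can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. Several pieces are assumptions made only for illustration: the attention is scaled dot-product attention without learned projections, the character embeddings come from a hypothetical random lookup table, mean pooling stands in for the deep neural network module, and cosine similarity is used because the abstract does not name the similarity measure. All function names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attend(queries, keys_values):
    """Scaled dot-product attention with no learned projections (assumption)."""
    dim = len(queries[0])
    scores = matmul(queries, transpose(keys_values))
    scaled = [[s / math.sqrt(dim) for s in row] for row in scores]
    return matmul(softmax_rows(scaled), keys_values)


def add(a, b):
    """Element-wise sum of two equally shaped matrices (the residual step)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mean_pool(m):
    """Stand-in for the DNN feature extractor: average over characters."""
    return [sum(col) / len(m) for col in zip(*m)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))


def embed(text, table, dim, rng):
    """Map each character to a fixed-length vector via a random lookup table."""
    for ch in text:
        if ch not in table:
            table[ch] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [table[ch] for ch in text]


def score(question, answer, mode="mutual", dim=8, seed=0):
    """Return a similarity score between a question and a candidate answer."""
    rng = random.Random(seed)
    table = {}
    q = embed(question, table, dim, rng)
    a = embed(answer, table, dim, rng)
    if mode == "mutual":
        # Each side attends over all characters of the other side.
        q_ctx, a_ctx = attend(q, a), attend(a, q)
    else:
        # Self-attention: each side attends over its own characters.
        q_ctx, a_ctx = attend(q, q), attend(a, a)
    # Add the attended matrices to the original embeddings, then pool.
    q_vec = mean_pool(add(q, q_ctx))
    a_vec = mean_pool(add(a, a_ctx))
    return cosine(q_vec, a_vec)
```

Calling `score("what helps a headache", "rest and see a doctor", mode="mutual")` returns a value between -1 and 1. With untrained random embeddings the number carries no meaning; the sketch only shows how data flows through the two attention variants.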