Computer Science ›› 2024, Vol. 51 ›› Issue (6): 325-330. doi: 10.11896/jsjkx.230300175

• Artificial Intelligence •

Attentional Interaction-based Deep Learning Model for Chinese Question Answering

JIANG Rui, YANG Kaihui, WANG Xiaoming, LI Dapeng, XU Youyun   

  1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  • Received: 2023-03-22 Revised: 2023-08-28 Online: 2024-06-15 Published: 2024-06-05
  • About author: JIANG Rui, born in 1985, Ph.D, associate professor. His main research interests include artificial intelligence and wireless communication.
  • Supported by:
    National Natural Science Foundation of China (62271266).

Abstract: With the rapid development of the Internet and big data, artificial intelligence, represented by deep neural networks (DNNs), has entered a golden period of development. As an important branch of artificial intelligence, question answering has attracted growing attention from researchers. Existing deep neural network modules can extract the semantic features of a question or an answer, but they have two shortcomings: they ignore the semantic relation between the question and the answer, and they cannot capture the latent relations among all the characters within the question or the answer as a whole. To address these problems, two forms of attention interaction module, namely cross-embedding and self-embedding, are used, and a deep learning model built on the proposed attention interaction module is designed to demonstrate its effectiveness. First, each character in the question and the answer is mapped to a fixed-length vector, yielding a character embedding matrix for each. Next, the character embedding matrices are fed into the attention interaction module to obtain character embedding matrices that take all characters of the question and the answer into account. These are added to the original character embedding matrices, and the sums are fed into the deep neural network module to extract the semantic features of the question and the answer. Finally, vector representations of the question and the answer are obtained, and the similarity between them is computed. Experiments show that the Top-1 accuracy of the proposed model is up to 3.55% higher than that of mainstream deep learning models, which demonstrates the effectiveness of the proposed attention interaction module in addressing the above problems.
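
The pipeline described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the embeddings are random rather than learned, plain scaled dot-product attention stands in for both the self-embedding and cross-embedding forms, the two attention outputs are simply summed with the original embeddings, mean pooling replaces the deep neural network module, and cosine similarity is assumed as the similarity measure. All function names (`embed`, `attend`, `score`, `top1`) are hypothetical.

```python
import math
import random

_rng = random.Random(0)   # fixed seed so the illustration is repeatable
_table = {}               # character -> embedding vector (random here, learned in a real model)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def embed(text, dim=8):
    """Map each character to a fixed-length vector (one matrix row per character)."""
    for ch in text:
        if ch not in _table:
            _table[ch] = [_rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return [_table[ch] for ch in text]

def attend(queries, context):
    """Scaled dot-product attention: each query row becomes a weighted
    average of the context rows. Assumed form of the interaction."""
    scale = math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        out.append([sum(w * c[j] for w, c in zip(weights, context))
                    for j in range(len(q))])
    return out

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mean_pool(matrix):
    """Stand-in for the deep neural network module (assumption)."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(len(matrix[0]))]

def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

def score(question, answer):
    q, a = embed(question), embed(answer)
    # Self-embedding: characters attend to their own sequence.
    # Cross-embedding: characters attend to the other sequence.
    # Both are added back onto the original embeddings (residual sum).
    q_mix = add(add(q, attend(q, q)), attend(q, a))
    a_mix = add(add(a, attend(a, a)), attend(a, q))
    return cosine(mean_pool(q_mix), mean_pool(a_mix))

def top1(question, candidates):
    """Return the candidate answer with the highest similarity score."""
    return max(candidates, key=lambda c: score(question, c))
```

With random embeddings the ranking is meaningless; the sketch only shows how the data flows. A call such as `top1("头痛怎么办", ["多喝水多休息", "今天天气很好"])` returns whichever candidate scores highest, which is the prediction that a Top-1 accuracy metric would compare against the labeled correct answer.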

Key words: Artificial intelligence, Question answering, Deep learning, Attention, Character embedding

CLC Number: 

  • TP181