计算机科学 ›› 2023, Vol. 50 ›› Issue (5): 270-276. doi: 10.11896/jsjkx.220400275
罗亮1, 程春玲1, 刘倩1, 归耀城2
LUO Liang1, CHENG Chunling1, LIU Qian1, GUI Yaocheng2
Abstract: Answer selection is a key subtask in the field of question answering, and its performance underpins the development of question-answering systems. The dynamic word vectors produced by a BERT model with frozen parameters suffer from a lack of sentence-level semantic features and from missing word-level interaction between question-answer pairs. The multilayer perceptron (MLP) offers several advantages: it enables deep feature mining at a relatively low computational cost. Building on dynamic text vectors, this paper proposes an answer selection model based on multilayer perceptrons and semantic matrices. The MLP mainly performs sentence-level semantic dimension reconstruction of the text vectors, while semantic matrices generated by different computation methods mine different kinds of textual feature information. An MLP combined with a semantic understanding matrix generated by a linear model forms a semantic understanding module, which mines the sentence-level semantic features of the question sentence and the answer sentence separately; an MLP combined with a semantic interaction matrix generated by a bidirectional attention mechanism forms a semantic interaction module, which builds word-level interactions between question-answer pairs. Experimental results show that the proposed model achieves MAP and MRR of 0.789 and 0.806 respectively on the WikiQA dataset, a consistent improvement over baseline models, and MAP and MRR of 0.903 and 0.911 respectively on the SelQA dataset, likewise showing good performance.
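To make the architecture described in the abstract concrete, the following is a minimal PyTorch sketch of the two modules it names: a semantic understanding module (MLP plus a linear semantic understanding matrix) and a semantic interaction module (MLP plus a bidirectional-attention interaction matrix), both operating on frozen BERT token embeddings. All class names, hidden sizes, and the cosine-similarity scoring function are illustrative assumptions, not the paper's released implementation.

```python
# Minimal sketch of the two modules described in the abstract, assuming
# frozen BERT token embeddings of dimension 768. Names and sizes are
# illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SemanticUnderstanding(nn.Module):
    """MLP + linear semantic understanding matrix over one sentence's
    token embeddings (sentence-level semantic feature mining)."""

    def __init__(self, dim: int = 768, hidden: int = 1024):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)  # linear "semantic understanding matrix"
        self.mlp = nn.Sequential(                    # sentence-level dimension reconstruction
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: [batch, seq_len, dim]
        u = self.proj(x)          # re-weight features with the linear matrix
        return self.mlp(u) + x    # residual keeps the original BERT signal


class SemanticInteraction(nn.Module):
    """Bidirectional-attention interaction matrix between question and answer,
    followed by an MLP over the aligned representations (word-level interaction)."""

    def __init__(self, dim: int = 768, hidden: int = 1024):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(4 * dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, q: torch.Tensor, a: torch.Tensor):
        # q: [batch, len_q, dim], a: [batch, len_a, dim]
        sim = torch.bmm(q, a.transpose(1, 2))       # interaction matrix [batch, len_q, len_a]
        q2a = torch.bmm(F.softmax(sim, dim=-1), a)  # answer-aware question tokens
        a2q = torch.bmm(F.softmax(sim.transpose(1, 2), dim=-1), q)  # question-aware answer tokens
        q_feat = self.mlp(torch.cat([q, q2a, q - q2a, q * q2a], dim=-1))
        a_feat = self.mlp(torch.cat([a, a2q, a - a2q, a * a2q], dim=-1))
        return q_feat, a_feat


def relevance_score(q_emb: torch.Tensor, a_emb: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between mean-pooled question/answer representations,
    one plausible way to rank candidate answers."""
    return F.cosine_similarity(q_emb.mean(dim=1), a_emb.mean(dim=1), dim=-1)
```

In this sketch the residual connection in SemanticUnderstanding and the [x; x̃; x−x̃; x⊙x̃] fusion in SemanticInteraction are common design choices for compare-aggregate style answer selection, assumed here only to illustrate how the two modules could fit together.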
CLC Number: