Computer Science ›› 2024, Vol. 51 ›› Issue (12): 250-258. doi: 10.11896/jsjkx.231100147

• Artificial Intelligence •

Short Text Semantic Matching Strategy Fusing Sememe Similarity Matrix and Dual-channel of Char-Word Vectors

LIU Dongxu1, DUAN Liguo1,2, CUI Juanjuan1, CHANG Xuanwei1   

  1. College of Computer Science and Technology, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China
    2. Shanxi University of Electronic Science and Technology, Linfen, Shanxi 041000, China
  • Received: 2023-11-22 Revised: 2024-05-08 Online: 2024-12-15 Published: 2024-12-10
  • Corresponding author: DUAN Liguo (zhaixing202202@163.com)
  • About author: LIU Dongxu, born in 1999, postgraduate (2495628480@qq.com). His main research interests include text matching.
    DUAN Liguo, born in 1970, Ph.D., professor, postgraduate supervisor, is a member of CCF (No. 15823S). His main research interests include natural language processing.
  • Supported by:
Natural Science Foundation of Shanxi Province, China (202203021221234, 202303021211052).

Abstract: Short text semantic matching aims to determine whether two short sentences are semantically consistent. However, many existing methods suffer from the limited semantic information in short texts and fail to recognize synonyms effectively. To address these shortcomings, this paper proposes a short text semantic matching strategy that fuses a sememe similarity matrix with a dual channel of char-word vectors. First, the pre-trained BERT model encodes the input sentence pairs; for the word-level semantic information in the sentences, FastText is used to train and obtain the word vectors of the text, and a BiLSTM is added to further extract contextual semantics. Then, to exploit sememe information effectively, multi-head attention and a co-attention that performs interactive computation over the separated vectors are added to the two channels, and the corresponding sememe similarity matrix is integrated into each attention mechanism. Finally, semantic consistency is inferred from the combined vectors of the two channels. Experiments on the financial-domain dataset BQ and the open-domain dataset LCQMC demonstrate the effectiveness of the proposed strategy.
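The fusion step is described above only at a high level. The following minimal sketch (PyTorch-style, not the authors' released code) illustrates one plausible reading: a word-pair similarity matrix is built with the OpenHowNet toolkit and added as a bias to the co-attention logits between the two sentences. The function names, the additive fusion form, and the scaling weight alpha are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F
import OpenHowNet  # pip install OpenHowNet; run OpenHowNet.download() once first

# init_sim=True loads the data needed for sememe-based word similarity.
hownet = OpenHowNet.HowNetDict(init_sim=True)

def sememe_sim_matrix(words_a, words_b):
    """Pairwise HowNet sememe-based similarity; rows index sentence A's words."""
    m = torch.zeros(len(words_a), len(words_b))
    for i, wa in enumerate(words_a):
        for j, wb in enumerate(words_b):
            # Words missing from HowNet receive the library's default score.
            m[i, j] = hownet.calculate_word_similarity(wa, wb)
    return m

def sememe_guided_coattention(ha, hb, sim, alpha=1.0):
    """Co-attention over two sentences' hidden states, with the sememe
    similarity matrix added to the attention logits as a bias (assumed form).
    ha: (len_a, d), hb: (len_b, d), sim: (len_a, len_b)."""
    logits = ha @ hb.T / ha.size(-1) ** 0.5   # scaled dot-product scores
    logits = logits + alpha * sim             # additive sememe fusion
    a_attends_b = F.softmax(logits, dim=-1) @ hb    # A enriched with B
    b_attends_a = F.softmax(logits.T, dim=-1) @ ha  # B enriched with A
    return a_attends_b, b_attends_a
```

Under this reading, the same bias could be applied inside the multi-head attention of the BERT (character) channel and the co-attention of the FastText+BiLSTM (word) channel, with the two channels' outputs combined for the final consistency decision.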

Key words: Natural language processing, Short text, Sememe, Co-attention, Char-Word vector

CLC Number: TP391