Computer Science (计算机科学), 2023, Vol. 50, Issue (11A): 230200068-5. doi: 10.11896/jsjkx.230200068
康书铭, 朱焱
KANG Shuming, ZHU Yan
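Abstract: Text stance analysis aims to infer, from the text that users publish, their view of a specific topic, such as support, opposition, or neutrality. Traditional stance analysis studies often use deep learning models such as convolutional neural networks or long short-term memory networks to learn the basic semantic information of a text, and they ignore the syntactic structure information the text contains. To address this problem, this paper designs and implements AT-BiLSTM-GAT, a text stance detection model based on topic attention and dependency syntax. On top of the contextual information extracted by BiLSTM, the model uses GAT to further learn dependency-syntax information at the linguistic level. The paper also designs and implements a topic attention mechanism that fuses contextual semantic information, using scaled dot-product attention to learn the important topic-related content in the stance text. Comparative experiments on public datasets demonstrate the effectiveness of the AT-BiLSTM-GAT model. Finally, to address the small scale of stance analysis datasets, the paper designs and implements WWDA, a synonym-replacement data augmentation scheme based on the WordNet thesaurus and WebVectors word embedding models. WWDA preserves part-of-speech correctness and semantic similarity during synonym replacement, and experiments show that it can generate more high-quality samples and improve the detection performance of the model.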
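The scaled dot-product attention named in the abstract can be illustrated with a minimal sketch. The abstract does not give the model's exact formulation, so the setup here is an assumption: a single topic vector acts as the query, and the per-token hidden states (for example, BiLSTM outputs) act as both keys and values. The function names `softmax` and `topic_attention` and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topic_attention(topic_vec, hidden_states):
    """Scaled dot-product attention with one query (assumed setup).

    topic_vec:      query vector of dimension d (hypothetical topic embedding)
    hidden_states:  list of token vectors of dimension d, used as keys and values
    Returns the attention weights and the weighted sum of the token vectors.
    """
    d = len(topic_vec)
    # Dot product of the query with each key, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(topic_vec, h)) / math.sqrt(d)
        for h in hidden_states
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the value vectors.
    context = [
        sum(w * h[i] for w, h in zip(weights, hidden_states))
        for i in range(d)
    ]
    return weights, context


# Toy example: three token vectors, the first aligned with the topic.
topic = [1.0, 0.0, 1.0, 0.0]
tokens = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.5, 0.5, 0.5, 0.5],
]
weights, context = topic_attention(topic, tokens)
```

In this toy input the token most aligned with the topic vector receives the largest weight, which is the behavior the abstract describes as learning the topic-related content of the stance text.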