Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200023-8. doi: 10.11896/jsjkx.241200023
赵弘毅, 李志远, 卜凡亮
ZHAO Hongyi, LI Zhiyuan, BU Fanliang
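Abstract: With the widespread use of social media, the spread of online hate speech has become increasingly serious. Under the cover of online anonymity in particular, hate speech spreads rapidly, which poses a severe challenge for hate speech detection. To address this problem, a multilingual hate speech detection method based on the Multi-language Embedding Graph Convolutional Network (MEGCN) is proposed. The method combines the strengths of sequence modeling and graph modeling and uses a multilingual pre-trained model for feature extraction, which allows it to handle complex relationships across languages. In addition, a joint training scheme based on interpolated prediction is proposed to improve the model's accuracy and robustness. Experiments on four public datasets show that MEGCN outperforms all compared models on the multilingual hate speech detection task. The method maintains high sequence-modeling accuracy while effectively capturing structural relationships between texts, which improves performance in multilingual settings, with a marked advantage in capturing semantic correspondences between languages.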
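The abstract does not give the formula behind the interpolated-prediction joint training. As a rough illustration only, the sketch below assumes the linear interpolation used in BertGCN-style models, where the class probabilities of a graph branch and a sequence branch are mixed with a weight `lam` and a single cross-entropy loss is computed on the mixture. The function names, the example logits, and the value of `lam` are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate_predictions(graph_logits, seq_logits, lam=0.7):
    """Mix graph-branch and sequence-branch class probabilities.

    Assumed form (not confirmed by the abstract):
        p = lam * softmax(graph) + (1 - lam) * softmax(seq)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    p_graph = softmax(graph_logits)
    p_seq = softmax(seq_logits)
    return [lam * g + (1.0 - lam) * s for g, s in zip(p_graph, p_seq)]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under the mixed prediction."""
    return -math.log(probs[label])


# Hypothetical two-class example: index 0 = not hateful, index 1 = hateful.
graph_logits = [0.2, 1.5]   # scores from the graph (GCN) branch
seq_logits = [1.0, 0.4]     # scores from the multilingual encoder branch
mixed = interpolate_predictions(graph_logits, seq_logits, lam=0.7)
loss = cross_entropy(mixed, label=1)
```

Because the loss is computed on the mixed prediction, gradients would reach both branches in a real training setup, which is what makes the training joint. Setting `lam` to 1 or 0 recovers the graph-only or sequence-only prediction.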
CLC number: