Multi-language Embedding Graph Convolutional Network for Hate Speech Detection

doi:10.11896/jsjkx.241200023

Abstract

With the widespread use of social media, the spread of online hate speech has become increasingly severe. Anonymity on the Internet allows hate speech to spread rapidly, which poses a serious challenge for detection. To address this problem, this paper proposes a cross-lingual hate speech detection method based on a Multi-language Embedding Graph Convolutional Network (MEGCN). The method combines the advantages of sequence modeling and graph modeling and uses multi-language pre-trained models for feature extraction, which enables it to handle complex relationships between different languages. The paper also proposes a joint training method based on interpolation prediction to improve the accuracy and robustness of the model. Experiments on four public datasets show that MEGCN outperforms all compared models on the cross-lingual hate speech detection task. The method maintains high sequence modeling accuracy while capturing the structural relationships between texts, which improves performance in multi-language settings, particularly with respect to semantic correspondence between different languages.
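The abstract does not spell out the interpolation formula, so the following is only a minimal sketch of one common reading: the class probabilities of a graph branch and a sequence branch are mixed with a weight `lam`, in the style used by BertGCN. The single-layer GCN, the toy embeddings, the weight matrices, and the names `interpolated_prediction` and `lam` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by standard GCNs."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def interpolated_prediction(emb, adj, w_graph, w_seq, lam):
    """Mix graph-branch and sequence-branch probabilities (assumed form).

    emb     : document embeddings, e.g. from a multilingual encoder (toy values here)
    adj     : document graph adjacency matrix
    w_graph : hypothetical weights of a one-layer GCN classifier
    w_seq   : hypothetical weights of a linear classifier on the embeddings
    lam     : interpolation weight in [0, 1]; 1 uses only the graph branch
    """
    graph_logits = matmul(matmul(normalize_adjacency(adj), emb), w_graph)
    seq_logits = matmul(emb, w_seq)
    mixed = []
    for g, s in zip(graph_logits, seq_logits):
        p_graph, p_seq = softmax(g), softmax(s)
        mixed.append([lam * a + (1.0 - lam) * b for a, b in zip(p_graph, p_seq)])
    return mixed

# Toy example: three documents, two embedding dimensions, two classes.
emb = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
adj = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
w_graph = [[1.0, -1.0], [0.5, 0.5]]
w_seq = [[-0.5, 0.5], [1.0, -1.0]]
probs = interpolated_prediction(emb, adj, w_graph, w_seq, lam=0.7)
```

Because both branches output probability distributions, each row of `probs` still sums to one for any `lam` between 0 and 1. In joint training under this reading, a single loss on the mixed prediction would update both branches at once.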

Key words: Hate speech detection, Graph convolutional network, Multi-language pre-trained model, Natural language processing

CLC Number: TP391

ZHAO Hongyi, LI Zhiyuan, BU Fanliang. Multi-language Embedding Graph Convolutional Network for Hate Speech Detection [J]. Computer Science, 2025, 52(11A): 241200023-8.

