Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241200023-8. doi: 10.11896/jsjkx.241200023

• Artificial Intelligence •

Multi-language Embedding Graph Convolutional Network for Hate Speech Detection

ZHAO Hongyi, LI Zhiyuan, BU Fanliang   

  1. School of Information Network Security, People’s Public Security University of China, Beijing 100045, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Supported by:
    Double First-Class Innovation Research Project of People’s Public Security University of China (2023SYL08).

Abstract: With the widespread use of social media, the spread of online hate speech has become increasingly severe. The anonymity of the Internet allows hate speech to spread rapidly, which poses a serious challenge for its detection. To address this problem, this paper proposes a cross-lingual hate speech detection method based on a Multi-language Embedding Graph Convolutional Network (MEGCN). The method combines the advantages of sequence modeling and graph modeling and uses multi-language pre-trained models for feature extraction, which allows it to handle complex relationships between different languages. The paper also proposes a joint training method based on interpolation prediction to improve the accuracy and robustness of the model. Experiments on four public datasets show that MEGCN outperforms all compared models on cross-lingual hate speech detection. The method maintains high sequence modeling accuracy while also capturing the structural relationships between texts, which improves performance in multi-language settings, particularly with respect to semantic correspondence between languages.
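
The abstract does not state the formula behind "interpolation prediction". As a minimal sketch only, the example below assumes it resembles the convex combination used in related work such as BertGCN, where the class probabilities of a graph branch and a sequence branch are mixed with a weight and trained jointly on the mixed prediction. The function names, the weight `lam`, and the two-class logits are hypothetical and are not taken from the paper.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpolate_predictions(
    graph_logits: List[float], seq_logits: List[float], lam: float
) -> List[float]:
    """Mix graph-branch and sequence-branch class probabilities.

    Assumed form (not confirmed by the source):
        p = lam * softmax(graph) + (1 - lam) * softmax(seq)
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(graph_logits) != len(seq_logits):
        raise ValueError("both branches must score the same classes")
    p_graph = softmax(graph_logits)
    p_seq = softmax(seq_logits)
    return [lam * g + (1.0 - lam) * s for g, s in zip(p_graph, p_seq)]


def joint_nll(
    graph_logits: List[float], seq_logits: List[float], label: int, lam: float
) -> float:
    """Negative log-likelihood of the true label under the mixed prediction.

    Because the loss is computed on the mixture, its gradient would reach
    both branches, which is one way to read "joint training".
    """
    probs = interpolate_predictions(graph_logits, seq_logits, lam)
    return -math.log(probs[label])


if __name__ == "__main__":
    # Hypothetical logits for one text: [not hate, hate].
    graph_out = [0.2, 1.1]
    seq_out = [-0.5, 2.0]
    print(interpolate_predictions(graph_out, seq_out, lam=0.7))
    print(joint_nll(graph_out, seq_out, label=1, lam=0.7))
```

Under this assumption, `lam = 1` reduces to the graph branch alone and `lam = 0` to the sequence branch alone, so the weight controls how much each modeling view contributes to the final prediction.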

Key words: Hate speech detection, Graph convolutional network, Multi-language pre-trained model, Natural language processing

CLC Number: 

  • TP391