Computer Science ›› 2026, Vol. 53 ›› Issue (2): 289-299. doi: 10.11896/jsjkx.241200004

• Artificial Intelligence •

Chinese Hate Speech Detection Incorporating Hate Object Features and Variant Word Restoration Mechanism

SUN Mingxu, LIANG Gang, WU Yifei, HU Haixin

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China
  • Received: 2024-12-02 Revised: 2025-03-07 Online: 2026-02-10
  • Corresponding author: LIANG Gang (lianggang@scu.edu.cn)
  • Author e-mail: 2022226240010@stu.scu.edu.cn
  • About author: SUN Mingxu, born in 2000, postgraduate. His main research interests include hate speech detection and social networks.
    LIANG Gang, born in 1976, Ph.D., associate professor, master's supervisor. His main research interests include network security, online public opinion analysis and prediction, and AI security.
  • Supported by:
    Joint Funds of the National Natural Science Foundation of China (62162057), Natural Science Foundation of Sichuan Province (2025ZNSFSC0509), Sichuan Province Science and Technology Department Key R&D Projects (2023YFG0294), and Local Projects of the Ministry of Education (2023CDLZ-2).

Abstract: The marked rise of online hate speech and the harm it causes have made automatic hate speech detection a critical task. Existing methods overlook the role that hate objects, i.e., the targets of hateful content in a text, play in semantic extraction. As a result, models extract contextual features poorly and are prone to decision errors triggered by specific expressions. Existing methods also fail to account for the interference of variant words with semantic extraction, which leads to a high miss rate in hate speech detection. Furthermore, Chinese hate speech detection lacks available datasets. To address these problems, this paper proposes a hate speech detection method that incorporates hate object features and a variant word restoration mechanism. The method treats hate object recognition as an intermediate task that guides the model to fully learn the contextual features of hate objects, thereby strengthening its comprehension of the text. In addition, a variant word restoration module fine-tuned from ChatGLM2-6B is introduced; by restoring variant words to their normal equivalents, it reduces their interference with semantic extraction. Finally, a Chinese hate speech dataset is constructed to facilitate further research in this field. Experimental results show that the proposed model achieves an F1 score of 96.71% and outperforms existing baseline methods on all metrics. In particular, detection accuracy in specific scenarios improves by 4.21%, and the miss rate caused by variant words decreases by 3.45%.

Key words: Hate speech detection, Chinese dataset, Hate object recognition, Variant word restoration, Text enhancement, Natural language processing
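
For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how an F1 score and a miss rate (false-negative rate) are conventionally computed for binary hate speech labels. The function names and the toy labels are illustrative assumptions; this is not the authors' evaluation code or data, and the paper may average its metrics differently.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 = hate speech."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 for the positive class: harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def miss_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of true hate speech samples that the detector failed to flag."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    positives = tp + fn
    return fn / positives if positives else 0.0


# Made-up labels for illustration only (not taken from the paper's dataset).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]

print(f"F1 = {f1_score(y_true, y_pred):.2f}")
print(f"miss rate = {miss_rate(y_true, y_pred):.2f}")
```

Under these definitions, restoring a variant word so that a previously missed sample is flagged moves it from the false-negative count to the true-positive count, which lowers the miss rate and raises F1 together.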

CLC Number:

  • TP391