Computer Science ›› 2024, Vol. 51 ›› Issue (3): 14-19. doi: 10.11896/jsjkx.230800063

• Information Security Protection in New Computing Mode •

Contrastive Graph Learning for Cross-document Misinformation Detection

LIAO Jinzhi1, ZHAO Hewei1, LIAN Xiaotong1, JI Wenliang1, SHI Haiming1, ZHAO Xiang2

  1 College of Military Management, National Defense University, Beijing 100000, China
  2 College of System Engineering, National University of Defense Technology, Changsha 410072, China
  • Received: 2023-08-09 Revised: 2023-11-27 Online: 2024-03-15 Published: 2024-03-13
  • Corresponding author: ZHAO Xiang (xiangzhao@nudt.edu.com)
  • About author: LIAO Jinzhi (jinzhiliao19@163.com), born in 1993, Ph.D, lecturer. His main research interests include natural language processing and knowledge management. ZHAO Xiang, born in 1986, Ph.D, professor, Ph.D supervisor, is a member of CCF (No.39960S). His main research interests include knowledge graph and data analysis.
  • Supported by:
    National Key R&D Program of China (2022YFB3102600) and National Natural Science Foundation of China (72301284, 62272469).

Abstract: Misinformation proliferates on the Internet and seriously hinders the normal functioning of many sectors of society, so detecting it accurately has become an urgent problem. Existing research approaches the task mainly from three angles: account features, textual content, and multimodality. However, most methods overlook the key attribute on which misinformation relies to spread, namely the novelty of its content. They judge the veracity of the target information in isolation and fail to capture the characteristics of the public-opinion environment. To address this issue, this paper proposes a cross-document misinformation detection method based on contrastive graph learning (CAL). CAL focuses on content novelty and comprises two key modules: a contrastive learning module and a heterogeneous graph module. The former enlarges the difference between the vector-space representations of objective facts and misinformation. The latter contains five node types (entities, events, event sets, sentences, and documents) and injects semantic features of the public-opinion environment into the entity representations as far as possible. Experiments on the IED, TL17, and Crisis datasets at both the document level and the event level show that CAL achieves the best results in all tests, which verifies the effectiveness of the proposed method. It offers a solution for combating misinformation by modeling novelty and environmental context.

Key words: Cross-document misinformation detection, Contrastive learning, Heterogeneous graph, Event-level detection

CLC Number: 

  • TP391
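
The two modules named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the InfoNCE-style loss, the cosine similarity, the `temperature` value, the undirected edges, and all names (`HeteroGraph`, `contrastive_loss`, the toy vectors) are assumptions chosen only for illustration. The five node types are taken from the abstract.

```python
import math
from collections import defaultdict

# The five node types listed in the abstract.
NODE_TYPES = ("entity", "event", "event_set", "sentence", "document")


class HeteroGraph:
    """Hypothetical typed graph: every node has a type, edges are undirected."""

    def __init__(self):
        self.node_type = {}
        self.neighbors = defaultdict(set)

    def add_node(self, name, ntype):
        if ntype not in NODE_TYPES:
            raise ValueError(f"unknown node type: {ntype}")
        self.node_type[name] = ntype

    def add_edge(self, a, b):
        if a not in self.node_type or b not in self.node_type:
            raise KeyError("both endpoints must be added first")
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss (an assumed form): small when the anchor is
    close to the positive and far from every negative."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy usage: a document linked to a sentence, which mentions an event.
graph = HeteroGraph()
graph.add_node("doc1", "document")
graph.add_node("sent1", "sentence")
graph.add_node("evt1", "event")
graph.add_edge("doc1", "sent1")
graph.add_edge("sent1", "evt1")

# Made-up 2-d embeddings: two factual items and one false item.
fact, fact_like, fake = [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]
separated = contrastive_loss(fact, fact_like, [fake])  # facts paired together
confused = contrastive_loss(fact, fake, [fact_like])   # fact paired with fake
```

With these toy vectors `separated` comes out smaller than `confused`, so minimizing such a loss pulls factual representations together and pushes false ones away, which is the effect the abstract attributes to the contrastive learning module.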