Computer Science ›› 2024, Vol. 51 ›› Issue (4): 299-306. doi: 10.11896/jsjkx.230700170

• Artificial Intelligence •

Unified Fake News Detection Based on Semantic Expansion and HDGCN

ZHANG Mingdao, ZHOU Xin, WU Xiaohong, QING Linbo, HE Xiaohai   

  1. College of Electronics and Information Engineering,Sichuan University,Chengdu 610065,China
  • Received:2023-07-24 Revised:2023-11-11 Online:2024-04-15 Published:2024-04-10
  • Corresponding author: WU Xiaohong(wxh@scu.edu.cn)
  • About the author: (1950187531@qq.com)

Abstract: There are many methods for detecting fake news. A single method typically focuses only on one kind of information, such as news content, social context, or external facts, whereas joint detection methods integrate multiple modalities of information to achieve the detection goal. Pref-FEND is a joint detection method of this kind that integrates news content and external facts: it extracts three types of word representations from news content and external facts, and uses a dynamic graph convolutional network to capture the relationships between word nodes. However, it still falls short in making each modality focus on its own preferred part. Therefore, this work improves the Pref-FEND model by using semantic mining to expand the style words in news and entity linking to expand the entity words, which yields five types of words to serve as node representations of the graph network and thus models the node representations of the graph neural network more effectively. In addition, a deep heterogeneous graph convolutional network (HDGCN) is introduced for preference learning; its deep strategy and multi-layer attention mechanism allow the two models to focus more on the preference awareness they each need and to reduce redundant information. Experimental results on the public Weibo and Twitter datasets show that the improved framework raises the F1 score by 2.8% and 1.9%, respectively, compared with the current state-of-the-art content-based single model LDAVAE, and by 2.1% and 1.8%, respectively, compared with the fact-based single model GET. In the joint detection setting with LDAVAE+GET, its F1 score is 1.1% and 1.3% higher than that of Pref-FEND, respectively. These results validate the effectiveness of the improved model.
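The expansion step described above can be pictured with a short, purely illustrative Python sketch (not the authors' code). A toy synonym lexicon stands in for the semantic-mining resources the paper builds on (e.g., WordNet/HowNet), a toy alias table stands in for an entity linker such as TexSmart, and every word list below is invented for the example; the function merely groups a post's tokens, plus the expanded words, into the five node types named in the abstract.

from typing import Dict, List

# Toy lexicons -- stand-ins for real resources, invented for this sketch only.
STYLE_WORDS = {"shocking", "unbelievable", "breaking"}
STYLE_SYNONYMS: Dict[str, List[str]] = {"shocking": ["startling"], "breaking": ["urgent"]}
ENTITY_WORDS = {"WHO", "Twitter"}
ENTITY_ALIASES: Dict[str, List[str]] = {"WHO": ["World", "Health", "Organization"]}

def type_tokens(tokens: List[str]) -> Dict[str, List[str]]:
    """Group tokens (plus expanded words) into the five node types used to build the word graph."""
    typed: Dict[str, List[str]] = {
        "style": [], "style_expanded": [], "entity": [], "entity_linked": [], "other": []
    }
    for tok in tokens:
        if tok.lower() in STYLE_WORDS:
            typed["style"].append(tok)
            # "semantic mining": pull in lexicon neighbours of the style word
            typed["style_expanded"].extend(STYLE_SYNONYMS.get(tok.lower(), []))
        elif tok in ENTITY_WORDS:
            typed["entity"].append(tok)
            # "entity linking": pull in surface forms of the linked entity
            typed["entity_linked"].extend(ENTITY_ALIASES.get(tok, []))
        else:
            typed["other"].append(tok)
    return typed

if __name__ == "__main__":
    print(type_tokens("Shocking claim spreads on Twitter about WHO".split()))
    # {'style': ['Shocking'], 'style_expanded': ['startling'], 'entity': ['Twitter', 'WHO'],
    #  'entity_linked': ['World', 'Health', 'Organization'], 'other': ['claim', 'spreads', 'on', 'about']}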

Key words: Fake news, Graph convolutional networks, Entity extraction, Attention mechanism, Natural language processing
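Continuing the illustration, the hypothetical PyTorch sketch below shows the other ingredient the abstract describes: a small stack of heterogeneous graph-convolution layers with node-level attention (a stand-in for HDGCN's deep strategy and multi-layer attention) that turns the typed word graph into two preference distributions, one for the pattern-based detector (e.g., LDAVAE) and one for the fact-based detector (e.g., GET). Layer sizes, the adjacency construction, and all module names are assumptions rather than the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

NODE_TYPES = ["style", "style_expanded", "entity", "entity_linked", "other"]

class HeteroGraphLayer(nn.Module):
    """One graph-convolution step with a type-specific projection and node-level attention."""
    def __init__(self, dim: int):
        super().__init__()
        self.type_proj = nn.ModuleDict({t: nn.Linear(dim, dim) for t in NODE_TYPES})
        self.node_attn = nn.Linear(dim, 1)

    def forward(self, h, adj, type_mask):
        # h: (N, dim) word embeddings, adj: (N, N) row-normalized word graph,
        # type_mask: (N, 5) one-hot indicator of each word's node type.
        projected = torch.stack([self.type_proj[t](h) for t in NODE_TYPES], dim=1)  # (N, 5, dim)
        h_typed = (projected * type_mask.unsqueeze(-1)).sum(dim=1)                  # (N, dim)
        neighbours = adj @ h_typed                                                  # aggregate neighbour messages
        alpha = torch.sigmoid(self.node_attn(neighbours))                           # attention weight per node
        return F.relu(h + alpha * neighbours)                                       # residual update

class PreferenceHDGCN(nn.Module):
    """Stacked heterogeneous layers emitting pattern/fact preference scores per word."""
    def __init__(self, dim: int = 64, num_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList(HeteroGraphLayer(dim) for _ in range(num_layers))
        self.pattern_head = nn.Linear(dim, 1)
        self.fact_head = nn.Linear(dim, 1)

    def forward(self, h, adj, type_mask):
        for layer in self.layers:
            h = layer(h, adj, type_mask)
        # Two softmax maps over the words; downstream pattern-based and fact-based
        # detectors would re-weight their token inputs with these distributions.
        pattern_pref = torch.softmax(self.pattern_head(h).squeeze(-1), dim=0)
        fact_pref = torch.softmax(self.fact_head(h).squeeze(-1), dim=0)
        return pattern_pref, fact_pref

if __name__ == "__main__":
    n_words, dim = 12, 64
    h = torch.randn(n_words, dim)                               # e.g. BERT token embeddings
    adj = torch.softmax(torch.randn(n_words, n_words), dim=-1)  # toy row-normalized adjacency
    type_ids = torch.randint(0, len(NODE_TYPES), (n_words,))
    type_mask = F.one_hot(type_ids, len(NODE_TYPES)).float()
    pattern_pref, fact_pref = PreferenceHDGCN(dim)(h, adj, type_mask)
    print(pattern_pref.shape, fact_pref.shape)                  # torch.Size([12]) torch.Size([12])

In the full framework, these two distributions would re-weight the token inputs of the pattern-based and fact-based branches so that each branch attends to its preferred words, which is the preference-learning role the abstract assigns to HDGCN.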

CLC Number: TP183
[1]ALLCOTT H,GENTZKOW M.Social Media and Fake News in the 2016 Election [J].Journal of Economic Perspectives,2017,31(2):211-236.
[2]NAEEM S B,BHATTI R.The Covid ‘infodemic':a new front for information professionals [J].Health Information & Libraries Journal,2020,73(19):13-16.
[3]DOU Y T,SHU K,XIA C Y,et al.User Preference-aware Fake News Detection [C]//Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.2021.
[4]LU Y J,LI C T.GCAN:Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media [C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.2020.
[5]JIANG S,CHEN X T,ZHANG L M,et al.User Characteristic Enhanced Model for Fake News Detection in Social Media [C]//NLPCC.2019.
[6]CASTILLO C,MENDOZA M,POBLETE B.Information Credibility on Twitter [C]//Proceedings of the 20th International Conference on World Wide Web(Hyderabad,India).2011.
[7]VOLKOVA S,SHAFFER K,JANG J Y,et al.Separating Facts from Fiction:Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(Volume 2:Short Papers).2017:647-653.
[8]PRZYBYLA P.Capturing the Style of Fake News [C]//Proceedings of the AAAI Conference on Artificial Intelligence.AAAI,2020:490-497.
[9]NAN Q,CAO J,ZHU Y,et al.MDFEND:Multi-domain Fake News Detection [C]//Proceedings of the 30th ACM International Conference on Information and Knowledge Management.2021.
[10]POPAT K,MUKHERJEE S,ATES A Y,WEIKUM G.DeClarE:Debunking fake news and false claims using evidence-aware deep learning [C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.2018:22-32.
[11]VO N,LEE K.Learning from fact-checkers:Analysis and generation of fact-checking language [C]//Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval.2019:335-344.
[12]POTTHAST M,KIESEL J,REINARTZ K,BEVENDORFF J,STEIN B.A stylometric inquiry into hyperpartisan and fake news [J].arXiv:1702.05638,2017.
[13]CUI L,SEO H,TABAR M,et al.DETERRENT:Knowledge Guided Graph Attention Network for Detecting Healthcare Misinformation[C]//The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining(KDD '20).ACM,2020:492-502.
[14]THORNE J,VLACHOS A,COCARASCU O,et al.The Fact Extraction and VERification(FEVER) Shared Task [C]//Proceedings of the First Workshop on Fact Extraction and VERification(FEVER)(Brussels,Belgium).Association for Computational Linguistics,2018:1-9.
[15]POPAT K,MUKHERJEE S,ATES A Y,et al.DeClarE:Debunking fake news and false claims using evidence-aware deep learning [C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.2018:22-32.
[16]MA J,GAO W,JOTY S,et al.Sentence-Level Evidence Embedding for Claim Verification with Hierarchical Attention Networks [C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:2561-2571.
[17]WU L,RAO Y,YANG X,et al.Evidence-Aware Hierarchical Interactive Attention Networks for Explainable Claim Verification [C]//Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence IJCAI-PRICAI-20.2020:1388-1394.
[18]VO N,LEE K.Hierarchical Multi-head Attentive Network for Evidence-aware Fake News Detection [C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics:Main Volume.2021:965-975.
[19]SHENG Q,ZHANG X,CAO J,et al.Integrating Pattern- and Fact-based Fake News Detection via Model Preference Learning [C]//ACM International Conference on Information and Knowledge Management.2021:69-78.
[20]KANG Z,CAO Y,SHANG Y.Fake News Detection with Heterogenous Deep Graph Convolutional Network [C]//Pacific-Asia Conference on Knowledge Discovery and Data Mining.2021:408-420.
[21]ZHANG X,CAO J,LI X,et al.Mining Dual Emotion for Fake News Detection [C]//Proceedings of the Web Conference 2021.Association for Computing Machinery,2021:3465-3476.
[22]JIAO Z,SUN S,SUN K.Chinese Lexical Analysis with Deep Bi-GRU-CRF Network [J].arXiv:1807.01882,2018.
[23]YANG,YU D,ZHANG F,et al.TexSmart:A System for Enhanced Natural Language Understanding [C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing.2021:1-10.
[24]MILLER G A.WordNet:a lexical database for English [J].Communications of the ACM,1995,38(11):39-41.
[25]DONG Z D,DONG Q.HowNet-a hybrid language and knowledge resource [C]//Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering.2003.
[26]HU L,YANG T,SHI C,et al.Heterogeneous graph attention networks for semi-supervised short text classification [C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing.2019.
[27]DEVLIN J,CHANG M,LEE K,et al.BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding [C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2019:4171-4186.
[28]WANG S,LIANG D,SONG J,et al.DABERT:Dual Attention Enhanced BERT for Semantic Matching [C]//Proceedings of the 2022 International Conference on Computational Linguistics.2022:325-335.
[29]GRAVES A,SCHMIDHUBER J.Framewise phoneme classification with bidirectional LSTM and other neural network architectures [J].Neural Networks,2005,18(5/6):602-610.
[30]HOSSEINI M,SABET A J.Interpretable Fake News Detection with Topic and Deep Variational Models [J].Online Social Networks and Media,2022,6(3):66-78.
[31]XU W,WU J.Evidence-aware Fake News Detection with Graph Neural Networks [C]//Proceedings of the ACM Web Conference 2022(WWW '22).2022.
[32]WOLF T,DEBUT L,SANH V,et al.Transformers:State-of-the-Art Natural Language Processing [C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing:System Demonstrations.2020:38-45.