Computer Science ›› 2026, Vol. 53 ›› Issue (6A): 251200031-16. doi: 10.11896/jsjkx.251200031

• Artificial Intelligence •

Quantitative Diagnostic Model for Generative AI Information Cocoons Based on Multidimensional Semantic Contrastive Learning

LI Xiao, SUN Xinyu   

  1. College of Law and Information Management, China University of Political Science and Law, Beijing 102249, China
  • Online: 2026-06-16 Published: 2026-06-12
  • About author: LI Xiao, born in 1986, Ph.D., associate professor. His main research interests include legal AI and big data in criminal psychology.
  • Supported by:
    National Key R&D Program of China (2022YFC3303000, 2022YFC3303001), University-Industry Cooperation and Collaborative Education Program of the Ministry of Education of China (241100007150419), and Young Scholars Academic Innovation Research Team Program at China University of Political Science and Law (25CSTD01).

Abstract: This paper proposes GCI-MVP, a theory-driven multidimensional diagnostic model that quantifies information cocoon risks in generative AI dialogues. Unlike traditional recommender systems, generative AI exhibits “unidirectional compliance”. This tendency risks constructing closed cognitive spaces through human-AI co-construction, manifested as topic narrowing, opinion homogenization, and cognitive frame repetition. Existing studies remain largely theoretical and lack fine-grained, interpretable diagnostic tools for dynamic conversational flows. GCI-MVP addresses this gap through three synergistic innovations. First, it establishes a theory-computable-explainable triadic paradigm that encodes three core information cocoon mechanisms (cognitive selection bias, machine cognitive resonance, and cognitive frame repetition) into learnable neural computation units: topic prototypes, semantic anchors, and frame probes. Second, it synthesizes three interpretable metrics, namely the topic cocoon index (TCI), the semantic uniformity index (SUI), and the frame repetition index (FRI), via a multi-branch diagnostic architecture. Third, it enables end-to-end risk assessment through a theory-driven linear fusion layer. Experiments on real-world Chinese dialogues (WildChat) demonstrate that GCI-MVP effectively diagnoses information cocoon risks at different levels, achieving 85.4% accuracy and a macro F1-score of 0.825 in three-level risk classification. It significantly outperforms LDA-based topic diversity, lexical diversity, and fine-tuned BERT baselines. Bootstrap tests and McNemar's test jointly confirm the statistical significance of this advantage (p<0.001). Systematic ablation studies validate the necessity of each diagnostic dimension, and typical case analyses further reveal strong alignment between the proposed metrics and theoretical mechanisms such as “cognitive selection bias” and “machine cognitive resonance”. Cross-model generalization tests on GLM-4, ERNIE 4.0, and BLOOMZ-7B demonstrate that the model achieves stable diagnosis without fine-tuning (macro F1=0.803), exhibiting strong generalizability. GCI-MVP provides a computable, interpretable, and auditable tool for assessing cognitive safety risks in generative AI, offering important theoretical value and application prospects for building secure and trustworthy human-AI dialogue systems.
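
As a rough illustration of the linear fusion step described in the abstract, the sketch below combines the three indices (TCI, SUI, FRI) into a single score and maps it to one of three risk levels. The weights, bias, thresholds, the [0, 1] range of the indices, and all function and class names are hypothetical; the abstract does not report the actual fusion parameters or cut-offs, so this should be read as one possible shape of such a layer rather than the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical parameters for illustration only; the abstract does not
# report the learned fusion weights or the risk-level thresholds.
WEIGHTS = {"tci": 0.4, "sui": 0.3, "fri": 0.3}
BIAS = 0.0
LOW_MAX = 0.33      # scores below this are labeled "low"
MEDIUM_MAX = 0.66   # scores below this (and >= LOW_MAX) are "medium"


@dataclass
class Indices:
    """Per-dialogue diagnostic indices, assumed here to lie in [0, 1]."""
    tci: float  # topic cocoon index
    sui: float  # semantic uniformity index
    fri: float  # frame repetition index


def fuse(ix: Indices) -> float:
    """Weighted linear combination of the three indices plus a bias."""
    return (BIAS
            + WEIGHTS["tci"] * ix.tci
            + WEIGHTS["sui"] * ix.sui
            + WEIGHTS["fri"] * ix.fri)


def risk_level(score: float) -> str:
    """Map a fused score to one of three risk levels by thresholding."""
    if score < LOW_MAX:
        return "low"
    if score < MEDIUM_MAX:
        return "medium"
    return "high"


if __name__ == "__main__":
    example = Indices(tci=0.9, sui=0.8, fri=0.9)
    print(risk_level(fuse(example)))
```

Because the fusion is linear, each weight can be read directly as the contribution of one diagnostic dimension to the overall score, which is what makes such a layer auditable.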

Key words: Generative AI, Information cocoons, Cognitive safety, AI alignment, Conversational risk, Interpretable evaluation

CLC Number: TP389.1