Computer Science ›› 2026, Vol. 53 ›› Issue (6): 30-38. doi: 10.11896/jsjkx.250600158

• Intelligent Education Technology •

Public Opinion Analysis in Universities Based on GNN Multimodal Fusion

LI Zhen1, ZHANG Yang2, LI Zhichao2, ZHAN Peng1, CHEN Lin1   

  1 Digital Intelligence Support Research Institute, Shandong University, Jinan 250100, China
  2 Information Technology Office, Shandong University (Weihai), Weihai, Shandong 264209, China
  • Received: 2025-06-24 Revised: 2025-09-15 Online: 2026-06-15 Published: 2026-06-09
  • About author: LI Zhen, born in 1995, postgraduate. His main research interests include multimodal analysis and artificial intelligence.
    CHEN Lin, born in 1983, Ph.D., senior engineer. His main research interest is educational informationization.
  • Supported by:
    National Natural Science Foundation of China (62276155).

Abstract: Social media platforms have become crucial sources of information for identifying campus public opinion events. However, the analysis of such events still faces distinctive challenges, including sparse domain terminology, platform-specific page structures, and diverse event types. To address these issues, this paper proposes a multimodal fusion framework based on graph neural networks (GNN). By integrating DOM topological structure, knowledge-enhanced text, and cross-modal dynamic interactions, the framework aims to give campus administrators a robust and accurate public opinion analysis tool. The approach improves robustness by mining the topological semantics of the HTML DOM together with cross-modal dynamic interaction information to obtain deeply fused feature representations. The model parses the DOM tree into a multi-level graph structure and employs a GNN to model the topological relationships between nodes. It incorporates a gated cross-modal attention module that dynamically adjusts the fusion intensity among the DOM, text, and visual modalities. In addition, a Wikipedia knowledge enhancement strategy expands short-text semantics through entity associations. Experimental results show that the proposed model improves over baseline models across multiple benchmark datasets. The framework addresses semantic fragmentation across modalities and insufficient information in short texts, providing a high-precision multimodal fusion solution for campus public opinion analysis.
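
The two structural ideas in the abstract, turning a DOM tree into a graph for neighbor aggregation and gating the mix between two modality vectors, can be illustrated with a minimal Python sketch. This is not the authors' model: the class `DomGraphBuilder`, the functions `mean_aggregate` and `gated_fuse`, the depth-based node features, and the scalar gate logit are all hypothetical stand-ins chosen for illustration, and nothing here is learned.

```python
import math
from html.parser import HTMLParser

# Elements that never have children or a closing tag (assumed subset).
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class DomGraphBuilder(HTMLParser):
    """Collects DOM elements as nodes and parent-child links as edges."""

    def __init__(self):
        super().__init__()
        self.tags = []    # node id -> tag name
        self.depths = []  # node id -> depth in the DOM tree
        self.edges = []   # (parent id, child id)
        self._stack = []  # ids of currently open elements

    def handle_starttag(self, tag, attrs):
        node_id = len(self.tags)
        self.tags.append(tag)
        self.depths.append(len(self._stack))
        if self._stack:
            self.edges.append((self._stack[-1], node_id))
        if tag not in VOID_TAGS:
            self._stack.append(node_id)

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag name, if any.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.tags[self._stack[i]] == tag:
                del self._stack[i:]
                break


def mean_aggregate(features, edges):
    """One round of mean aggregation over undirected edges with self-loops."""
    neighbors = [[i] for i in range(len(features))]
    for parent, child in edges:
        neighbors[parent].append(child)
        neighbors[child].append(parent)
    dim = len(features[0])
    return [
        [sum(features[j][k] for j in nbrs) / len(nbrs) for k in range(dim)]
        for nbrs in neighbors
    ]


def gated_fuse(vec_a, vec_b, gate_logit):
    """Blend two modality vectors with a sigmoid gate.

    Assumption: the gate is a fixed scalar here; a trained model would
    compute it from the inputs.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * a + (1.0 - gate) * b for a, b in zip(vec_a, vec_b)]


html = "<div><p>Exam notice</p><img src='a.png'><span>Reply</span></div>"
builder = DomGraphBuilder()
builder.feed(html)

# Toy node features: [depth, constant bias term].
features = [[float(d), 1.0] for d in builder.depths]
updated = mean_aggregate(features, builder.edges)
fused = gated_fuse([1.0, 0.0], [0.0, 1.0], gate_logit=0.0)
```

In a real pipeline the toy features would be replaced by learned text and visual embeddings per node, and the aggregation and gate would carry trainable weights.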

Key words: Multimodal analysis, Knowledge enhancement, Graph neural network, Cross-modal fusion, HTML DOM

CLC Number: 

  • TP313