Deep Learning-based Method for Mining Ocean Hot Spot News

doi:10.11896/jsjkx.231200005

Abstract: The rapid development of the mobile Internet and the spread of modern mobile clients have driven vigorous growth in online news, social media, and self-media, providing users with rich and diverse information. With the steady advancement of China's maritime power strategy and a marked rise in national maritime awareness, ocean-related information now floods the Internet: media reports and public opinion proliferate online, and hotspot events occur frequently. To handle this multi-source, multi-attribute online marine information, an automatic deep learning-based ocean hot news mining system is proposed that builds on multi-source text clustering and automatic summarization. The system comprises five functional modules: automatic collection of multi-source ocean-related data, data preprocessing, feature extraction, text clustering, and automatic summarization. Specifically, a web crawler collects diverse and scattered ocean data from multiple sources, automatically structures it, and stores it in a database. Clustering analysis is then performed based on the similarity of text features and the relationships between texts, providing data support for subsequent summary generation and topic discovery. In addition, an automatic summary generation method for ocean news is proposed that leverages the contextual understanding and expressive power of pre-trained language models. Multiple experiments demonstrate the effectiveness of the proposed method on each evaluation metric and highlight its advantage in mining news from multi-source heterogeneous networks. The method offers a feasible solution for processing scattered marine information and generating more readable content summaries, contributing to more efficient marine information retrieval, monitoring of public opinion trends, and the application and dissemination of marine information.
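The feature extraction, clustering, and summarization stages described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract specifies deep learning features and a pre-trained language model for summary generation, whereas this sketch substitutes TF-IDF vectors, single-pass threshold clustering, and an extractive "most central headline" as the summary. All function names, the threshold value, and the sample headlines are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (a stand-in for real preprocessing)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF, so a term present in every document keeps a nonzero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(vectors, threshold=0.2):
    """Single-pass clustering: join the most similar cluster or open a new one."""
    clusters = []  # each cluster: {"members": [doc index], "centroid": summed vector}
    for i, v in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(v, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"members": [i], "centroid": dict(v)})
        else:
            best["members"].append(i)
            # Cosine ignores scale, so a running sum serves as the centroid.
            for t, w in v.items():
                best["centroid"][t] = best["centroid"].get(t, 0.0) + w
    return clusters


def mine_hot_news(docs, threshold=0.2):
    """Cluster docs and return (member indices, representative text) per cluster."""
    vectors = tfidf_vectors(docs)
    results = []
    for c in cluster(vectors, threshold):
        # Extractive stand-in for summarization: the text closest to the centroid.
        rep = max(c["members"], key=lambda i: cosine(vectors[i], c["centroid"]))
        results.append((c["members"], docs[rep]))
    return results


if __name__ == "__main__":
    sample = [
        "Typhoon approaches the coast and fishing ports close",
        "Fishing ports close as typhoon nears the coast",
        "New deep sea research vessel launched for ocean survey",
        "Ocean survey begins with new deep sea research vessel",
    ]
    for members, representative in mine_hot_news(sample):
        print(members, "->", representative)
```

In a system like the one the abstract describes, the TF-IDF step would presumably be replaced by embeddings from a pre-trained language model and the representative-headline step by generated summaries; the surrounding control flow (vectorize, group by similarity, summarize each group) would stay broadly the same.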

Key words: Ocean news, Text clustering, Automatic summarization, Deep learning, Natural language processing, Pre-trained model

CLC Number: TP391

QIN Xianping, DING Zhaoxu, ZHONG Guoqiang, WANG Dong. Deep Learning-based Method for Mining Ocean Hot Spot News[J]. Computer Science, 2024, 51(11A): 231200005-10.
