Computer Science, 2024, Vol. 51, Issue (5): 208-215. doi: 10.11896/jsjkx.230200131

• Artificial Intelligence •

Multilingual Event Detection Based on Cross-level and Multi-view Features Fusion

ZHANG Zhiyuan, ZHANG Weiyan, SONG Yuqiu, RUAN Tong   

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2023-02-19 Revised: 2023-06-21 Online: 2024-05-15 Published: 2024-05-08
  • About author: ZHANG Zhiyuan, born in 1998, postgraduate. His main research interests include natural language processing and multilingual pre-trained models.
    RUAN Tong, born in 1973, professor, Ph.D. supervisor. Her main research interests include text extraction, knowledge graphs, and data quality assessment.

Abstract: The multilingual event detection task aims to organize a collection of news documents in multiple languages into key events, where each event may contain documents in different languages. The task supports downstream applications such as multilingual knowledge graph construction, event reasoning, and information retrieval. Existing approaches fall into two groups: those that translate the documents first and then detect events, and those that detect events within each language first and then align them across languages. The former depends on translation quality, while the latter requires a separately trained model for each language. To address these limitations, this paper proposes a multilingual event detection method based on cross-level, multi-view feature fusion that performs the task end to end. The method draws on multi-view features of documents at different levels to make detection more reliable, and it improves generalization when detecting events in low-resource languages. Experiments on a news dataset mixing nine languages show that the proposed method improves the BCubed F1 value by 4.63%.
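For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of the standard BCubed precision, recall, and F1 computation for a document clustering. It is not taken from the paper; the function name `bcubed_f1`, the dictionary-based input format, and the document identifiers are assumptions made for illustration.

```python
from collections import Counter


def bcubed_f1(predicted, gold):
    """Return (precision, recall, f1) under the standard BCubed definition.

    predicted: dict mapping document id -> predicted cluster label
    gold:      dict mapping document id -> gold event label
    (Hypothetical input format; the paper does not specify one.)
    """
    if not gold or predicted.keys() != gold.keys():
        raise ValueError("predicted and gold must cover the same non-empty document set")

    cluster_sizes = Counter(predicted.values())  # size of each predicted cluster
    event_sizes = Counter(gold.values())         # size of each gold event
    # Number of documents sharing both a predicted cluster and a gold event.
    overlap = Counter((predicted[d], gold[d]) for d in gold)

    n = len(gold)
    # Per-document precision: share of the document's cluster that has its gold event.
    precision = sum(
        overlap[(predicted[d], gold[d])] / cluster_sizes[predicted[d]] for d in gold
    ) / n
    # Per-document recall: share of the document's gold event found in its cluster.
    recall = sum(
        overlap[(predicted[d], gold[d])] / event_sizes[gold[d]] for d in gold
    ) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative toy data: five documents in three languages, two gold events.
gold_events = {"en-1": "A", "es-1": "A", "zh-1": "A", "en-2": "B", "de-1": "B"}
predicted_clusters = {"en-1": 0, "es-1": 0, "zh-1": 1, "en-2": 1, "de-1": 1}
p, r, f = bcubed_f1(predicted_clusters, gold_events)
```

Because precision and recall are averaged per document rather than per cluster, a cluster that mixes languages is scored the same way as a monolingual one, so only event membership matters.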

Key words: Multilingual pre-training model, Multilingual event detection, News documents clustering, Weighted similarity, Incremental clustering

CLC Number: TP391