Multilingual Event Detection Based on Cross-level and Multi-view Features Fusion

doi:10.11896/jsjkx.230200131

Abstract

The multilingual event detection task aims to organize a collection of news documents in multiple languages into key events, where each event may contain news documents in different languages. The task supports various downstream applications, such as multilingual knowledge graph construction, event reasoning, and information retrieval. At present, multilingual event detection follows two main approaches: translating first and then detecting events, or detecting events in each language first and then aligning them across languages. The former depends on translation quality, while the latter requires training a separate model for each language. To address this, this paper proposes a multilingual event detection method based on cross-level multi-view feature fusion, which performs multilingual event detection end to end. The method uses multi-view features of documents from different levels to obtain high reliability, and it improves the generalization performance of event detection for low-resource languages. Experiments on a news dataset mixing nine languages show that the proposed method improves the BCubed F1 value by 4.63%.
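For readers unfamiliar with the BCubed F1 metric reported above, the following is a minimal sketch of its standard definition for clustering evaluation. The function name `bcubed_f1` and the dictionary-based input format are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def bcubed_f1(predicted, gold):
    """Return (precision, recall, F1) under the standard BCubed definition.

    predicted, gold: dicts mapping each document id to a cluster label.
    Both dicts are assumed (for this sketch) to cover the same documents.
    """
    docs = list(gold)
    pred_sizes = Counter(predicted[d] for d in docs)
    gold_sizes = Counter(gold[d] for d in docs)
    # Number of documents sharing both the predicted and the gold label.
    overlap = Counter((predicted[d], gold[d]) for d in docs)

    # Per-document precision: share of the predicted cluster that truly
    # belongs to the same gold event; recall is the symmetric quantity.
    precision = sum(
        overlap[(predicted[d], gold[d])] / pred_sizes[predicted[d]] for d in docs
    ) / len(docs)
    recall = sum(
        overlap[(predicted[d], gold[d])] / gold_sizes[gold[d]] for d in docs
    ) / len(docs)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: one gold event is split across two predicted clusters.
gold = {"a": "e1", "b": "e1", "c": "e1", "d": "e2"}
pred = {"a": 0, "b": 0, "c": 1, "d": 1}
p, r, f = bcubed_f1(pred, gold)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.75 0.667 0.706
```

Because precision and recall are averaged over documents rather than over clusters, large events weigh more than small ones in the final score.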

Key words: Multilingual pre-training model, Multilingual event detection, News documents clustering, Weighted similarity, Incremental clustering
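The key words name weighted similarity and incremental clustering. As a rough illustration of how these two ideas can combine in general, the sketch below assigns each incoming document to the most similar existing cluster, or opens a new one, using a weighted sum of per-view cosine similarities. It is a generic single-pass scheme and not the authors' method: the view names (`title`, `body`), the weights, the threshold, and all function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def weighted_similarity(doc, cluster, weights):
    """Weighted sum of per-view cosine similarities (weights are assumed)."""
    return sum(w * cosine(doc[view], cluster[view]) for view, w in weights.items())


def incremental_cluster(docs, weights, threshold):
    """Single-pass clustering: join the best cluster if similar enough,
    otherwise start a new cluster. Returns one cluster index per document."""
    clusters = []  # each cluster stores the running sum of member vectors per view
    labels = []
    for doc in docs:
        best, best_sim = None, threshold
        for cid, cluster in enumerate(clusters):
            sim = weighted_similarity(doc, cluster, weights)
            if sim >= best_sim:
                best, best_sim = cid, sim
        if best is None:
            clusters.append({view: list(doc[view]) for view in weights})
            labels.append(len(clusters) - 1)
        else:
            # Cosine is scale-invariant, so the summed vector acts as a centroid.
            for view in weights:
                clusters[best][view] = [
                    s + x for s, x in zip(clusters[best][view], doc[view])
                ]
            labels.append(best)
    return labels


# Toy vectors standing in for multilingual embeddings of two document views.
docs = [
    {"title": [1.0, 0.0], "body": [1.0, 0.0]},
    {"title": [0.9, 0.1], "body": [1.0, 0.1]},
    {"title": [0.0, 1.0], "body": [0.0, 1.0]},
]
labels = incremental_cluster(docs, weights={"title": 0.4, "body": 0.6}, threshold=0.8)
print(labels)  # [0, 0, 1]
```

In a sketch like this, the threshold controls the trade-off between splitting one event across clusters and merging distinct events; in practice it would be tuned on held-out data.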

CLC Number: TP391

ZHANG Zhiyuan, ZHANG Weiyan, SONG Yuqiu, RUAN Tong. Multilingual Event Detection Based on Cross-level and Multi-view Features Fusion [J]. Computer Science, 2024, 51(5): 208-215.

