Computer Science, 2024, 51(12): 30-36. doi: 10.11896/jsjkx.240300025

• Integration of Digital Twin Network and Artificial Intelligence •

Deep Contrastive Siamese Network Based Repeated Event Identification

LI Zichen1, YI Xiuwen2,3, CHEN Shun1,2,3, ZHANG Junbo1,2,3, LI Tianrui1   

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
    2. JD Intelligent Cities Research, Beijing 100176, China
    3. JD Intelligent Cities Technology Co., Ltd., Beijing 100176, China
  • Received: 2024-03-04  Revised: 2024-07-17  Online: 2024-12-15  Published: 2024-12-10
  • About author: LI Zichen, born in 1997, postgraduate. His main research interests include urban computing and deep learning.
    YI Xiuwen, born in 1991, Ph.D., data scientist, researcher, is a senior member of CCF (No. 45025M). His main research interests include spatio-temporal data mining and deep learning.
  • Supported by:
    National Key R&D Program of China (2023YFC2308703) and Beijing Nova Program (Z211100002121119).

Abstract: In China, citizens can report issues they encounter in daily life and seek assistance from the government by calling the 12345 citizen hotline. However, many events are reported multiple times, which places significant pressure on the staff responsible for event allocation, lowers the efficiency of event disposal, and wastes public resources. Identifying repeated events requires precise analysis of textual semantics and contextual relationships. To address this problem, this paper proposes a repeated event identification method based on a deep contrastive siamese network, which identifies events with the same demand by evaluating the similarity between event descriptions. First, the method reduces the number of candidate events through retrieval and filtering. Then, it fine-tunes a pre-trained BERT model through contrastive learning to learn discriminative semantic representations of event descriptions. Finally, it introduces the event title as contextual information and uses a siamese network with a classifier to identify repeated events. Experimental results on the 12345 event dataset of Nantong demonstrate that the proposed method outperforms baseline methods across various evaluation metrics, particularly on the F0.5 score, which fits the precision-oriented nature of the repeated event identification scenario. The proposed method can effectively identify repeated events and improve the efficiency of event handling.
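To make the pipeline described in the abstract concrete, the following Python sketch (PyTorch with the HuggingFace transformers library) illustrates its two learning components under stated assumptions: a BERT encoder fine-tuned with a SimCSE-style in-batch contrastive objective, and a siamese matcher whose classifier combines two sentence embeddings to decide whether a pair of event descriptions reports the same demand. The checkpoint name, the way titles are concatenated with descriptions, the pair-feature layout, and all hyperparameters are illustrative assumptions, not the authors' released implementation.

# Minimal sketch: contrastive fine-tuning of a BERT encoder plus a siamese
# classifier for repeated-event identification. All names and settings below
# are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "bert-base-chinese"  # assumption: any Chinese BERT checkpoint could be used
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)


class EventEncoder(nn.Module):
    """BERT encoder producing a sentence embedding via mean pooling."""
    def __init__(self):
        super().__init__()
        self.bert = AutoModel.from_pretrained(MODEL_NAME)

    def forward(self, batch):
        out = self.bert(**batch)
        hidden = out.last_hidden_state                        # [B, L, H]
        mask = batch["attention_mask"].unsqueeze(-1).float()  # [B, L, 1]
        return (hidden * mask).sum(1) / mask.sum(1)           # mean-pooled [B, H]


def contrastive_loss(z1, z2, temperature=0.05):
    """SimCSE-style in-batch InfoNCE: the i-th rows of z1 and z2 are positive pairs."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    sim = z1 @ z2.T / temperature                             # [B, B] similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(sim, labels)


class SiameseMatcher(nn.Module):
    """Shared encoder plus a classifier over the pair features (u, v, |u - v|)."""
    def __init__(self, encoder, hidden=768):
        super().__init__()
        self.encoder = encoder
        self.classifier = nn.Sequential(
            nn.Linear(hidden * 3, hidden), nn.ReLU(), nn.Linear(hidden, 2)
        )

    def forward(self, batch_a, batch_b):
        u = self.encoder(batch_a)
        v = self.encoder(batch_b)
        feats = torch.cat([u, v, torch.abs(u - v)], dim=-1)
        return self.classifier(feats)                         # logits: repeated / not repeated


def encode_event(title, description):
    """Pair the event title (as context) with the description; an assumed input format."""
    return tokenizer(title, description, truncation=True, padding=True,
                     max_length=256, return_tensors="pt")

In use, retrieval and filtering (for example, restricting candidates to events from a similar period or district) would first shrink the candidate set; contrastive_loss would then fine-tune the shared encoder on description pairs, and SiameseMatcher would score the surviving candidate pairs to flag repeated events.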

Key words: 12345 hotline, Repeated event dispatch, Contrastive learning, Siamese network, Urban computing

CLC Number: TP399