Computer Science (计算机科学) ›› 2025, Vol. 52 ›› Issue (11A): 241200035-8. doi: 10.11896/jsjkx.241200035
张硕, 季铎
ZHANG Shuo, JI Duo
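Abstract: With the wide application of big data technology in public security, faster response to police incidents has become one of the core goals driving the modernization and efficient operation of public security work. A rapid incident response system replaces traditional manual dispatch with an automatic dispatch mechanism, which depends fundamentally on the model's accurate recognition of incident addresses. However, police incident addresses differ markedly from ordinary addresses in their characteristics, and existing commercial address matching models often adapt poorly to them. To address this problem, an improved method that combines address hierarchy and pinyin information is proposed, aiming to replace traditional deep learning algorithms and to overcome the limitations of commercial address computation models in recognizing incident addresses. The method is optimized for the characteristics of Chinese police incident addresses: special phrases, multi-level address structure, homophones written with different characters, and typos. Using a pre-trained model, data augmentation, address grading, and pinyin information encoding, the study builds and trains an efficient model dedicated to computing the similarity of police incident addresses, significantly improving recognition accuracy and adaptability for Chinese police incident addresses.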
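The idea of comparing addresses level by level while tolerating homophones can be illustrated with a toy sketch. This is not the model described in the abstract, which is a trained neural model; it is a rule-based illustration under stated assumptions. The pinyin table `TOY_PINYIN`, the level suffixes and weights in `LEVEL_WEIGHTS`, and all function names are hypothetical and chosen only for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy pinyin table (assumption); a real system would need
# a full character-to-pinyin dictionary.
TOY_PINYIN = {
    "和": "he", "河": "he", "平": "ping", "坪": "ping",
    "青": "qing", "清": "qing", "年": "nian",
}

# Hypothetical level suffixes and weights (assumption): district, road, number.
LEVEL_WEIGHTS = {"区": 0.3, "路": 0.5, "号": 0.2}


def split_levels(addr):
    """Split an address into {suffix: name} using the level suffix characters."""
    parts = {}
    for seg in re.findall(r"[^区路号]+[区路号]", addr):
        parts[seg[-1]] = seg[:-1]
    return parts


def to_pinyin(text):
    """Map each character to its toy pinyin; unknown characters pass through."""
    return [TOY_PINYIN.get(ch, ch) for ch in text]


def level_similarity(a, b):
    """Take the better of character-level and pinyin-level similarity."""
    char_sim = SequenceMatcher(None, a, b).ratio()
    pinyin_sim = SequenceMatcher(None, to_pinyin(a), to_pinyin(b)).ratio()
    return max(char_sim, pinyin_sim)


def address_similarity(a, b):
    """Weighted sum of per-level similarities over levels present in both."""
    pa, pb = split_levels(a), split_levels(b)
    total = 0.0
    for suffix, weight in LEVEL_WEIGHTS.items():
        if suffix in pa and suffix in pb:
            total += weight * level_similarity(pa[suffix], pb[suffix])
    return total
```

In this sketch, an address whose district and road names are written with homophonous but wrong characters still scores as a full match, because each level is also compared in pinyin, whereas a genuinely different road name lowers the score by that level's weight.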