Computer Science, 2025, Vol. 52, Issue (11A): 241200035-8. doi: 10.11896/jsjkx.241200035

• Artificial Intelligence •

Calculation of Police Incident Address Similarity Based on Fusion Model

ZHANG Shuo, JI Duo

  1. School of Public Security Technology and Information, Criminal Investigation Police University of China, Shenyang 110000, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: JI Duo (18640037173@163.com)
  • About author: 2926637201@qq.com
  • Supported by:
    Liaoning Collaborative Innovation Center for Cybersecurity Law Enforcement (CSLE).

Abstract: With the widespread application of big data technology in public security, faster police response has become one of the core goals driving the modernization and efficient operation of public security work. Rapid response systems for police incidents replace traditional manual dispatch with automatic dispatch, and they depend on a model that accurately identifies police incident addresses. However, police incident addresses differ significantly from ordinary addresses in their features, and existing commercial address matching models often adapt poorly to them. To address this problem, this paper proposes an improved method that combines address grading and pinyin information, aiming to replace traditional deep learning algorithms and to overcome the limitations of commercial address calculation models in police address recognition. The method is optimized for the special phrases, multi-level address structure, homophones, and misspellings found in Chinese police incident addresses. Using pre-trained models, data augmentation, address grading, and pinyin information encoding, the paper builds and trains an efficient model dedicated to police incident address similarity calculation, significantly improving recognition accuracy and adaptability for Chinese police incident addresses.

Key words: Police address, Address grading, Pinyin, Deep learning, Pre-training, Data augmentation

CLC Number:

  • TP391
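
The abstract names two ideas, address grading and pinyin information, without showing how they might interact. The sketch below is only an illustration of that intuition, not the authors' model: it swaps the paper's pre-trained neural encoder for plain string matching from Python's standard library. The level suffixes, the `TOY_PINYIN` table, the fusion weight, and all function names are assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Toy character-to-pinyin table (an assumption for illustration only);
# a real system would use a full pinyin dictionary or library.
TOY_PINYIN = {
    "黄": "huang", "皇": "huang", "河": "he", "和": "he",
    "路": "lu", "沈": "shen", "阳": "yang", "市": "shi",
    "区": "qu", "姑": "gu", "号": "hao",
}

# Hypothetical level suffixes: city, district, road, house number.
LEVEL_PATTERN = re.compile(r"[^市区路号]+[市区路号]")


def split_levels(address: str) -> list[str]:
    """Split an address into coarse-to-fine segments ending in a level suffix."""
    return LEVEL_PATTERN.findall(address)


def to_pinyin(text: str) -> str:
    """Map each character to toy pinyin; unknown characters pass through."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in text)


def segment_similarity(a: str, b: str, pinyin_weight: float = 0.5) -> float:
    """Fuse character-level and pinyin-level string similarity."""
    char_sim = SequenceMatcher(None, a, b).ratio()
    pinyin_sim = SequenceMatcher(None, to_pinyin(a), to_pinyin(b)).ratio()
    return (1 - pinyin_weight) * char_sim + pinyin_weight * pinyin_sim


def address_similarity(a: str, b: str) -> float:
    """Average segment similarity over aligned levels; unmatched levels score 0."""
    levels_a, levels_b = split_levels(a), split_levels(b)
    n = max(len(levels_a), len(levels_b))
    if n == 0:
        return 0.0
    total = sum(segment_similarity(x, y) for x, y in zip(levels_a, levels_b))
    return total / n
```

In this toy setup, a road name written with homophone typos (for example 皇和路 for 黄河路) scores higher under the fused measure than under character matching alone, because the two spellings map to the same pinyin. Comparing level by level also keeps a mismatch in one segment from affecting the score of the others.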