计算机科学 ›› 2016, Vol. 43 ›› Issue (2): 26-30.doi: 10.11896/j.issn.1002-137X.2016.02.005

• 目次 • 上一篇    下一篇

一种基于编码关联的快速多模式匹配算法

朱永强,秦志光   

  1. 电子科技大学计算机科学与工程学院 成都611731,电子科技大学计算机科学与工程学院 成都611731
  • 出版日期:2018-12-01 发布日期:2018-12-01

Multi-pattern Matching Algorithm Based on Coding Association

ZHU Yong-qiang and QIN Zhi-guang   

  • Online:2018-12-01 Published:2018-12-01

摘要: 多模式匹配算法经常使用有限自动状态机来实现多个模式串的并行匹配。针对基于自动状态机的多模式匹配算法在应用于中文编码时存在的存储空间膨胀问题,使用中文字符的拆分编码构造自动状态机,以优化算法自动状态机的存储空间,并利用中文编码的编码关联性,设计了一种基于编码关联跳转的失效跳转表,使用启发式跳跃规则提升匹配算法的时间性能。最后通过实验证明,中文编码环境下,相比于其它使用自动状态机的多模式匹配算法,改良算法拥有更小的空间消耗与更快的运行速度。

关键词: 多模式匹配,DFSA算法,WM算法,DFSA-QS算法,编码关联

Abstract: Multi-pattern matching algorithms often use finite state automaton to implement parallel matching of multiple pattern strings.When multi-pattern matching algorithm based on finite state automaton is applied into the Chinese,it will lead to storage space expansion.Aiming to solve this problem,this improved algorithm constructs automatic state machine by using split coding of Chinese characters to save storage space and designs failure jump table based on coding association,and uses heuristic jumping rules to improve time performance of matching.Finally,compared to other algorithm,smaller space consumption and faster speed in Chinese environment of this improved algorithm were proved by simulation.

Key words: Multi-pattern matching,DFSA algorithm,WM algorithm,DFSA-QS algorithm,Coding association

[1] Aho A V,Corasick M J.Efficient String Matching:An Aid to Bibliographic Search[J].Communication of the ACM,1975,18(6):333-340
[2] Wang Pei-feng,Li Li.Research on multi-pattern matching algorithms based on Aho-Corasick algorithm[J].Application Research of Computers,2011,8(4):1251-1254(in Chinese) 王培凤,李莉.基于Aho-Corasick算法的多模式匹配算法研究[J].计算机应用研究,2011,8(4):1251-1254
[3] Wu Sun,Udi M.A fast algorithm for multi-pattern searching:TR 94-17[R].University of Arizona at Tuscon,1994
[4] Wu Sun,Udi M.Agrep-A Fast Approximate Pattern Matchin Tool[C]∥Usenix Winter Technical Conference.1992:153-162
[5] Wu Sun,Udi M.GLIMPSE:A Tool to Search Through Entire FileSystem[C]∥Usenix Winter Technical Conferenee.1994
[6] Liu Wei-guo,Hu Yong-gang.DHSWM:An improved multi-pattern matching algorithm based on WM algorithm[J].Journal of Central South University(Science and Technology),2011,2(12):3765-3771(in Chinese) 刘卫国,胡勇刚.DHSWM:一种改进的WM多模式匹配算法[J].中南大学学报(自然科学版),2011,2(12):3765-3771
[7] Wang Yong-cheng,Shen Zhou,Xu Yi-zhen.Improved Algo-rithms for Matching Multiple Patterns[J].Journal of Computer Research and Development,2002,9(1):55-60(in Chinese) 王永成,沈州,许一震.改进的多模式匹配算法[J].计算机研究与发展,2002,9(1):55-60
[8] Wang Yong-jin,Gu Nai-jie,Ren Chang-xin.A Word-lengthBased Wu-manber Multi-pattern Matching Algorithm[J].Journal of Chinese Computer Systems,2013,4(7):1650-1653(in Chinese) 汪永进,顾乃杰,任长新.一种按字长匹配的Wu-Manber多模式匹配算法[J].小型微型计算机系统,2013,4(7):1650-1653
[9] Liu Wei,Guo Yuan-bo,Huang Peng.Multi-Pattern MatchingEngine Based on Bloom Filter[J].Acta Electronica Sinica,2010,8(5):1095-1099(in Chinese) 刘威,郭渊博,黄鹏.基于Bloom filter的多模式匹配引擎[J].电子学报,2010,8(5):1095-1099
[10] Zhu Yong-qiang,Jiang Xue.Analysis and research of Chinesemulti-pattern matching algorithm performance[J].Computer Technology and Development,2013,4(2):67-70(in Chinese) 朱永强,江雪.中文多模式匹配算法性能的分析与研究[J].计算机技术与发展,2013,4(2):67-70
[11] Li Wei-nan,E Yue-peng,Ge Jing-guo,et al.Multi-Pattern Ma-tching Algorithms and Hardware Based Implementation[J].Journal of Software,2006,7(12):2403-2415(in Chinese) 李伟男,鄂跃鹏,葛敬国,等.多模式匹配算法及硬件实现[J].软件学报,2006,7(12):2403-2415
[12] Sun Qin-dong,Huang Xin-bo,Wang Qian.Multiple PatternMatching on Chinese/English Mixed Texts[J].Journal of Software,2008,9(3):674-686(in Chinese) 孙钦东,黄新波,王倩.面向中英文混合环境的多模式匹配算法[J].软件学报,2008,9(3):674-686
[13] Gao Chao-qin,Chen Yuan-yan,Li Mei.Fast multi-pattern mat-ching algorithm for intrusion detection[J].Journal of Computer Applications,2008,8(1):82-84(in Chinese) 高朝勤,陈元琰,李梅.一种面向入侵检测的快速多模式匹配算法[J].计算机应用,2008,8(1):82-84
[14] Shen Zhou,Wang Yong-cheng,Xu Yi-zhen.A Fast MultiplePattern Algorithm for Chinese String Matching[J].Journal of Shanghai Jiaotong University,2001,5(9):1285-1289(in Chinese) 沈州,王永成,许一震.一种面向中文的快速字串多模式匹配算法[J].上海交通大学学报,2001,5(9):1285-1289

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!