计算机科学 ›› 2013, Vol. 40 ›› Issue (11): 117-121.

• 信息安全 • 上一篇    下一篇

一种适合中文的多模式匹配算法

侯整风,杨波,朱晓玲   

  1. 合肥工业大学计算机与信息学院 合肥230009;合肥工业大学计算机与信息学院 合肥230009;合肥工业大学计算机与信息学院 合肥230009
  • 出版日期:2018-11-16 发布日期:2018-11-16
  • 基金资助:
    本文受安徽省自然科学基金(090412051),广东省教育部产学研结合项目(2008B0905002400)资助

Multiple Pattern Algorithm for Chinese

HOU Zheng-feng,YANG Bo and ZHU Xiao-ling   

  • Online:2018-11-16 Published:2018-11-16

摘要: 中文字符的相互独立性导致AC算法的时空性能急剧下降。针对此问题,对AC算法的存储结构进行了改进,提出了一种适合中文的多模式匹配算法——AC_SC算法。该算法以邻接链表存储有限状态自动机,尝试解决存储空间快速膨胀问题,并将状态“0”的长链表转化为散列链表,以提高算法的匹配效率。实验结果表明,AC_SC算法具有良好的时空性能。

关键词: 多模式匹配,AC算法,邻接链表,有限状态自动机

Abstract: Because of the independent of the Chinese characters,the space and time performances of AC algorithm decline sharply.For this problem,the storage structure of AC algorithm was improved and a multi-pattern algorithm for Chinese named AC_SC was proposed.The algorithm uses adjacency-list to store the finite state automata to solve the problem of the storage space rapid expansion.Besides,the long linked list of the state ‘0’ is changed into a Hash linked table to improve the matching efficient.Experimental results show that AC_SC has better time and space performances.

Key words: Multi-pattern matching,AC algorithm,Adjacency-list,Finite state automata

[1] 杜大军,费敏锐,宋扬,等.网络控制系统的简要回顾及展望[J].仪器仪表学报,2011,3(32):713-720
[2] Wheeler D L,Barrett T,Benson D A,et al.Database resources of the National Center for Biotechnology Information [J].Nucleic Acids Research,Database issue,2007,35:5-12
[3] Knuth D E,Morris J H,Pratt V R.Fast Patter Matching inStrings[J].SIAM Journal of Computer,1977,6(2):323-350
[4] Boyer R S,Moore J S.A Fast String Searching Algorithm[J].Communications of the ACM,1977,20(10):762-772
[5] Horspool R N.Practical fast searching in strings [J].Software-Practice & Experience,1980,10(6):501-506
[6] Sunday D M.A very fast substring searching algorithm [J].Communications of the ACM,1990,33(8):132-142
[7] Aho A V,Margaret J.Corasick Efficient String Matching[J].An Aid to Biographic Search Communications of the ACM,1975,18(6):333-340
[8] Commentz-Walter B.A String Matching Algorithm Fast on the Average [J].Proceedings of 6th ICALP,1979,71:118-132
[9] Wu Sun,Manber U.Fast algorithm for multi-pattern searching[R].Tucson:Department of computer science university of Arizona,1994
[10] 王永成,沈州,许一震.改进的多模式匹配算法[J].计算机研究与发展,2002,39(1):55-60
[11] Norton M.Optimizing pattern matching for intrusion detection [EB/OL].http://does.idsreseareh.Org/OptimizingPattermMatching-ForIDS.pdf,2006-05-11
[12] Tuck N,Sherwood T,Calder B,et al.Deterministic Memory Efficient String Matching Algorithms for Intrusion Detection[C]∥IEEE Iafocom.2004:333-340
[13] 巫喜红,曾锋.AC 多模式匹配算法研究[J].计算机工程,2012,38(6):279-281
[14] 张元竞,张伟哲.一种基于位图的多模式匹配算法[J].哈尔滨工业大学学报,2010,42(2):277-280
[15] 王培凤,李莉.一种改进的多模式匹配算法在Snort中的应用[J].计算机科学,2012,39(2):72-74

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!