Computer Science ›› 2017, Vol. 44 ›› Issue (1): 128-133. doi: 10.11896/j.issn.1002-137X.2017.01.025

• Network and Communication •

Frequent Pattern Mining in Bit Stream Based on AC Algorithm

LEI Dong, WANG Tao and MA Yun-fei

  1. Department of Information Engineering, Ordnance Engineering College, Shijiazhuang 050003, China (all authors)
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    This work was supported by a military scientific research fund.

Abstract: Existing methods for mining frequent patterns in bit streams are inefficient, and their accuracy is low because user data in the stream interferes with the results. To address this problem, the paper first shows theoretically that mining short frequent patterns has inherent limitations. Based on how mining behaves at different pattern lengths, frequent patterns are divided into two types, long frequent patterns and short frequent patterns, and an algorithm for locating protocol header fields in a bit stream is proposed. Building on the AC multi-pattern matching algorithm, separate mining methods are then proposed for long and short frequent patterns, which improves the accuracy of the mining results. Simulation experiments on Ethernet traffic show that the proposed algorithms are effective.

Key words: Bit stream, AC algorithm, Long frequent pattern mining, Short frequent pattern mining
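
The abstract does not spell out the mining procedure, so the following is only a minimal sketch of the building block it names: an Aho-Corasick (AC) automaton over the binary alphabet that counts, in a single pass per frame, how often each candidate bit pattern occurs. The function names, the candidate list, and the `min_support` threshold are assumptions made for illustration and are not taken from the paper; the paper's header-field location step and its separate handling of long and short patterns are not reproduced here.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton for non-empty strings over {'0', '1'}."""
    goto = [{}]   # goto[state][bit] -> next state (trie edges)
    fail = [0]    # failure link of each state
    out = [[]]    # indices of the patterns recognised on reaching each state
    for idx, pat in enumerate(patterns):
        state = 0
        for bit in pat:
            if bit not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][bit] = len(goto) - 1
            state = goto[state][bit]
        out[state].append(idx)
    # Breadth-first pass: set failure links and inherit outputs from them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for bit, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and bit not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(bit, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def count_patterns(frames, patterns):
    """Return (total occurrences, number of frames containing) per pattern."""
    goto, fail, out = build_automaton(patterns)
    total = [0] * len(patterns)
    support = [0] * len(patterns)
    for frame in frames:
        seen = set()
        state = 0
        for bit in frame:
            # Follow failure links until the bit can be consumed.
            while state and bit not in goto[state]:
                state = fail[state]
            state = goto[state].get(bit, 0)
            for idx in out[state]:
                total[idx] += 1
                seen.add(idx)
        for idx in seen:
            support[idx] += 1
    return total, support


def frequent_patterns(frames, patterns, min_support):
    """Keep candidates found in at least min_support (a fraction) of frames."""
    _, support = count_patterns(frames, patterns)
    return [p for p, s in zip(patterns, support) if s / len(frames) >= min_support]
```

In this sketch each frame is a string of `'0'` and `'1'` characters, and overlapping occurrences are counted separately. A candidate such as an HDLC-style flag `01111110` that appears in most frames would pass the threshold, while a pattern that appears only in the payload of a single frame would not.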

