Computer Science, 2017, Vol. 44, Issue (Z11): 448-452. doi: 10.11896/j.issn.1002-137X.2017.11A.095


Research on Template Extraction Based on Large-scale Network Log

CUI Yuan and ZHANG Zhuo   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: To address the problem of extracting network events directly from large-scale network logs, a template extraction method based on large-scale network logs is proposed. The method automatically converts massive raw network logs into log templates, providing important groundwork for understanding the root causes of network events and preventing network failures. First, the structure of the logs is analyzed, and the words in a log are divided into two types: template words and parameter words. Then, log template extraction is studied from three different angles. Finally, actual production data from an Internet company is used, and the Rand index (Rand_index) is applied to evaluate the accuracy and validity of the three extraction methods. The results show that, on four different types of messages collected from the service cluster, the log templates extracted with the tag recognition tree model have an average accuracy of 99.57%, higher than that of the other extraction methods.

Key words: Cut words, Extract template, Statistical clustering, Signature tree, Online clustering
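
The two ideas named in the abstract, splitting log words into template words and parameter words and scoring the resulting grouping with the Rand index, can be illustrated with a minimal sketch. This is not one of the paper's three extraction methods: the position-frequency heuristic, the function names `extract_templates` and `rand_index`, the `min_ratio` threshold, and the sample log lines are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def extract_templates(logs, min_ratio=0.8):
    """Replace likely parameter words with "*" (assumed heuristic).

    A word is treated as a template word when it appears at the same
    position in at least `min_ratio` of the logs that have the same
    word count; otherwise it is treated as a parameter word.
    """
    tokenized = [line.split() for line in logs]
    group_sizes = Counter(len(words) for words in tokenized)

    # Count how often each word occurs at each (length, position) slot.
    slot_counts = {}
    for words in tokenized:
        for pos, word in enumerate(words):
            slot_counts.setdefault((len(words), pos), Counter())[word] += 1

    templates = []
    for words in tokenized:
        size = group_sizes[len(words)]
        masked = [
            word if slot_counts[(len(words), pos)][word] / size >= min_ratio else "*"
            for pos, word in enumerate(words)
        ]
        templates.append(" ".join(masked))
    return templates


def rand_index(predicted, truth):
    """Fraction of item pairs on which two labelings agree.

    A pair agrees when both labelings put the two items in the same
    group, or both put them in different groups.
    """
    pairs = list(combinations(range(len(truth)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (predicted[i] == predicted[j]) == (truth[i] == truth[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical log lines and hand-assigned ground-truth template ids.
sample_logs = [
    "Interface eth0 changed state to up",
    "Interface eth1 changed state to down",
    "Interface eth2 changed state to up",
    "Login failed for user alice",
    "Login failed for user bob",
    "Login failed for user carol",
]
sample_truth = [0, 0, 0, 1, 1, 1]

sample_templates = extract_templates(sample_logs)
sample_score = rand_index(sample_templates, sample_truth)
```

Grouping by word count is a deliberate simplification here; messages of the same type with variable-length parameters would be split apart, which is one reason real extractors use richer structures such as the signature tree listed in the key words.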

