一种基于域知识的协议状态机主动推断算法

计算机科学 ›› 2015, Vol. 42 ›› Issue (12): 233-239.

一种基于域知识的协议状态机主动推断算法

王辰,吴礼发,洪征,郑成辉,庄洪林

解放军理工大学指挥信息系统学院南京210007,解放军理工大学指挥信息系统学院南京210007,解放军理工大学指挥信息系统学院南京210007,解放军理工大学指挥信息系统学院南京210007,解放军理工大学指挥信息系统学院南京210007

出版日期:2018-11-14 发布日期:2018-11-14
基金资助:
本文受国家自然科学基金项目(611032253),江苏省自然科学基金项目(BK2011115)资助

Domain-specific Algorithm of Protocol State Machine Active Inference

WANG Chen, WU Li-fa, HONG Zheng, ZHENG Cheng-hui and ZHUANG Hong-lin

Online:2018-11-14 Published:2018-11-14

摘要/Abstract

摘要： 现有基于L*算法的协议状态机主动推断方法忽略了协议特有的域知识,将协议报文抽象为相互独立、无意义的符号,并完全随机地生成测试样本进行状态机等价判定,导致产生大量的无效询问和测试样本,在真实网络环境下推断效率较低。在L⁺M算法的基础上提出了一种基于域知识的协议状态机主动推断算法L⁺N,其改进主要体现在:依据会话样本集提取各报文之间的强顺序约束关系来过滤无效的输出询问,构建会话样本集对应的扩展前缀树接受器(Extended Prefix Tree Accepter,EPTA)对输出询问进行预响应,提出了一种基于正例样本变异的等价询问近似判定算法以提升寻找反例的效率。实验结果表明,L⁺N算法能够大幅提高推断效率,并且具有与L⁺M算法相同的推断准确度。

关键词: L*算法,协议状态机,主动推断,域知识,推断效率

Abstract: Existing protocol state machine inference approaches based on algorithm L* are inefficient owing to ignorance of protocol-specific knowledge.As the protocol messages are abstracted as the independent and insignificant symbols,and test samples are completely generated randomly in equivalence query,invalid queries and test samples are inevi-table.A protocol state machine active inference algorithm named L⁺N was proposed,which improves the algorithm L⁺M in three aspects.Firstly,L⁺N filters the invalid output query according to the constraint on strict order,which is extracted from conservation samples.Secondly,L⁺N constructs the extended prefix tree accepter(EPTA) corresponding to the sample set and answers the output query in advance.Thirdly,a new proposed strategy to find counterexamples more effectively is applied to judge the equivalence query based on positive sample mutation.Experimental results show that L⁺N improves the inference efficiency greatly and achieves the same accuracy as algorithm L⁺M.

Key words: Algorithm L*,Protocol state machine,Active inference,Domain-specific knowledge,Inference efficiency

王辰,吴礼发,洪征,郑成辉,庄洪林. 一种基于域知识的协议状态机主动推断算法[J]. 计算机科学, 2015, 42(12): 233-239. https://doi.org/

WANG Chen, WU Li-fa, HONG Zheng, ZHENG Cheng-hui and ZHUANG Hong-lin. Domain-specific Algorithm of Protocol State Machine Active Inference[J]. Computer Science, 2015, 42(12): 233-239. https://doi.org/

参考文献

[1] 李伟明,张爱芳,刘建财,等.网络协议的自动化模糊测试漏洞挖掘方法[J].计算机学报,2011,34(2):242-255 Li W M,Zhang A F,Liu J C,et al.An Automatic Network Protocol Fuzz Testing and Vulnerability Discovering Method [J].Chinese Journal of Computers,2011,34(2):242-255
[2] 侯莹,洪征,潘璠,等.基于模型的Fuzzing测试脚本自动化生成[J].计算机科学,2013,40(3):206-209 Hou Y,Hong Z,Pan F,et al.Model Based Automatic Fuzzing Script Generation [J].Computer Science,2013,38(3):206-209
[3] 应凌云,杨轶,冯登国,等.恶意软件网络协议的语法和行为语义分析方法[J].软件学报,2011,22(7):1676-1689 Ying L Y,Yang Y,Feng D G,el al.Syntax and Behavior Semantics Analysis of Network Protocol of Malware[J].Journal of Software,2011,22(7):1676-1689
[4] Tridgell.How Samba was written [EB/OL].2005-02-04[2014-11-10].http://samba.org/ftp/tridge/misc/french_cafe.txt
[5] Lang K J.Faster Algorithms for Finding Minimal ConsistentDFAs[R].NEC Research Institute,1999
[6] Oncina J,Garcia P.Inferring regular languages in polynomial update time [J].Pattern Recognition and Image Analysis,1992,1:49-61
[7] Lang K J,Pearlmutter B A,Price R A.Results of the Abbadingo one DFA learning competition and a new evidence-driven state merging algorithm[M]∥Grammatical Inference.Springer Berlin Heidelberg,1998:1-12
[8] Gold E M.Language Identification in the Limit [J].Information and Control,1967,10(5):447-474
[9] Dupont P,Lambeau B,Damas C,et al.The QSM algorithm and its application to software behavior model induction [J].Applied Artificial Intelligence,2008,22(1):77-115
[10] Angluin D.Learning regular sets from queries and counterexamples [J].Information and computation,1987,75(2):87-106
[11] Shahbaz M.Reverse Engineering Enhanced State Models ofBlack Box Software Components to Support Integration Testing [D].Laboratoire Informatique de Grenoble,2008
[12] Cho C Y,Shin E C R,Song D.Inference and analysis of formal models of botnet command and control protocols[C]∥Procee-dings of the 17th ACM Conference on Computer and Communications Security.ACM,2010:426-439
[13] Caballero J,Poosankam P,Kreibich C,et al.Dispatcher:Enabling active botnet infiltration using automatic protocol reverse-engineering[C]∥Proceedings of the 16th ACM Conference on Computer and Communications Security.ACM,2009:621-634
[14] Wikipedia Contributors.Mealy machine [EB/OL].2013-11-13[2014-11-10].http://zh.wikipedia.org/wiki/Mealy%E6%9C%BA
[15] 肖明明,余顺争.基于文法推断的协议逆向工程[J].计算机研究与发展,2013,50(10):2044-2058 Xiao M M,Yu S Z.Protocol Reverse Engineering Using Grammatical Inference[J].Journal of Computer Research and Deve-lopment,2013,50(10):2044-2058
[16] 潘璠,洪征,杜有翔,等.基于递归聚类的报文结构提取方法[J].四川大学学报(工学版),2012,4(6):137-142 Pan F,Hong Z,Du Y X,et al.Recursive Clustering Based Method for Message Structure Extraction [J].Journal of Sichuan University (Engineering Science Edition),2012,44(6):137-142
[17] Comparetti P M,Wondracek G,Kruegel C,et al.Prospex:Protocol specification extraction[C]∥2009 30th IEEE Symposium on Security and Privacy.IEEE,2009:110-125
[18] Bossert G,Guihéry F.Security Evaluation of CommunicationProtocols in Common Criteria[R].Paris:ICCC,2012
[19] Postel J.RFC821:Simple Mail Transfer Protocol[S].IETF,1982
[20] Myers J,Rose M.RFC1939:Post Office Protocol-Version3[S].IETF,1996
[21] 潘璠,洪征,周振吉,等.语义层次的协议格式提取方法[J].通信学报,2013,34(10):162-173 Pan F,Hong Z,Zhou Z J,et al.Protocol format extraction at semantic level [J].Journal on Communications,2013,34(10):162-173

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed