Computer Science, 2020, Vol. 47, Issue (12): 319-326. doi: 10.11896/jsjkx.191000193


Message Format Inference Method Based on Rough Set Clustering

LI Yi-hao, HONG Zheng, LIN Pei-hong, FENG Wen-bo   

  1. Army Engineering University of PLA, Nanjing 210000, China
  • Received: 2019-10-29 Revised: 2020-04-12 Published: 2020-12-17
  • About author: LI Yi-hao, born in 1996, postgraduate. His main research interests include cyberspace security and protocol reverse engineering.
    HONG Zheng, born in 1979, Ph.D, associate professor. His main research interests include cyberspace security and protocol reverse engineering.
  • Supported by:
    National Key R&D Program of China (2017YFB0802900).

Abstract: Message clustering is an important step in message format inference. Most existing message clustering methods use global message similarity as the clustering criterion. However, the accuracy of such methods is often not high enough, which degrades the accuracy of subsequent message format extraction. To solve this problem, this paper proposes a message format inference method based on rough set clustering, which consists of a preprocessing phase, a rough-set-based clustering phase, a feature word extraction phase, and a message format extraction phase. Firstly, messages are separated into business messages and control messages. Secondly, messages are clustered on the basis of position attributes according to rough set theory; the clustering method considers local features of message sequences, which ensures high accuracy of message clustering. Thirdly, protocol feature words are extracted according to length, frequency, and position characteristics. Finally, protocol feature words are classified into mandatory fields and optional fields, and they are used to represent message formats. Experimental results show that the proposed method can extract message formats precisely.
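
The idea of clustering on position attributes can be illustrated with the basic rough set notion of an indiscernibility relation: two messages fall into the same equivalence class when they carry identical values at every selected position. The following is only a minimal sketch of that notion, not the algorithm proposed in the paper; the function name `indiscernibility_classes`, the chosen positions, and the sample messages are all assumptions made for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(messages, positions):
    """Partition messages into equivalence classes.

    Two messages are treated as indiscernible when they hold the same
    byte value at every offset listed in `positions` (hypothetical
    attribute set chosen for this illustration).
    """
    classes = defaultdict(list)
    for msg in messages:
        # Offsets past the end of a short message are recorded as None,
        # so short and long messages never collide by accident.
        key = tuple(msg[p] if p < len(msg) else None for p in positions)
        classes[key].append(msg)
    # Dicts preserve insertion order, so classes appear in first-seen order.
    return list(classes.values())


# Made-up sample messages, grouped on the first three byte offsets.
messages = [
    b"GET /a HTTP/1.1",
    b"GET /b HTTP/1.1",
    b"POST /c HTTP/1.1",
    b"POST /d HTTP/1.1",
]
clusters = indiscernibility_classes(messages, positions=[0, 1, 2])
```

Here the grouping depends only on a few local offsets rather than on the similarity of whole messages, which is the contrast with global similarity that the abstract draws.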

Key words: Feature word extraction, Message clustering, Message format inference, Protocol reverse engineering, Rough set theory

CLC Number: 

  • TP398.08