Computer Science, 2020, Vol. 47, Issue (12): 332-335. doi: 10.11896/jsjkx.190900116

Chain Merging Method for Unknown Text Protocol Candidate Keyword Stored in Multi-level Dictionary

CHEN Qing-chao1, WANG Tao1, YIN Shi-zhuang1, FENG Wen-bo2   

  1 Equipment Simulation Training Center, Army Engineering University, Shijiazhuang 050003, China
  2 College of Command and Control Engineering, Army Engineering University, Nanjing 210007, China
  • Received: 2019-09-17  Revised: 2019-11-21  Published: 2020-12-17
  • About author: CHEN Qing-chao, born in 1996, postgraduate. His main research interests include cyber security.
    WANG Tao, born in 1964, Ph.D, professor. His main research interests include cyber security and cryptography.
  • Supported by:
    National Basic Research Program of China (2017YFB0802900) and Natural Science Foundation of Jiangsu Province, China (BK20161469).

Abstract: Keyword extraction is a key step in the reverse engineering of unknown network protocols. Existing keyword extraction methods suffer from low accuracy, complex operation, and a heavy reliance on prior knowledge. Therefore, an automatic keyword extraction algorithm based on location information is proposed. First, candidate keywords are obtained by Trigram word segmentation. After location information is added, these keywords are organized into a multi-level dictionary. On this basis, the traditional tree merging of candidate keywords is improved to chain merging according to the location information, so as to obtain more precise, longest candidate keywords. The experimental results show that, when the frequency threshold is set to 0.6, this method can accurately extract the keywords of text protocols. In addition, the influence of the frequency setting on the experimental results is analyzed, and the limitations of related keyword mining algorithms based on frequent sequences are discussed.
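
The pipeline summarized in the abstract (trigram segmentation, a position-indexed dictionary, frequency filtering, and chain merging) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the nested layout `{offset: {trigram: count}}` used for the "multi-level dictionary", the overlap rule for merging, and the sample messages are all assumptions made for illustration.

```python
from collections import defaultdict


def build_position_dictionary(messages, n=3):
    """Return {offset: {ngram: number of messages with that ngram at that offset}}.

    Assumed layout for the multi-level dictionary: level 1 is the byte offset,
    level 2 is the n-gram found at that offset.
    """
    table = defaultdict(lambda: defaultdict(int))
    for msg in messages:
        for offset in range(len(msg) - n + 1):
            table[offset][msg[offset:offset + n]] += 1
    return table


def frequent_ngrams(table, total, threshold):
    """Keep the n-grams whose share of messages at an offset meets the threshold."""
    return {
        offset: {g for g, count in grams.items() if count / total >= threshold}
        for offset, grams in table.items()
    }


def chain_merge(frequent, n=3):
    """Merge overlapping frequent n-grams at consecutive offsets (assumes n >= 2).

    Two n-grams are chained when the second starts one offset later and its
    first n-1 characters equal the last n-1 characters of the chain so far.
    """
    def extend(offset, text):
        tail = text[-(n - 1):]
        successors = sorted(
            g for g in frequent.get(offset + 1, ()) if g[:-1] == tail
        )
        if not successors:
            return [text]  # chain cannot grow further: longest candidate
        merged = []
        for g in successors:
            merged.extend(extend(offset + 1, text + g[-1]))
        return merged

    keywords = []
    for offset in sorted(frequent):
        previous = frequent.get(offset - 1, ())
        for gram in sorted(frequent[offset]):
            # Skip n-grams that merely continue a chain started earlier.
            if any(p[1:] == gram[:-1] for p in previous):
                continue
            keywords.extend((offset, kw) for kw in extend(offset, gram))
    return keywords


def extract_keywords(messages, threshold=0.6, n=3):
    """Run the whole sketch and return a list of (start_offset, keyword)."""
    table = build_position_dictionary(messages, n)
    return chain_merge(frequent_ngrams(table, len(messages), threshold), n)


# Hypothetical sample traffic, for illustration only.
messages = [
    "GET /index.html HTTP/1.1",
    "GET /a.png HTTP/1.1",
    "GET /b HTTP/1.1",
    "POST /form HTTP/1.1",
    "GET /c/d HTTP/1.1",
]
print(extract_keywords(messages, threshold=0.6))  # [(0, 'GET /')]
```

In this sketch the string `HTTP/1.1` is not reported, because it starts at a different offset in most of the sample messages and so never reaches the 0.6 threshold at any single position. That behavior follows from the position-based assumption made here and may differ from the method evaluated in the paper.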

Key words: Chain, Keyword extraction, Location information, Multi-level dictionary, Trigram, Unknown text protocol

CLC Number: TP393