Computer Science, 2020, Vol. 47, Issue (12): 332-335. doi: 10.11896/jsjkx.190900116
• Information Security •
陈庆超1, 王韬1, 尹世庄1, 冯文博2
CHEN Qing-chao1, WANG Tao1, YIN Shi-zhuang1, FENG Wen-bo2
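Abstract: Keyword extraction is a key step in reverse engineering unknown network protocols. Existing keyword extraction methods suffer from low precision, heavy reliance on prior knowledge, and cumbersome operation. To address these problems, this paper proposes an automatic keyword extraction algorithm based on position information. First, candidate keywords are obtained by trigram segmentation, tagged with their position information, and organized into a multi-level dictionary. On this basis, the traditional tree-style merging of candidate keywords is replaced by chain merging guided by position information, which yields more accurate longest candidate keywords. Experimental results show that with the frequency threshold set to 0.6, the method accurately extracts the keywords of text protocols. The paper also analyzes how the choice of frequency threshold affects the results and discusses the limitations of related algorithms that mine keywords from frequent sequences.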
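The abstract gives only the outline of the pipeline, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It indexes trigrams by offset in a nested dictionary, keeps those whose share of messages meets the threshold, and links frequent trigrams at adjacent offsets into chains. The function names, the sample messages, and the exact linking rule (adjacent offsets, an overlap of two characters, and shared messages that still meet the threshold) are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_position_index(messages, n=3):
    """Return {offset: {n-gram: set of message ids}} (a multi-level dictionary)."""
    index = defaultdict(lambda: defaultdict(set))
    for msg_id, msg in enumerate(messages):
        for pos in range(len(msg) - n + 1):
            index[pos][msg[pos:pos + n]].add(msg_id)
    return index


def extract_keywords(messages, threshold=0.6, n=3):
    """Chain-merge frequent n-grams at adjacent offsets into longest candidates.

    Assumed linking rule (not taken from the paper): two n-grams link when they
    sit at adjacent offsets, overlap by n - 1 characters, and the messages they
    share still meet the frequency threshold.
    """
    total = len(messages)
    index = build_position_index(messages, n)
    # Keep only n-grams whose share of messages at this offset meets the threshold.
    frequent = {
        pos: {g: ids for g, ids in grams.items() if len(ids) / total >= threshold}
        for pos, grams in index.items()
    }

    def successor(pos, gram, ids):
        """Best frequent n-gram at pos + 1 that overlaps gram by n - 1 characters."""
        best = None
        for nxt, nxt_ids in frequent.get(pos + 1, {}).items():
            shared = ids & nxt_ids
            if nxt[:-1] == gram[1:] and len(shared) / total >= threshold:
                if best is None or len(shared) > len(best[1]):
                    best = (nxt, shared)
        return best

    keywords, consumed = [], set()
    for pos in sorted(frequent):
        for gram, ids in sorted(frequent[pos].items()):
            if (pos, gram) in consumed:
                continue  # already absorbed into a chain that started earlier
            word, cur_pos, cur_gram, cur_ids = gram, pos, gram, ids
            step = successor(cur_pos, cur_gram, cur_ids)
            while step is not None:
                cur_gram, cur_ids = step
                cur_pos += 1
                consumed.add((cur_pos, cur_gram))
                word += cur_gram[-1]  # each link extends the chain by one character
                step = successor(cur_pos, cur_gram, cur_ids)
            keywords.append((pos, word, len(cur_ids) / total))
    return keywords


# Hypothetical sample traffic, invented for illustration only.
messages = ["USER alice", "USER bob", "USER carol", "PASS secret", "USER dave"]
keywords = extract_keywords(messages, threshold=0.6)
print(keywords)
```

On this toy input the sketch reports a single candidate, `USER ` at offset 0, supported by four of the five messages. Raising the threshold above 0.8 removes it, which gives a small-scale view of the threshold sensitivity the abstract mentions.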