Computer Science ›› 2023, Vol. 50 ›› Issue (8): 304-313. doi: 10.11896/jsjkx.220900145

• Information Security •

Two-layer IoT Device Classification Recognition Model Based on Traffic and Text Fingerprints

ZHU Boyu1, CHEN Xiao1, SHA Letian1,2, XIAO Fu1,2

  1 School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2 Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing 210003, China
  • Received: 2022-09-16 Revised: 2022-12-12 Online: 2023-08-15 Published: 2023-08-02
  • Corresponding author: XIAO Fu (xiaof@njupt.edu.cn)
  • About author: (475287366@qq.com) ZHU Boyu, born in 1998, postgraduate. His main research interest is Internet of Things security.
    XIAO Fu, born in 1980, Ph.D., professor, Ph.D. supervisor. His main research interests include Internet of Things, intelligent sensing and mobile computing.
  • Supported by:
    Key Program of the National Natural Science Foundation of China (61932013).

Abstract: To isolate vulnerable or abnormal IoT devices in a local area network in time, network administrators need an efficient device classification and identification capability. The features selected by existing methods are only weakly correlated with the devices, and differences in device state lead to imbalanced sample data. To address these problems, this paper proposes FT-DRF (Flow Text-Double Random Forest), an IoT device classification and identification model based on traffic and text fingerprints. First, a feature mining model is designed that selects stable flow statistics as the device traffic fingerprint. Second, a device text fingerprint is generated from sensitive text information in the header fields of application-layer protocols such as HTTP, DNS, and DHCP. On this basis, the data is preprocessed and feature vectors are generated. Finally, a machine learning algorithm based on a double-layer random forest is designed to classify and identify the devices. Supervised classification and identification experiments are conducted on a dataset from a simulated smart home environment composed of 13 IoT devices and on a public dataset. The results show that the FT-DRF model can identify IoT devices such as network cameras and smart speakers with an average accuracy of up to 99.81%, which is 2%~5% higher than that of existing typical methods.
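As a rough illustration of the fingerprinting steps described in the abstract, the sketch below concatenates a few flow statistics with a bag-of-words count vector built from header text. It is not the authors' implementation: the chosen statistics (packet count, mean and standard deviation of packet size, mean inter-arrival time), the tokenizer, and all function names are assumptions made for this example, and the abstract does not specify which features FT-DRF actually uses.

```python
import re
import statistics
from collections import Counter


def flow_fingerprint(packets):
    """Summarize one flow given as a list of (timestamp_seconds, size_bytes).

    The four statistics below are illustrative assumptions, not the
    feature set selected in the paper.
    """
    if not packets:
        return [0.0, 0.0, 0.0, 0.0]
    sizes = [size for _, size in packets]
    times = sorted(ts for ts, _ in packets)
    # Inter-arrival gaps; a single-packet flow gets one zero gap.
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    return [
        float(len(sizes)),
        statistics.fmean(sizes),
        statistics.pstdev(sizes),
        statistics.fmean(gaps),
    ]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(header_texts):
    """Collect the sorted set of tokens seen in the training header texts."""
    return sorted({tok for text in header_texts for tok in tokenize(text)})


def text_fingerprint(header_text, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(tokenize(header_text))
    return [float(counts[tok]) for tok in vocabulary]


def feature_vector(packets, header_text, vocabulary):
    """Concatenate the traffic fingerprint and the text fingerprint."""
    return flow_fingerprint(packets) + text_fingerprint(header_text, vocabulary)


if __name__ == "__main__":
    # Hypothetical header strings, e.g. an HTTP User-Agent and a DHCP hostname.
    headers = ["User-Agent: ExampleCam/1.0", "hostname=example-speaker"]
    vocab = build_vocabulary(headers)
    flow = [(0.0, 100), (1.0, 300), (3.0, 200)]
    print(feature_vector(flow, headers[0], vocab))
```

Vectors of this shape could then be passed to any classifier. How the two random forest layers of FT-DRF consume them is not stated in the abstract, so that stage is left out of the sketch.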

Key words: Internet of Things, Device recognition, Machine learning, Traffic classification, Sensitive text

CLC Number: TP393