Computer Science, 2023, Vol. 50, Issue (7): 261-269. doi: 10.11896/jsjkx.220700076
李育强1, 李林峰2, 朱浩1, 侯孟书1
LI Yuqiang1, LI Linfeng2, ZHU Hao1, HOU Mengshu1
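Abstract: Because IPv6 has a vast address space, scanning the global IPv6 address space is infeasible at current network speeds and hardware computing capacity. Address generation algorithms make fast IPv6 scanning possible by predicting the IPv6 addresses likely to appear in a network and then using the predicted addresses as scan targets. This paper analyzes the structure and allocation methods of IPv6 addresses to explore potential allocation patterns and, building on existing traditional language models and target generation algorithms, proposes 6LMNS, a deep learning based algorithm for predicting potentially active IPv6 addresses. 6LMNS first uses the address vector space mapping model Add2vec to build an IPv6 address word vector space that carries some semantic relationships. It then builds a Transformer-based language model, GPT-IPv6, to estimate the probability distribution of IPv6 address word vector sequences. Finally, it introduces nucleus sampling in place of traditional greedy search decoding to generate active addresses. Verification shows that, compared with other language models and target generation algorithms, the addresses generated by 6LMNS have better diversity and a higher active rate.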
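The decoding step named in the abstract, nucleus (top-p) sampling in place of greedy search, can be illustrated with a minimal sketch. This is not the 6LMNS implementation: the function names, the threshold `top_p=0.85`, and the next-nibble distribution `probs` are all hypothetical and chosen only for illustration. The sketch keeps the smallest set of highest-probability tokens whose cumulative mass reaches `top_p`, then samples within that set, whereas greedy decoding always returns the single most probable token.

```python
import random

def nucleus(probs, top_p):
    """Return the indices of the smallest set of highest-probability
    tokens whose cumulative probability reaches top_p."""
    # Rank token indices from most to least probable.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

def nucleus_sample(probs, top_p=0.85, rng=None):
    """Draw one token index from the nucleus, weighted by probability."""
    rng = rng or random.Random()
    kept = nucleus(probs, top_p)
    # random.choices renormalises the weights within the nucleus.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]

def greedy(probs):
    """Greedy decoding: always pick the single most probable token."""
    return max(range(len(probs)), key=lambda i: probs[i])

# Hypothetical distribution over the 16 hex digits of one address nibble
# (illustrative numbers, not taken from the paper).
HEX = "0123456789abcdef"
probs = [0.40, 0.25, 0.15, 0.08] + [0.01] * 12

rng = random.Random(0)
samples = [nucleus_sample(probs, top_p=0.85, rng=rng) for _ in range(200)]
print("greedy :", HEX[greedy(probs)])
print("nucleus:", sorted({HEX[i] for i in samples}))
```

Greedy decoding produces the same nibble on every call, while nucleus sampling varies its output across the high-probability candidates and never draws from the low-probability tail, which is the kind of trade-off between diversity and plausibility that the abstract attributes to this decoding choice.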