Computer Science, 2023, Vol. 50, Issue (7): 317-324. doi: 10.11896/jsjkx.220600068

• Information Security •

Browser Fingerprint Recognition Based on Improved Self-paced Ensemble Algorithm

ZHANG Desheng1, CHEN Bo2, ZHANG Jianhui2, BU Youjun2, SUN Chongxin2, SUN Jia1   

  1 School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000, China
  2 Information Technology Institute, PLA Strategic Support Force Information Engineering University, Zhengzhou 450000, China
  • Received:2022-06-07 Revised:2022-10-14 Online:2023-07-15 Published:2023-07-05
  • About author: ZHANG Desheng, born in 1997, postgraduate. His main research interests include cyberspace security, among others. ZHANG Jianhui, born in 1977, Ph.D, associate researcher, master supervisor. His main research interests include new network architecture, network routing technology, network data analysis and security control.
  • Supported by: National Natural Science Foundation of China (62176264).

Abstract: Browser fingerprinting is used by many websites for user tracking, advertisement delivery, and security verification because it is stateless and consistent across domains, among other advantages. Browser fingerprint recognition is a typical classification task on imbalanced data. In long-term fingerprint tracking, this imbalance lowers recognition accuracy and can cause tracking to fail. This paper proposes an improved Self-paced Ensemble (ISPE) method for browser fingerprint recognition, which improves both the undersampling of browser fingerprint samples and the training of the individual classifiers in the ensemble. To handle fingerprints that are difficult to identify, an attention-like mechanism is added and the self-paced factor is optimized so that, during training, the classifier pays more attention to boundary samples that are hard to classify, which improves the overall recognition accuracy. Experimental results show that the ISPE algorithm reaches an F1-score of 95.6% for browser fingerprint recognition, 16.8% higher than that of the Bi-RNN algorithm, indicating that the method performs well for long-term browser fingerprint tracking.
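The hardness-aware undersampling that the abstract builds on can be sketched as follows. This is a minimal illustration of the sampling step in the original Self-paced Ensemble of Liu et al. (ICDE 2020), not the improved ISPE variant proposed here: the attention-like mechanism and the optimized self-paced factor are not reproduced. The function names, the equal-width binning, the small epsilon, and the toy hardness values are all assumptions made for the example.

```python
import math
import random


def self_paced_bin_quotas(hardness, n_minority, iteration, n_iterations, k_bins=5):
    """Group majority samples into hardness bins and decide how many to draw per bin.

    hardness     : per-sample hardness of the majority class (e.g. absolute error
                   of the current ensemble), one float per sample
    n_minority   : size of the minority class (target size of the undersample)
    iteration    : current iteration i, 0 <= i <= n_iterations
    n_iterations : total number of base classifiers n
    """
    # Self-paced factor alpha = tan(i * pi / (2n)): 0 at the start, very large at the end.
    alpha = math.tan(iteration * math.pi / (2 * n_iterations))

    # Equal-width bins over the observed hardness range (assumed binning scheme).
    lo, hi = min(hardness), max(hardness)
    width = (hi - lo) / k_bins or 1.0
    bins = [[] for _ in range(k_bins)]
    for idx, h in enumerate(hardness):
        bins[min(int((h - lo) / width), k_bins - 1)].append(idx)

    # Unnormalized bin weight 1 / (mean hardness + alpha); empty bins get weight 0.
    weights = []
    for members in bins:
        if members:
            mean_h = sum(hardness[i] for i in members) / len(members)
            weights.append(1.0 / (mean_h + alpha + 1e-12))  # epsilon avoids division by zero
        else:
            weights.append(0.0)

    # Quotas are rounded and capped by bin size, so they may not sum exactly to n_minority.
    total = sum(weights)
    quotas = [min(len(m), round(n_minority * w / total)) for m, w in zip(bins, weights)]
    return bins, quotas


def undersample_majority(hardness, n_minority, iteration, n_iterations, k_bins=5, seed=0):
    """Return sorted indices of the majority samples selected for this iteration."""
    rng = random.Random(seed)
    bins, quotas = self_paced_bin_quotas(hardness, n_minority, iteration, n_iterations, k_bins)
    chosen = []
    for members, quota in zip(bins, quotas):
        chosen.extend(rng.sample(members, quota))
    return sorted(chosen)


if __name__ == "__main__":
    # Toy majority class: 90 easy samples and 10 hard (boundary) samples.
    toy_hardness = [0.1] * 90 + [0.9] * 10
    for i in (0, 5, 10):
        _, quotas = self_paced_bin_quotas(toy_hardness, 10, i, 10)
        print(i, quotas)
```

With a small self-paced factor the quotas follow the inverse of the mean bin hardness, so the many easy samples dominate the draw. As the factor grows, the bin weights converge and the sparse hard bin receives a share comparable to the easy bin, which is the shift of attention toward boundary samples described in the abstract.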

Key words: Browser fingerprinting, User tracking, Self-paced Ensemble, Undersampling, Ensemble learning

CLC Number: TP393