Computer Science, 2019, Vol. 46, Issue (6): 143-147. doi: 10.11896/j.issn.1002-137X.2019.06.021


Topic-based Re-identification for Anonymous Users in Social Network

LV Zhi-quan¹, LI Hao², ZHANG Zong-fu², ZHANG Min²

  1. National Computer Network Emergency Response Technical Team & Coordination Center of China, Beijing 100029, China
  2. Department of TCA, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2019-02-21 Published: 2019-06-24

Abstract: Social networks have become part of people's daily lives and bring convenience to social activities. However, they also pose threats to personal privacy. People often want to keep part of their social activity private, so that relatives, friends, colleagues or other specific groups cannot see it. One common protective measure is to socialize anonymously. Some social networks provide anonymity mechanisms that let users hide private information about certain social activities, thereby separating those activities from the main account. Users can also create alternate accounts with different attributes and friendships to achieve the same goal. This paper proposes a topic-based re-identification method for social network users as an attack on these protection mechanisms. The text published by anonymous users (or alternate accounts) and by non-anonymous users (main accounts) is analyzed with a topic model, and a time factor and a text-length factor are introduced when constructing user profiles to improve the accuracy of the method. The similarity between anonymous and non-anonymous user profiles is then computed to match their identities. Experiments on a real social network dataset show that the proposed method effectively improves the accuracy of user re-identification in social networks.
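The pipeline outlined in the abstract (per-post topic distributions, weighted by time and text length into a user profile, then matched by profile similarity) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not give the weighting formulas or the similarity measure, so the exponential time decay, the logarithmic length weight, the cosine similarity, and the names `build_profile`, `cosine` and `match` are hypothetical. The topic distributions are assumed to come from a separately trained topic model.

```python
import math


def build_profile(posts, now, half_life_days=30.0):
    """Aggregate per-post topic distributions into one normalized profile.

    posts: list of (topic_dist, timestamp_in_days, text_length) tuples.
    The time decay and length weight below are illustrative assumptions,
    not the factors defined in the paper.
    """
    num_topics = len(posts[0][0])
    profile = [0.0] * num_topics
    for topic_dist, timestamp, length in posts:
        # Assumed time factor: recent posts count more (exponential decay).
        time_weight = 0.5 ** ((now - timestamp) / half_life_days)
        # Assumed length factor: longer texts count more, with damping.
        length_weight = math.log(1 + length)
        weight = time_weight * length_weight
        for k, p in enumerate(topic_dist):
            profile[k] += weight * p
    total = sum(profile)
    return [v / total for v in profile] if total > 0 else profile


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match(anon_profiles, known_profiles):
    """Map each anonymous account to its most similar known account."""
    return {
        anon_id: max(known_profiles,
                     key=lambda kid: cosine(anon_vec, known_profiles[kid]))
        for anon_id, anon_vec in anon_profiles.items()
    }


# Toy data: three topics, timestamps in days, lengths in characters.
known = {
    "alice": build_profile([([0.8, 0.1, 0.1], 90, 120),
                            ([0.7, 0.2, 0.1], 95, 80)], now=100),
    "bob": build_profile([([0.1, 0.1, 0.8], 92, 60)], now=100),
}
anon = {"anon_1": build_profile([([0.75, 0.15, 0.10], 99, 40)], now=100)}
result = match(anon, known)
```

In this toy data the anonymous account posts mostly about the first topic, so `result` links `anon_1` to `alice`. A real evaluation would rank all candidate main accounts rather than return only the top one.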

Key words: Big data, Social networks, Privacy protection, Anonymity, Re-identification

CLC Number: TP309