Computer Science, 2019, Vol. 46, Issue (9): 93-98. doi: 10.11896/j.issn.1002-137X.2019.09.012

• 35th China Database Academic Conference •

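Identification of Same User in Social Networks

ZHANG Zheng, WANG Hong-zhi, DING Xiao-ou, LI Jian-zhong, GAO Hong

  1. (Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
  • Received: 2018-07-10 Online: 2019-09-15 Published: 2019-09-02
  • About the authors: ZHANG Zheng (born 1997), male; his main research interest is social data mining. WANG Hong-zhi (born 1978), male, Ph.D., professor, CCF member; his main research areas are databases and big data; E-mail: wangzh@hit.edu.cn. DING Xiao-ou (born 1993), female, Ph.D. candidate, CCF student member; her main research interests include data quality management and data cleaning. LI Jian-zhong (born 1950), male, Ph.D., professor, CCF member; his main research interests include database system implementation techniques and data warehouses. GAO Hong (born 1966), female, Ph.D., professor, CCF member; her main research interests include complex structured data management and wireless sensor networks.
  • Supported by:
    Key Program of the National Natural Science Foundation of China (U1509216), National Key Research and Development Program of China (2016YFB1000703), and General Program of the National Natural Science Foundation of China (61472099, 61602129)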

Abstract: This paper studies the identification of the same user across different social networks. Each social network is modeled as an ego-network, that is, a network whose nodes carry attribute values and which contains one central node, and algorithms are designed for the identity identification problem on this model. To find the node pairs that belong to the same user, the similarity of user attributes and the similarity of friend relationships are modeled and combined into a user match score, which measures the overall similarity between nodes in different social networks and serves as the priority for node matching. An improved RCM algorithm then produces a globally optimal matching, and matched user pairs with low user match scores are finally pruned to improve the result. On real datasets, the proposed algorithm is compared experimentally with several related algorithms, the influence of different parameters on the results is analyzed, and the rationality of the proposed algorithm is verified.
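The pipeline outlined in the abstract (score candidate pairs, match by priority, prune low scores) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy ego-networks, the two similarity functions, the weight `alpha`, the pruning `threshold`, and above all the matching step, which is a plain greedy one-to-one assignment and not the paper's improved RCM algorithm, so it does not guarantee a globally optimal matching.

```python
from itertools import product


def attribute_similarity(a, b):
    """Fraction of shared attribute keys whose values agree (0.0 if none shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    return sum(a[k] == b[k] for k in shared) / len(shared)


def friend_similarity(friends_u, friends_v, known):
    """Jaccard overlap between u's already-matched friends (mapped into
    network 2 via `known`) and v's friends."""
    mapped = {known[f] for f in friends_u if f in known}
    union = mapped | set(friends_v)
    if not union:
        return 0.0
    return len(mapped & set(friends_v)) / len(union)


def match_users(net1, net2, known, alpha=0.5, threshold=0.5):
    """Greedy one-to-one matching by descending match score, then pruning.

    Hypothetical stand-in for the paper's improved RCM step.
    """
    scored = []
    for u, v in product(net1, net2):
        if u in known or v in known.values():
            continue  # skip nodes that are already matched (e.g. the ego)
        score = (alpha * attribute_similarity(net1[u]["attrs"], net2[v]["attrs"])
                 + (1 - alpha) * friend_similarity(
                     net1[u]["friends"], net2[v]["friends"], known))
        scored.append((score, u, v))
    # The match score acts as the matching priority: best pairs first.
    scored.sort(key=lambda t: t[0], reverse=True)

    matched, used1, used2 = {}, set(), set()
    for score, u, v in scored:
        if u not in used1 and v not in used2:
            matched[u] = (v, score)
            used1.add(u)
            used2.add(v)
    # Prune matched pairs whose score falls below the threshold.
    return {u: v for u, (v, s) in matched.items() if s >= threshold}


# Toy ego-networks (invented data); the two central nodes are assumed to be
# the same user and seed the friend-similarity computation.
net1 = {
    "ego1": {"attrs": {"name": "zhang"}, "friends": {"a1", "b1"}},
    "a1": {"attrs": {"name": "alice", "city": "harbin"}, "friends": {"ego1", "b1"}},
    "b1": {"attrs": {"name": "bob", "city": "beijing"}, "friends": {"ego1", "a1"}},
}
net2 = {
    "ego2": {"attrs": {"name": "zhang"}, "friends": {"a2", "b2", "c2"}},
    "a2": {"attrs": {"name": "alice", "city": "harbin"}, "friends": {"ego2"}},
    "b2": {"attrs": {"name": "bob", "city": "shanghai"}, "friends": {"ego2"}},
    "c2": {"attrs": {"name": "carol", "city": "harbin"}, "friends": {"ego2"}},
}
known = {"ego1": "ego2"}

print(match_users(net1, net2, known, threshold=0.8))  # {'a1': 'a2'}
```

With the stricter threshold of 0.8, the pair `b1`/`b2` (score 0.75, because the cities disagree) is pruned after matching; lowering the threshold to 0.5 keeps it. This shows how the pruning step trades recall for precision in the sketch.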

Key words: RCM algorithm, Social networks, User attributes, User identification

CLC Number: 

  • TP311