Computer Science ›› 2018, Vol. 45 ›› Issue (7): 16-21. doi: 10.11896/j.issn.1002-137X.2018.07.003

• 5th CCF Big Data Academic Conference •

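Collaborative Filtering Recommendation Algorithm Based on Space Transformation

ZHAO Xing-wang (赵兴旺), LIANG Ji-ye (梁吉业), GUO Lan-jie (郭兰杰)

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China;
    Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
  • Received: 2017-07-16 Online: 2018-07-30 Published: 2018-07-30
  • About the authors: ZHAO Xing-wang (born 1984), male, Ph.D. candidate, lecturer, CCF member; main research interests: data mining and machine learning; E-mail: zhaoxw84@163.com. LIANG Ji-ye (born 1962), male, Ph.D., professor, CCF member; main research interests: granular computing, data mining and machine learning; E-mail: ljy@sxu.edu.cn (corresponding author). GUO Lan-jie (born 1991), male, master's degree; main research interest: social recommendation; E-mail: guolanjiesxu@163.com.
  • Funding:
    This work was supported by the National Natural Science Foundation of China (61432011, U1435212, 61603230), the Natural Science Foundation of Shanxi Province (201601D202039), the Science and Technology Innovation Project for Higher Education Institutions of the Shanxi Provincial Department of Education (2016111), and the Graduate Education Innovation Project of Shanxi Province (2018BY007).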

Abstract: In practical applications, traditional collaborative filtering recommendation algorithms often face the problem of computational scalability. To address this problem, this paper proposes a collaborative filtering recommendation algorithm based on space transformation, built within the item-based collaborative filtering framework and incorporating social relationship information. First, users are divided into clusters by applying a community discovery algorithm to the user social network. Second, based on the rating information, the item cluster corresponding to each user cluster is found from the correspondence between users and items. Finally, using the membership of each item in each item cluster, the sparse, high-dimensional rating matrix is transformed into a low-dimensional, dense item membership matrix; item similarities are then computed on this matrix and used for collaborative filtering recommendation. The proposed algorithm was compared with other algorithms on public data sets. The experimental results show that it can significantly improve computational efficiency while maintaining recommendation accuracy.

Key words: Collaborative filtering, Scalability, Social network, Space transformation

CLC number:

  • TP391
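
The three steps outlined in the abstract can be illustrated with a small sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: connected components stand in for the community discovery algorithm, the membership of an item in a cluster is taken to be the share of that item's ratings coming from users of the cluster, and cosine similarity is used on the transformed matrix. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def user_clusters(n_users, friendships):
    """Group users by connected components of the social graph (union-find).

    ASSUMPTION: a simple stand-in for a real community discovery algorithm.
    """
    parent = list(range(n_users))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for a, b in friendships:
        parent[find(a)] = find(b)

    roots = sorted({find(u) for u in range(n_users)})
    index = {root: k for k, root in enumerate(roots)}
    return [index[find(u)] for u in range(n_users)]


def item_membership(ratings, labels, n_items):
    """Map each item to a K-dimensional membership vector (K = number of clusters).

    ASSUMPTION: membership[i][k] is the fraction of item i's ratings that come
    from users in cluster k; the rating value itself is ignored in this sketch.
    """
    k = max(labels) + 1
    counts = [[0] * k for _ in range(n_items)]
    for user, item, _score in ratings:
        counts[item][labels[user]] += 1
    rows = []
    for row in counts:
        total = sum(row)
        rows.append([c / total if total else 0.0 for c in row])
    return rows


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data: 6 users in two friend groups, 4 items.
friendships = [(0, 1), (1, 2), (3, 4), (4, 5)]
ratings = [  # (user, item, score)
    (0, 0, 5), (1, 0, 4), (2, 1, 5), (0, 1, 3),
    (3, 2, 4), (4, 2, 5), (5, 3, 2), (3, 3, 4),
    (2, 2, 1),
]

labels = user_clusters(6, friendships)
membership = item_membership(ratings, labels, n_items=4)
sim_01 = cosine(membership[0], membership[1])  # items rated only by cluster 0
sim_03 = cosine(membership[0], membership[3])  # items rated by disjoint clusters
```

In this sketch each item is represented by a vector whose length equals the number of user clusters rather than the number of users, which is where the claimed saving in similarity computation would come from. A real implementation would replace `user_clusters` with a proper community discovery method and use the membership definition given in the full paper.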