Computer Science ›› 2020, Vol. 47 ›› Issue (2): 31-36. doi: 10.11896/jsjkx.190500130

• Database & Big Data & Data Science •

  • Corresponding author: LIANG Ji-ye (ljy@sxu.edu.cn)

New Similarity Measure Based on Extremely Rating Behavior

FENG Chen-jiao (1,2), LIANG Ji-ye (1), SONG Peng (3), WANG Zhi-qiang (1)

  1. Key Laboratory of Computation Intelligence & Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006, China
  2. College of Applied Mathematics, Shanxi University of Finance and Economics, Taiyuan 030006, China
  3. School of Economics and Management, Shanxi University, Taiyuan 030006, China
  • Received: 2019-05-23  Online: 2020-02-15  Published: 2020-03-18
  • About author: FENG Chen-jiao, born in 1977, doctoral student, lecturer, is a member of China Computer Federation. Her main research interests include data mining, big data correlation analysis and recommender systems. LIANG Ji-ye, born in 1962, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include granular computing, data mining and machine learning.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61876103), Projects of Key Research and Development Plan of Shanxi Province (201603D111014), Research Project Supported by Shanxi Scholarship Council of China (2017-005) and 1331 Engineering Project of Shanxi Province, China.

Abstract: With the rapid development of Internet technology, the volume of online information has grown explosively, and information overload has become an increasingly serious problem. Faced with massive amounts of online information, users often spend a great deal of time searching for the information or products they need, and the search results are often constrained. Recommender systems were proposed to address information overload. They infer users' needs and interests from their historical behavior and recommend information and products that users may be interested in. As an important class of recommendation approaches, memory-based collaborative filtering methods typically construct the rating prediction function from the neighborhood information of users or products; the core of such methods is to measure the similarity between users or products accurately. Traditional similarity measures, such as the Pearson, cosine, and Spearman rank correlation coefficients, usually consider only the linear relationship between users, whereas heuristic similarities, such as the PIP measure based on three special factors and its improved versions, capture only the nonlinear relationship between users. In fact, in recommender systems, measuring the similarity between users with only a linear function or only a nonlinear function is inaccurate. To describe the similarity between users more finely, this paper proposes an index, based on a nonlinear function, that measures how similar users are in their extreme rating behavior. By integrating this index into the traditional linear correlation coefficients, a new similarity measure that accounts for extreme rating behavior is constructed. To verify the effectiveness of the method, it is compared with traditional and heuristic similarities on the Ml (100k) and Ml-latest-small datasets. The results show that collaborative filtering based on the extreme-rating-behavior similarity performs better in terms of MAE and RMSE.

Key words: Collaborative filtering, Extremely rating behavior, Memory-based collaborative filtering, Recommender systems, Similarity

CLC Number: 

  • TP182
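
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of memory-based (user-based) collaborative filtering with the Pearson correlation coefficient, together with the MAE and RMSE metrics. It does not implement the similarity measure proposed in the paper, whose formula is not given on this page. The function names (`pearson`, `predict`, `mae`, `rmse`), the neighborhood size `k`, and the toy ratings are all assumptions made for illustration.

```python
import math


def pearson(ratings_u, ratings_v):
    """Pearson correlation over co-rated items; returns 0.0 when undefined."""
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def predict(data, user, item, similarity=pearson, k=2):
    """Mean-centred weighted average over the k most similar users who rated `item`.

    `similarity` is pluggable, so another measure could be swapped in here.
    Only neighbours with positive similarity are used (an assumption of this sketch).
    """
    user_mean = sum(data[user].values()) / len(data[user])
    neighbours = []
    for other, ratings in data.items():
        if other == user or item not in ratings:
            continue
        sim = similarity(data[user], ratings)
        if sim > 0:
            neighbours.append((sim, other))
    top = sorted(neighbours, reverse=True)[:k]
    if not top:
        return user_mean  # fall back to the user's own mean rating
    num = 0.0
    for sim, other in top:
        other_mean = sum(data[other].values()) / len(data[other])
        num += sim * (data[other][item] - other_mean)
    den = sum(sim for sim, _ in top)
    return user_mean + num / den


def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - t) for p, t in pairs) / len(pairs)


def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return math.sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))


# Toy data (hypothetical): user -> {item: rating on a 1-5 scale}
ratings = {
    "alice": {"m1": 5, "m2": 1, "m3": 4},
    "bob": {"m1": 5, "m2": 1, "m3": 5, "m4": 5},
    "carol": {"m1": 1, "m2": 5, "m3": 2, "m4": 1},
    "dave": {"m1": 4, "m2": 2, "m3": 4, "m4": 4},
}
print(predict(ratings, "alice", "m4"))
```

Because `predict` takes the similarity function as a parameter, a measure that also accounts for extreme ratings could in principle be compared against the Pearson baseline by passing a different function and evaluating both with `mae` and `rmse` on held-out ratings.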