Computer Science, 2023, Vol. 50, Issue (11): 132-142. doi: 10.11896/jsjkx.230400045

• Database & Big Data & Data Science •

Natural Noise Filtering Algorithm for Point-of-Interest Recommender Systems

ZHU Jun1,2, HAN Lixin2, ZONG Ping2, XU Yiqing1, XIA Ji’an1, TANG Ming1

  1. 1 School of Computer and Software, Nanjing Vocational University of Industry Technology, Nanjing 210023, China
    2 College of Computer and Information Engineering, Hohai University, Nanjing 211100, China
  • Received: 2023-04-07 Revised: 2023-07-04 Online: 2023-11-15 Published: 2023-11-06
  • Corresponding author: ZHU Jun (zj_zijin@163.com)
  • About author: ZHU Jun, born in 1987, Ph.D., associate professor, is a member of China Computer Federation. Her main research interests include machine learning and recommender systems.
  • Supported by:
    National Natural Science Foundation of China (41771251), Natural Science Foundation of the Higher Education Institutions of Jiangsu Province, China (21KJB520009) and Start-up Fund for New Talented Researchers of Nanjing Vocational University of Industry Technology (YK23-05-01).

Abstract: The original datasets of recommender systems (RSs) contain inherent natural noise, which introduces error and interference into recommendation algorithms. Existing studies focus mainly on malicious noise, typified by various security attacks. Natural noise, which is subtler and harder to handle, has rarely been studied, and almost all of the existing work on it targets conventional RSs. In point-of-interest (POI) RSs, however, the data features as well as the causes and forms of natural noise differ considerably from those in conventional RSs. To filter natural noise in POI RSs, a natural noise filtering algorithm based on dispersion quantification and clustering distance analysis (NFDC) is proposed. The dispersion of user check-in data is defined and calculated to quantify the data-driven uncertainty, and the accuracy (F1 score) of the recommendation algorithm is used to quantify the prediction-driven uncertainty. The correlation between the two is analyzed to build an empirical model, from which the proportion of potential natural noise is derived. Fuzzy C-means clustering is then used to analyze the similarity of user behavior patterns, suspicious noisy points are screened on the basis of clustering distance analysis, and a customized verification rule is applied to confirm and delete the true natural noise. Experiments are conducted on two real-world location-based social network datasets, Brightkite and Gowalla. The source data are preprocessed by NFDC and by four other benchmark methods, and each processed dataset is fed into five representative classes of POI recommendation algorithms to compare how much each denoising technique improves recommendation accuracy. Experimental results show that NFDC effectively reduces the natural noise in the source data and provides reliable input for the subsequent recommendation algorithms. Compared with the highest accuracy obtained on datasets denoised by the other methods, the accuracy of the recommendation algorithms on the NFDC-processed Brightkite and Gowalla datasets improves by 15.95% and 5.00% on average, respectively.

Key words: Recommender system, Point-of-interest recommendation, Natural noise, Uncertainty, Dispersion, Clustering
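
The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' NFDC implementation. It assumes each check-in is already encoded as a numeric feature vector, that the noise proportion `noise_ratio` is supplied from outside (the abstract derives it from an empirical model that is not given here), and it omits the paper's verification rule. The function names `fuzzy_c_means` and `screen_suspects` are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Standard fuzzy C-means. Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)            # each row sums to 1
    for _ in range(n_iter):
        W = U ** m                               # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)              # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def screen_suspects(X, centers, U, noise_ratio):
    """Flag the noise_ratio fraction of points farthest from the center
    of their highest-membership cluster (assumed screening criterion)."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    own = dist[np.arange(len(X)), U.argmax(axis=1)]
    k = int(round(noise_ratio * len(X)))
    if k == 0:
        return np.array([], dtype=int)
    return np.argsort(own)[-k:]

# Toy data: two tight behavior clusters plus one point between them.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(10.0, 0.3, (20, 2)),
               [[5.0, 5.0]]])
centers, U = fuzzy_c_means(X, c=2)
print(screen_suspects(X, centers, U, noise_ratio=1 / 41))
```

Flagged indices are only candidates. In the pipeline the abstract describes, a separate rule decides which of them are deleted as true natural noise.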

CLC Number: 

  • TP181