Computer Science, 2022, Vol. 49, Issue (12): 178-184. doi: 10.11896/jsjkx.220600024

• Database & Big Data & Data Science •

Variational Recommendation Algorithm Based on Differential Hamming Distance

DONG Jia-wei, SUN Fu-zhen, WU Xiang-shuai, WU Tian-hui, WANG Shao-qing

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000, China
  • Received: 2022-06-02 Revised: 2022-09-02 Published: 2022-12-14
  • Corresponding author: SUN Fu-zhen (sunfuzhen@sdut.edu.cn)
  • About author: (443191260@qq.com) DONG Jia-wei, born in 1998, postgraduate, is a member of China Computer Federation. His main research interests include recommender systems. SUN Fu-zhen, born in 1978, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include computer vision, data mining and data analysis.
  • Supported by:
    National Natural Science Foundation of China (61841602) and Natural Science Foundation of Shandong Province, China (ZR2020MF147).

Abstract: Current hashing-based recommendation algorithms commonly use the Hamming distance to measure the similarity between user and item hash codes, but this ignores the potentially distinguishing information carried by each individual bit. This paper therefore proposes a differential Hamming distance, which assigns bit weights to hash codes by taking the dissimilarity between hash codes into account. A variational recommendation model is designed for the differential Hamming distance. The model consists of a user hashing component and an item hashing component, connected in a variational autoencoder structure. First, the model uses the encoder to generate hash codes for users and items; to improve the robustness of the hash codes, Gaussian noise is added to both the user and item hash codes. Second, the user and item hash codes are optimized through the differential Hamming distance to maximize the model's ability to reconstruct user-item ratings. Experiments on two public datasets show that, at the same computational cost, the proposed model VDHR improves NDCG by 3.9% and MRR by 4.7% compared with the state-of-the-art hashing-based recommendation algorithm.

Key words: Hamming distance, Differential Hamming distance, Bit weights, Recommendation algorithm, Variational autoencoder

CLC Number: 

  • TP391.3
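
To illustrate why per-bit weights matter, the following minimal Python sketch contrasts the plain Hamming distance with a generic weighted Hamming distance. It is not the paper's method: the abstract does not give the formula by which the differential Hamming distance derives its bit weights, so the `weights` vector, the function names, and the example codes below are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def hamming_distance(u: Sequence[int], v: Sequence[int]) -> int:
    """Count the bit positions at which two equal-length hash codes differ."""
    if len(u) != len(v):
        raise ValueError("hash codes must have the same length")
    return sum(1 for a, b in zip(u, v) if a != b)


def weighted_hamming_distance(
    u: Sequence[int], v: Sequence[int], weights: Sequence[float]
) -> float:
    """Sum the weights of the bit positions at which two hash codes differ.

    The weights are supplied by the caller (an assumption of this sketch);
    how the paper computes them is not stated in the abstract.
    """
    if not (len(u) == len(v) == len(weights)):
        raise ValueError("codes and weights must have the same length")
    return sum(w for a, b, w in zip(u, v, weights) if a != b)


# Hypothetical 4-bit codes over {-1, +1}.
user = [1, -1, 1, 1]
item_a = [-1, -1, 1, 1]  # differs from the user code in bit 0 only
item_b = [1, -1, 1, -1]  # differs from the user code in bit 3 only

# Hypothetical bit weights: bit 3 is treated as more informative than bit 0.
weights = [0.5, 1.0, 1.0, 2.0]

plain_a = hamming_distance(user, item_a)
plain_b = hamming_distance(user, item_b)
weighted_a = weighted_hamming_distance(user, item_a, weights)
weighted_b = weighted_hamming_distance(user, item_b, weights)
```

Under the plain Hamming distance both items are equally far from the user (one differing bit each), so they cannot be ranked against each other. With bit weights, the mismatch on the more heavily weighted bit costs more, which breaks the tie.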