计算机科学 ›› 2015, Vol. 42 ›› Issue (4): 194-198.doi: 10.11896/j.issn.1002-137X.2015.04.039

• 人工智能 • 上一篇    下一篇

高效隐私保护频繁模式挖掘算法研究

程舒通,徐从富,但红卫   

  1. 浙江大学计算机科学与技术学院 杭州310027;杭州广播电视大学信息工程学院 杭州310012,浙江大学计算机科学与技术学院 杭州310027,浙江大学计算机科学与技术学院 杭州310027
  • 出版日期:2018-11-14 发布日期:2018-11-14
  • 基金资助:
    本文受国家自然科学基金(61272303),杭州广播电视大学科研课题(HKYYB-2013-1)资助

Research on Efficient Privacy Preserving Frequent Pattern Mining Algorithm

CHENG Shu-tong, XU Cong-fu and DAN Hong-wei   

  • Online:2018-11-14 Published:2018-11-14

摘要: 阐述了隐私保护数据挖掘的目标,即在获取有效的数据挖掘结果的同时,满足用户对隐私保护的要求。针对个体用户及组织用户的隐私保护,论述了不同的方法,并归纳出隐私保护数据挖掘中所采用的两种主流算法。改进了高效隐私保护关联规则挖掘算法(EMASK)中需要完全的数据库扫描并且进行多次比较操作的弊端,提出了基于粒度计算的高效隐私保护频繁模式挖掘算法(BEMASK)。该算法将关系数据表转换成面向机器的关系模型,数据处理被转换成粒度计算的方式,计算频繁项集变成了计算基本颗粒的交集。特别是数据的垂直Bitmap表示,在保证准确性不降低的情况下,一方面减少了I/O操作的次数,另一方面较大地提高了效率。

关键词: 数据挖掘,隐私保护,频繁模式,知识粒度

Abstract: This paper elaborated the goal of privacy preserving data mining,that is to satisfy the demand of users for privacy protection as we acquire the mining results of effective data mining.For privacy protection to individual users and group users,this paper discussed different methods,and summed up two main algorithms in data mining of privacy preservation.Since efficient mining associations with secrecy konstraints(EMASK) needs full database scan and many comparison operations,the author came up with efficient mining associations with secrecy konstraints which is based on Bitmap computation(BEMASK).It transforms relational data form into relational model for machine,data processing is converted into granular computing method and calculation of frequent item-sets is turned into computing the intersection set of basic particles.Especially the vertical representation of Bitmap,under the condition of ensuring that accuracy is not reduced,on one hand reduces the number of I/O operations,on the other hand,greatly improves the efficiency.

Key words: Data mining,Privacy preserving,Frequent pattern,Knowledge granularity

[1] 王艳,乐嘉锦,孙捷,等.网络用户行为的隐私保护数据挖掘方法[J].计算机工程与应用,2012,48(13):138-143
[2] 马进,李锋,李建华.分布式数据挖掘中基于扰乱的隐私保护方法[J].浙江大学学报:工学版,2010,44(2):276-282
[3] 张鹏,童云海,唐世渭,等.一种有效的隐私保护关联规则挖掘方法[J].软件学报,2006,17(8):1764-1774
[4] 孙茂华.安全多方计算及其应用研究[D].北京:北京邮电大学,2013
[5] Pawlak Z,Grzymala-Busses J,Slowinski R,et al.Rough sets[J].Communications of the ACM,1995,38(11):88-95
[6] Zadeh L A.Some reflections on soft computing,granular computing and their roles in the conception,design and utilization of information/intelligent systems[J].Soft Computing,1998,2(1):23-25
[7] Yao Yi-yu.The Art of Granular Computing[C]∥Proc of the International Conference on Rough Sets and Emerging Intelligent Systems Paradigms,2007.Warsaw,Poland,2007:101-112
[8] Chen Hong-mei,Li Tian-rui,Ruan Da,et al.A rough-set based incremental approach for updating approximations under dynamicmaintenance environments[J].IEEE Transactions on Know-ledge and Data Engineering,2013,25(2):274-284
[9] 王磊,李天瑞.一种基于矩阵的知识粒度计算方法模式[J].模式识别与人工智能,2013,26(5):447-453
[10] 项海飞.基于互信息粒度的相对约简的矩阵计算方法[J].西南师范大学学报:自然科学版,2014,39(3):60-64
[11] Rizvi S J,Haritsa J R.Maintaining Data Privacy in Association Rule Mining[C]∥Proc of the 28th Intl Conf on Very Large DataBases (VLDB),2002.Hong Kong,China,2002:682-693
[12] Agrawal S,Krishnan V,Haritsa J R.On Addressing Efficiency Concerns in Privacy-preserving Mining[C]∥Proc of 9th Intl Conf on Database Systems for Advanced Applications (DASFAA),2004.Jeju Island,Korea,2004:113-124

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!