计算机科学 ›› 2017, Vol. 44 ›› Issue (5): 120-124.doi: 10.11896/j.issn.1002-137X.2017.05.022

• 信息安全 • 上一篇    下一篇

基于分类树的动态集值型数据发布的隐私保护

石秀金,胡艳玲   

  1. 东华大学计算机科学与技术学院 上海201620,东华大学计算机科学与技术学院 上海201620
  • 出版日期:2018-11-13 发布日期:2018-11-13

Privacy Preserving Based on Taxonomy Tree for Dynamic Set-valued Data Publishing

SHI Xiu-jin and HU Yan-ling   

  • Online:2018-11-13 Published:2018-11-13

摘要: 基于分类树的差分隐私保护方法有效地对静态集值型数据进行了保护,但对于动态集值型数据却没有相应的保护方法,因此提出一种基于分类树的差分隐私保护下的动态集值型数据发布的算法。该算法首先根据数据集中项的全集构造关系矩阵,挑选关系最紧密的项集构造分类树;然后设定一个边界值来限制数据的增量更新,并将新增的记录添加到分类树的根节点中,按照初始分类树的分配法迭代分配每个记录;最后根据拉普拉斯机制向叶子节点中加入噪音,保证整个算法满足差分隐私的要求。相对已有算法,所提算法优化了分类树,使所发布数据建立的分类树模型有少量的叶子节点产生,减少了噪音的添加。实验用两组真实的数据集验证了所提算法的有效性和相对于其他算法的优越性。

关键词: 隐私保护,分类树,动态集值型数据,增量更新

Abstract: Differential privacy protection method based on taxonomy tree is effective for static set-valued of data protection,but it doesn’t have corresponding protection method for dynamic set-valued data.So,this paper presented a classification tree based analysis of differential protection of privacy under dynamic set-valued data released.Firstly,according to the complete set structure relation matrix of the data set,this algorithm chooses the most closely related item set,and constructs the taxonomy tree.And then a boundary value is set to limit the incremental data update,and the new record is added to the root node of the taxonomy tree,in accordance with the initial taxonomy tree distribution method iteratively assigns each record.Finally,according to the Laplace mechanism,noise is added into leaf node to ensure that the algorithm satisfies differential privacy requirements.Compared with the existing algorithms,this algorithm optimizes the taxonomy tree,so that the release of data taxonomy tree model is established with small leaf nodes,reducing the noise added.Experiment with two real datasets shows that the algorithm is effective and performs better than existing algorithms.

Key words: Privacy protection,Taxonomy tree,Dynamic set-valued data,Incremental update

[1] XIONG P,ZHU T Q,WANG X F.A Survey on Differential Privacy and Application [J].Journal of Computers,2014,37(1):101-122.(in Chinese) 熊平,朱天清,王晓峰.差分隐私保护及其应用[J].计算机学报,2014,37(1):101-122.
[2] ZHANG X J,MENG X F.Differential Privacy in Data Publication and Analysis [J].Journal of Computers,2014,37(4):101-122.(in Chinese) 张啸剑,孟晓峰.面向数据发布和分析的差分隐私保护[J].计算机学报,2014,37(4):101-122.
[3] LI Y,HAO Z F,WEN W,et al.Research on Differential Privacy Preserving K-means Clustering[J].Journal of Computers,2013,40(3):287-290.(in Chinese) 李杨,郝志峰,温雯,等.差分隐私保护k-means聚类方法研[J].计算机科学,2013,40(3):287-290.
[4] ZHOU S G,LI F,TAO Y F,et al.Privacy Preservation in Database Application:A Survey[J].Journal of Computers,2009,32(5):847-861.(in Chinese) 周水庚,李丰,陶宇飞,等.面向数据库应用的隐私保护研宄综述[J].计算机学报,2009,32(5):847-861.
[5] TERROVITIS M,MAMOULIS N,KALNIS P.Privacy-preserving anonymization of set-valued data[J].Proceedings of the Vldb Endowment,2008,1(1):115-125.
[6] MAO Y Q.A Study of Efficiently Privacy Preserving Data Publishing of Set-valued Data[D].Hanzhou:Zhejiang University,2011.(in Chinese) 毛云青.高效的集值属性数据隐私保护发布技术研究[D].杭州:浙江大学,2011.
[7] CHEN R,ACS G,CASTELLUCCIA C.Differentially PrivateSequential Data Publication via Variable-Length N-Grams[C]∥ACM Conference on Computer and Communications Security (CCS).2012:638-649.
[8] CHEN R,MOHAMMED N,FUNG B C M,et al.Publishing Set-Valued Data via Differential Privacy[J].Proceedings VLDB Endowment,2011,4(11):1087-1098.
[9] SWEENEY L.Achieving k-anonymity privacy protection usinggeneralization and suppression[J].International Journal on Uncertaint,Fuzziness and Knowledge-based Systems,2002,10(5):571-588.
[10] LU W T,MIKLAU G,GUPTA V.Generating private Synthetic Databases for Untrusted System Evaluation[C]∥Proceedings of the 30th International Conference on Data Engineerin(ICDE).Washington,USA,2014:654-663.
[11] DWORK C.Differential privacy in new settings[C]∥Proc.Symposium on Discrete Algorithms (SODA),Society for Industrial and Applied Mathematics.2010:174-183.
[12] DWORK C.A firm foundation for private data analysis[J].Communications of the ACM,2011,54(1):86-95.
[13] XIAO X K,WANG G Z,GCHRKE J.Differential privacy via wavelet transforms[J].IEEE Transactions on Knowledge and Data Engincering,2011,23(8):1200-1214.
[14] NERGIZ M E,CLIFTON C,NERGIZ A E.Multirelational k-anonymity[J].IEEE Trans on Knowledge and Data Enginee-ring,2009,1(8):1104-1117.
[15] Releasing Differentially Private DataCubes for Health Information[C]∥Proceedings of the 28th International Conference on Data Engineerin(ICDE).Washington,USA,2012:1305-1308.
[16] HECKERMAN D.MSNBC [EB/OL].(1999-09-28).http://ar-chive.ics.uci.edu/ml/datasets.
[17] Frequent itemset mining dataset repository.http://fimi.ua.ac.be/data.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!