Research on (k,l)-Diversity Data Publishing

Abstract

To prevent disclosure of individual identities and sensitive attributes while reducing the information loss incurred when data are released, this paper presents a clustering-based algorithm for achieving (k,l)-diversity (CBAD) in data publishing. Clustering takes full account of the mix of discrete and continuous attributes in the data set, and probability distributions are used as the metric for measuring similarity between data objects. We resolve the confusion between information loss and the distance between data objects, point out that clustering-based optimal (k,l)-diversity is an NP-hard problem, propose the concept of a privacy protection degree parameterized by k and l, and analyze the complexity of the algorithm. Theoretical analysis and experimental results show that the method effectively reduces execution time and information loss and improves query precision.
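The abstract does not spell out the formal definition of (k,l)-diversity. As a minimal sketch, assuming the common reading that every published group (cluster) must contain at least k records and at least l distinct sensitive values, a partition could be checked as follows. The function name `satisfies_kl_diversity`, the field names, and the toy records are hypothetical illustrations and are not taken from the paper; this is not the CBAD algorithm itself.

```python
from collections import defaultdict


def satisfies_kl_diversity(records, group_key, sensitive_key, k, l):
    """Check an assumed (k,l)-diversity condition on grouped records.

    Assumption (not from the paper): a partition qualifies when every
    group has at least k records and at least l distinct sensitive values.
    """
    # Collect the sensitive values that fall into each group.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[sensitive_key])

    # Every group must meet both the size bound and the diversity bound.
    return all(
        len(values) >= k and len(set(values)) >= l
        for values in groups.values()
    )


# Hypothetical toy data: two clusters of three records each.
records = [
    {"cluster": 0, "disease": "flu"},
    {"cluster": 0, "disease": "asthma"},
    {"cluster": 0, "disease": "flu"},
    {"cluster": 1, "disease": "diabetes"},
    {"cluster": 1, "disease": "flu"},
    {"cluster": 1, "disease": "cancer"},
]
```

With these records, cluster 0 holds three records but only two distinct diseases, so the check passes for k = 3, l = 2 and fails once l is raised to 3.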

Key words: Privacy preserving, Data publishing, l-Diversity, Data utility, Clustering, Similarity measures

YANG Gao-ming, LI Jing-zhao, YANG Jing and ZHU Guang-li. Achieving (k,l)-Diversity in Privacy Preserving Data Publishing[J]. Computer Science, 2013, 40(8): 140-145.

