Computer Science (计算机科学), 2017, Vol. 44, Issue (7): 42-46. doi: 10.11896/j.issn.1002-137X.2017.07.008

• 2016 National Conference on Theoretical Computer Science •

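Classification Anonymity Algorithm Based on Weight Attributes Entropy

LIAO Jun (廖军), JIANG Chao-hui (蒋朝惠), GUO Chun (郭春), PING Yuan (平源)

  1. College of Computer Science and Technology, Guizhou University, Guiyang 550000 (LIAO Jun, JIANG Chao-hui, GUO Chun); School of Information Engineering, Xuchang University, Xuchang 461000 (PING Yuan)
  • Online: 2018-11-13 Published: 2018-11-13
  • Supported by:
    the National Natural Science Foundation of China (61303232, 61540049), the Major Basic Research Project of Guizhou Province (黔科合JZ字[2014]2001-21), the Graduate Innovation Fund of Guizhou University (college-level project), the Key Scientific Research Project of Higher Education Institutions of Henan Province (16A520025), and the Outstanding Young Backbone Teacher Funding of Xuchang University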

Abstract: To protect data privacy efficiently while preserving data utility, a classification anonymity method based on weight attribute entropy (Weight-properties Entropy for Classification Anonymous, WECA) is proposed. Targeting the specific application setting of data classification mining, the method uses information entropy to compute how important each quasi-identifier attribute in the data set is for classifying the sensitive attribute. It then selects the quasi-identifier attribute with the highest classification weight attribute entropy ratio to split the classification tree. The method also constructs an information loss metric for classification anonymity, which preserves classification utility while better protecting private data. Experimental results on a standard data set show that the algorithm achieves high classification accuracy with low anonymization loss, improving data usability.

Key words: Privacy protection, Classification anonymity, Weight attributes entropy, Classification accuracy
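
The attribute-selection step described in the abstract can be illustrated with a small sketch. The abstract does not give the formula for the weight attribute entropy ratio, so the code below substitutes the standard information gain ratio (the measure used by C4.5) as a stand-in; it is not the authors' WECA definition. The function names, the attribute names (`age_group`, `zip_prefix`, `diagnosis`), and the toy records are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(records, attribute, sensitive):
    """Information gain ratio of `attribute` with respect to `sensitive`.

    Assumption: this standard measure stands in for the paper's
    "weight attribute entropy ratio", which the abstract does not define.
    """
    # Group the sensitive values by the value of the candidate attribute.
    groups = defaultdict(list)
    for row in records:
        groups[row[attribute]].append(row[sensitive])

    total = len(records)
    base = entropy([row[sensitive] for row in records])
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy([row[attribute] for row in records])
    if split_info == 0:
        # A single-valued attribute cannot split the data.
        return 0.0
    return (base - conditional) / split_info


def best_quasi_identifier(records, quasi_identifiers, sensitive):
    """Pick the quasi-identifier with the highest gain ratio."""
    return max(quasi_identifiers,
               key=lambda a: gain_ratio(records, a, sensitive))


# Hypothetical toy records: age_group determines the diagnosis,
# while zip_prefix carries no information about it.
records = [
    {"age_group": "young", "zip_prefix": "550", "diagnosis": "flu"},
    {"age_group": "young", "zip_prefix": "461", "diagnosis": "flu"},
    {"age_group": "old", "zip_prefix": "550", "diagnosis": "cancer"},
    {"age_group": "old", "zip_prefix": "461", "diagnosis": "cancer"},
]

chosen = best_quasi_identifier(records, ["age_group", "zip_prefix"], "diagnosis")
```

In this toy data set `age_group` is selected, because it fully separates the sensitive values while `zip_prefix` does not reduce their entropy at all. A full anonymization procedure would apply such a selection repeatedly while splitting the classification tree and would also track the information loss metric, neither of which is shown here.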

