Computer Science ›› 2017, Vol. 44 ›› Issue (6): 139-143, 149.doi: 10.11896/j.issn.1002-137X.2017.06.023

Previous Articles     Next Articles

Method of Constructing Differential Privacy Decision Tree Classifier with Incomplete Data Sets

SHEN Si-qian, MAO Yu-guang and JIANG Guan-ru   

  • Online:2018-11-13 Published:2018-11-13

Abstract: We mainly studied the problem of constructing differential privacy decision tree classifier with incomplete data sets.We first introduced the differential privacy ID3 decision tree algorithm and differentially private random decision tree algorithm.Then we considered the weakness of the algorithms talked above,and created a new differentially private random decision tree algorithm with exponential mechanism.Finally,an approach for decision tree classifier with incomplete data sets was proposed,which yields better prediction while maintaining good privacy without inserting values,called WP(Weight Partition).And the experimental results show that our approach is suitable for either differential privacy ID3 decision trees or differentially private random decision trees,either laplace or exponential mechanism.

Key words: Differential privacy,Incomplete data sets,ID3 decision tree algorithm,Random decision tree algorithm

[1] DWORK C,DANOS V,KASHEFI K,et al.Automata,Languages and Programming[J].Verlag Lecture Notes in Computer Science,1980,27(3):282-298.
[2] CLIFTON C,KANTARCIOGLU M,VAIDYA J.Defining Privacy for Data Mining[C]∥Proceedings of the National Science Foundation Workshop on Next Generation Data Mining.2012:126-133.
[3] SWEENEY L.k-anonymity:A Model for Protecting Privacy[J].International Journal of Uncertainty,Fuzziness and Knowledge-Based Systems,2002,0(5):557-570.
[4] MACHANAVAJJHALA A,KIFER D,GEHRKE J,et al.l-Diversity:Privacy Beyond k-Anonymity[C]∥ Proceeding of the 22nd International Conference on Data Engineering.2006:24 .
[5] XIONG P,ZHU T Q,WANG X F.A Survey on Differential Privacy and Applications[J].Chinese Journal of Computers,2014,7(1):101-122.(in Chinese) 熊平,朱天清,王晓峰.差分隐私保护及其应用[J].计算机学报,2014,7(1):101-122.
[6] HAY M,RASTOGI V,MIKLAU G,et al.Boosting the Accuracy of Differentially Private Histograms Through Consistency[J].Proceedings of the VLDB Endowment,2010,3(1/2):1021-1032.
[7] XIAO X,WANG G,GEHRKE J.Differential Privacy via Wavelet Transforms[J].IEEE Transactions on Knowledge and Data Engineering,2011,3(8):1200-1214.
[8] CORMODE G,PROCOPIUC C,SRIVASTAVA D,et al.Diffe-rentially Private Spatial Decompositions[C]∥2012 IEEE 28th International Conference on Data Engineering (ICDE).2012:20-31.
[9] LI C,HAY M,RASTOGI V,et al.Optimizing Linear Counting Queries under Differential Privacy[C]∥Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.2010:123-134.
[10] XIONG P,ZHU T,NIU W,et al.A Differentially Private Algorithm for Location Data Release[J].Knowledge and Information Systems,2016,47(3):647-669.
[11] XIAO X,BENDER G,HAY M,et al.iReduct:Differential Privacy with Reduced Relative Errors[C]∥Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data.2011:229-240.
[12] CHAUDHURI K,MONTELEONI C,SARWATE A D.Diffe-rentially Private Empirical Risk Minimization[J].The Journal of Machine Learning Research,2011,2:1069-1109.
[13] ZHANG J,XIAO X,YANG Y,et al.PrivGene:DifferentiallyPrivate Model Fitting using Genetic Algorithms[C]∥Procee-dings of the 2013 ACM SIGMOD International Conference on Mana-gement of Data.2013:665-676.
[14] BLUM A,DWORK C,MCSHERRY F,et al.Practical privacy:the SuLQ Framework[C]∥Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems.2005:128-138.
[15] MCSHERRY F D.Privacy Integrated Queries:an ExtensiblePlatform for Privacy-preserving Data Analysis[C]∥Proceedings of the 2009 ACM SIGMOD International Conference on Mana-gement of Data.2009:19-30.
[16] FRIEDMAN A,SCHUSTER A.Data mining with DifferentialPrivacy[C]∥Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2010:493-502.
[17] JAGANNATHAN G,PILLAIPAKKAMNATT K,W RIGHT R N.A Practical Differentially Private Random Decision Tree Classifier[C]∥IEEE International Conference on Data Mining Workshops,2009(ICDMW’09).2009:114-121.
[18] XIONG P,ZHU T Q,JING D W.Different Private Data Publishing Algorithm for Building Decision Tree[J].Application Research Computers,2014,1(10):3108-3112.(in Chinese) 熊平,朱天清,金大卫.一种面向决策树构建的差分隐私保护算法[J].计算机应用研究,2014,1(10):3108-3112.
[19] DWORK C.The promise of Differential Privacy:A Tutorial on Algorithmic Techniques[C]∥Foundations of Computer Science (FOCS).2011:1-2.
[20] DWORK C.Differential Privacy in New Settings[C]∥SODA.2010:174-183.
[21] DWORK C.Differential Privacy:A survey of Results[M]∥Theo-ry and Applications of Models of Computation.Springer,2008:1-19.
[22] MCSHERRY F,TALWAR K.Mechanism Design via Differen-tial Privacy[C]∥48th Annual IEEE Symposium on Foundations of Computer Science,2007(FOCS’07).2007:94-103.

No related articles found!
Full text



[1] LEI Li-hui and WANG Jing. Parallelization of LTL Model Checking Based on Possibility Measure[J]. Computer Science, 2018, 45(4): 71 -75, 88 .
[2] XIA Qing-xun and ZHUANG Yi. Remote Attestation Mechanism Based on Locality Principle[J]. Computer Science, 2018, 45(4): 148 -151, 162 .
[3] LI Bai-shen, LI Ling-zhi, SUN Yong and ZHU Yan-qin. Intranet Defense Algorithm Based on Pseudo Boosting Decision Tree[J]. Computer Science, 2018, 45(4): 157 -162 .
[4] WANG Huan, ZHANG Yun-feng and ZHANG Yan. Rapid Decision Method for Repairing Sequence Based on CFDs[J]. Computer Science, 2018, 45(3): 311 -316 .
[5] SUN Qi, JIN Yan, HE Kun and XU Ling-xuan. Hybrid Evolutionary Algorithm for Solving Mixed Capacitated General Routing Problem[J]. Computer Science, 2018, 45(4): 76 -82 .
[6] ZHANG Jia-nan and XIAO Ming-yu. Approximation Algorithm for Weighted Mixed Domination Problem[J]. Computer Science, 2018, 45(4): 83 -88 .
[7] WU Jian-hui, HUANG Zhong-xiang, LI Wu, WU Jian-hui, PENG Xin and ZHANG Sheng. Robustness Optimization of Sequence Decision in Urban Road Construction[J]. Computer Science, 2018, 45(4): 89 -93 .
[8] LIU Qin. Study on Data Quality Based on Constraint in Computer Forensics[J]. Computer Science, 2018, 45(4): 169 -172 .
[9] ZHONG Fei and YANG Bin. License Plate Detection Based on Principal Component Analysis Network[J]. Computer Science, 2018, 45(3): 268 -273 .
[10] SHI Wen-jun, WU Ji-gang and LUO Yu-chun. Fast and Efficient Scheduling Algorithms for Mobile Cloud Offloading[J]. Computer Science, 2018, 45(4): 94 -99, 116 .