Computer Science, 2019, Vol. 46, Issue (1): 206-211. doi: 10.11896/j.issn.1002-137X.2019.01.032

• Information Security •

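Privacy Preserving Algorithm Based on Dynamic Update in Medical Data Publishing

CHEN Hong-yun, WANG Jie-hua, HU Zhao-peng, JIA Lu, YU Ji-wen

  1. (College of Computer Science and Technology, Nantong University, Nantong, Jiangsu 226019, China)
  • Received: 2017-12-19 Online: 2019-01-15 Published: 2019-02-25
  • About the authors: CHEN Hong-yun (1993-), female, master's student; main research interest: information security. WANG Jie-hua (1965-), male, master's degree, professor; main research interests: information security and digital watermarking; E-mail: wang.jh@ntu.edu.cn (corresponding author). HU Zhao-peng (1994-), male, master's student; main research interests: network security and machine learning. JIA Lu (1993-), female, master's student; main research interest: data mining. YU Ji-wen (1992-), male, master's student; main research interests: data mining and machine learning.
  • Supported by:
    the National Natural Science Foundation of China (61170171), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the "Six Talent Peaks" Project of Jiangsu Province (2010-WLW-006), and the Nantong Applied Basic Research Program (GY12016015, MS12016048).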

Abstract: Privacy protection in medical data publishing is a long-standing focus of data privacy research, and keeping published medical data synchronized with updates to the source data is one of its important problems. To address this synchronization problem for anonymously published medical data, an algorithm named (α,k)-UPDATE is proposed, which supports dynamic data updates on top of an (α,k)-anonymous dataset. The algorithm computes semantic closeness to select the closest equivalence class in the (α,k)-anonymous dataset and then performs the corresponding update operation on it. The updated dataset still satisfies the (α,k)-anonymity constraint and therefore effectively protects patients' private information. Experimental results show that the algorithm runs stably and effectively in a real environment and that, while maintaining the real-time consistency of the medical data, it offers short running time and low information loss.

Key words: (α, k)-anonymity, Dynamic update, Data publishing, Privacy preserving, Semantic closeness
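As an illustration of the update step described in the abstract, the following minimal Python sketch inserts one new record into the closest equivalence class that remains (α,k)-anonymous after the insertion. It is not the authors' (α,k)-UPDATE algorithm. The abstract does not define semantic closeness, so the `widening` function is a hypothetical stand-in that measures how far a class's age range must grow to cover the new record. The `EquivalenceClass` type, the single age quasi-identifier, and all function names are likewise assumptions made for the example. The anonymity check uses the common reading of (α,k)-anonymity: every class has at least k records, and no sensitive value has a relative frequency above α within its class.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EquivalenceClass:
    """Records sharing one generalized quasi-identifier (assumed here: an age range)."""
    low: int
    high: int
    sensitive: list = field(default_factory=list)  # one sensitive value per record


def satisfies_alpha_k(ec, alpha, k):
    """True if the class has >= k records and no sensitive value exceeds frequency alpha."""
    if not ec.sensitive or len(ec.sensitive) < k:
        return False
    top_count = Counter(ec.sensitive).most_common(1)[0][1]
    return top_count / len(ec.sensitive) <= alpha


def widening(ec, age):
    """Hypothetical closeness stand-in: range growth needed to cover `age` (0 if covered)."""
    return max(ec.low - age, 0, age - ec.high)


def insert_record(classes, age, diagnosis, alpha, k):
    """Insert into the closest class that stays (alpha, k)-anonymous; return it, or None."""
    for ec in sorted(classes, key=lambda c: widening(c, age)):
        trial = EquivalenceClass(min(ec.low, age), max(ec.high, age),
                                 ec.sensitive + [diagnosis])
        if satisfies_alpha_k(trial, alpha, k):
            # Commit the update only after the constraint has been verified.
            ec.low, ec.high, ec.sensitive = trial.low, trial.high, trial.sensitive
            return ec
    return None  # no class can absorb the record without violating the constraint


# Example data (invented for illustration): two classes, alpha = 0.5, k = 2.
classes = [
    EquivalenceClass(20, 30, ["flu", "asthma"]),
    EquivalenceClass(40, 50, ["flu", "diabetes", "asthma"]),
]
chosen = insert_record(classes, age=33, diagnosis="flu", alpha=0.5, k=2)
```

In this example the nearest class (ages 20 to 30) is skipped, because adding a second "flu" record would push that value's frequency to 2/3, above α = 0.5. The record goes to the second class instead, whose range widens to 33 to 50. A fuller treatment would also need deletions, modifications, and the case where no existing class can accept the record, none of which the abstract specifies.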

CLC Number:

  • TP309