Computer Science ›› 2019, Vol. 46 ›› Issue (10): 148-153. doi: 10.11896/jsjkx.190100050

• Information Security •

Big Data Oriented Privacy Disclosure Detection Method for Information Release

KE Chang-bo1,2, HUANG Zhi-qiu2, WU Jia-yu1

  1. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
  2. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
  • Received: 2019-01-07 Revised: 2019-04-09 Online: 2019-10-15 Published: 2019-10-21
  • About the authors: KE Chang-bo (born 1984), male, Ph.D., lecturer, CCF member; main research interests: privacy protection in cloud computing and ontology-based software engineering; E-mail: brobo.ke@njupt.edu.cn. HUANG Zhi-qiu (born 1965), male, Ph.D., professor, doctoral supervisor, CCF senior member; main research interests: software engineering, formal methods, and data warehouses. WU Jia-yu (born 1996), male, master's student; main research interest: privacy protection in cloud computing.
  • Funding:
    This work was supported by the China Postdoctoral Science Foundation (2016M591842), the Jiangsu Province Postdoctoral Fund (1601198C), the National Natural Science Foundation of China, Youth Program (Grant 61602262), and the Natural Science Foundation of Jiangsu Province, Youth Program (Grant BK20150865).

Abstract: To prevent cloud services from illegally acquiring users' sensitive private information, this paper proposes a privacy information release detection and protection method for big data. First, the user's privacy data are classified, and the similarity and the disclosure cost of the privacy data are measured. Second, according to the similarity and the disclosure cost, the method detects whether the privacy data requested by a cloud service contain disclosure chains and key privacy data. Third, continuous privacy datasets (datasets that contain disclosure chains and key privacy data) are discretized, while discrete privacy datasets (datasets that do not contain disclosure chains and key privacy data) are prevented from being recomposed into continuous ones. Finally, experiments compare the discovery of privacy data chains on the discretized privacy datasets with that on the non-discretized datasets. In terms of precision and recall, the precision of the Exact filter on the discretized datasets is 57% lower than on the non-discretized datasets, and its recall is 17% lower. The proposed method therefore achieves the goal of protecting users' sensitive private information.

Key words: Big data, Key privacy data, Privacy disclosure chain, Privacy enhancement, Privacy release detection
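
The abstract reports its results as precision and recall of privacy data chain discovery. As a minimal sketch of how such figures are conventionally computed, the Python snippet below uses the standard set-based definitions. The abstract does not describe the paper's evaluation protocol, so the function name `precision_recall`, the representation of a chain as a tuple of attribute names, and the sample chains are all assumptions made for illustration only; they are not the authors' data or implementation.

```python
def precision_recall(discovered, ground_truth):
    """Return (precision, recall) for a collection of discovered chains.

    precision = correctly discovered chains / all discovered chains
    recall    = correctly discovered chains / all true chains
    """
    discovered = set(discovered)
    ground_truth = set(ground_truth)
    true_positives = len(discovered & ground_truth)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(discovered) if discovered else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical chains (illustrative only, not taken from the paper):
# each chain is a tuple of attribute names that together expose a user.
true_chains = {("name", "phone"), ("name", "address"), ("zip", "birthdate")}
found_chains = {("name", "phone"), ("zip", "gender")}

p, r = precision_recall(found_chains, true_chains)
print(f"precision={p:.2f}, recall={r:.2f}")
```

Under this reading, lower precision and recall on the discretized datasets mean that a filter recovers fewer genuine chains from them, which is the protective effect the abstract claims.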

CLC number: TP399