Computer Science ›› 2024, Vol. 51 ›› Issue (1): 4-12. doi: 10.11896/jsjkx.yg20240102

• Special Topic for the 50th Anniversary of the Journal •

Cross-domain Data Management

DU Xiaoyong, LI Tong, LU Wei, FAN Ju, ZHANG Feng, CHAI Yunpeng

  1. Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China
    School of Information, Renmin University of China, Beijing 100872, China
  • Received: 2023-11-20 Revised: 2023-12-22 Online: 2024-01-15 Published: 2024-01-12
  • Corresponding author: DU Xiaoyong (duyong@ruc.edu.cn)
  • About author: DU Xiaoyong, born in 1963, Ph.D., professor, doctoral supervisor, is a member and fellow of CCF (No. 05422F). His main research interests include database systems, big data management and analysis, intelligent information retrieval, and knowledge engineering.

Abstract: As data becomes a new factor of production and Digital China advances as a top-level national strategy, cross-domain data sharing and circulation have become crucial to maximizing the value of data. Through a series of measures, such as laying out the national integrated big data center system and launching the “East-West Computing” project, the state has provided infrastructure for the cross-domain application of data. However, traditional data management is confined to a single domain and cannot meet the data management needs of cross-domain scenarios. Cross-domain data management faces cross-spatial-domain challenges at the communication level, heterogeneous model fusion problems at the data modeling level, and cross-trust-domain challenges at the data access level. From three perspectives (cross-spatial domain, cross-administrative domain, and cross-trust domain), this paper explores the connotation, research challenges, and key technologies of cross-domain data management, and discusses its future development trends.

Key words: Data management, Cross-spatial domain, Cross-administrative domain, Cross-trust domain

CLC Number: 

  • TP315