Computer Science, 2024, Vol. 51, Issue (1): 4-12. doi: 10.11896/jsjkx.yg20240102

• Special Issue on the 50th Anniversary of Computer Science •

Cross-domain Data Management

DU Xiaoyong, LI Tong, LU Wei, FAN Ju, ZHANG Feng, CHAI Yunpeng   

  1. Key Laboratory of Data Engineering and Knowledge Engineering (Renmin University of China), Beijing 100872, China
    School of Information, Renmin University of China, Beijing 100872, China
  • Received: 2023-11-20  Revised: 2023-12-22  Online: 2024-01-15  Published: 2024-01-12
  • About author: DU Xiaoyong, born in 1963, Ph.D., professor, doctoral supervisor, is a member and fellow of CCF (No. 05422F). His main research interests include database systems, big data management and analysis, intelligent information retrieval, and knowledge engineering.

Abstract: As data becomes a new factor of production and Digital China is promoted as a top-level national strategy, cross-domain data sharing and circulation play a crucial role in maximizing the value of data factors. China has taken a series of measures, such as completing the overall layout design of the national integrated data center system and launching the “East-West Computing” project, which provide infrastructure for the cross-domain application of data factors. Cross-domain data management faces challenges in communication, data modeling, and data access. This paper explores the connotation, research challenges, and key technologies of cross-domain data management from three perspectives: cross-spatial domain, cross-administrative domain, and cross-trust domain, and discusses its future development trends.

Key words: Data management, Cross-spatial domain, Cross-administrative domain, Cross-trust domain

CLC Number: 

  • TP315