Computer Science ›› 2016, Vol. 43 ›› Issue (6): 17-23. doi: 10.11896/j.issn.1002-137X.2016.06.003


Managing Marine Data as Big Data: Uprising Challenges and Tentative Solutions

HUANG Dong-mei, ZHAO Dan-feng, WEI Li-fei, DU Yan-ling and WANG Zhen-hua   

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: Big data continue to attract extensive interest in both academia and industry. With the rapid development of ocean observation technologies and data acquisition methods, the volume of marine data is growing continuously and exponentially. Until recently, most solutions have focused on generic big data, and marine data has not been studied in depth, although its unique characteristics bring new challenges for its management. This article first outlines the characteristics of marine data and the fundamental architecture of marine data management. It then analyzes the problems of data storage, data quality, and data security, together with corresponding tentative solutions, which may serve as a reference for future studies in ocean science and engineering technology.

Key words: Marine, Big data, Data storage, Data quality, Data security

