Computer Science ›› 2020, Vol. 47 ›› Issue (9): 110-116. doi: 10.11896/jsjkx.191000156

• Database & Big Data & Data Science •

Big Data Valuation Algorithm

ZHAO Hui-qun1, WU Kai-feng2   

  1 College of Computer Science and Technology, North China University of Technology, Beijing 100144, China
  2 Beijing Key Laboratory of Large-scale Stream Data Integration and Analysis Technology, North China University of Technology, Beijing 100144, China
  • Received: 2019-10-24 Published: 2020-09-10
  • About author: ZHAO Hui-qun, born in 1960, Ph.D, professor. His main research interests include software architecture, big data generation, internet of things, cloud computing, and sports computing.
    WU Kai-feng, born in 1994, master. His main research interests include big data pricing, big data asset management, and big data services.
  • Supported by:
    National Natural Science Foundation of China (61672041).

Abstract: With the rapid development of information technology, data generation has grown exponentially. Owing to the rapid emergence of big data and its great value, "big data" has become one of the most frequently used terms. It is not only an academic term but has gradually become the name of a commodity. For both academic research and data trading, how to evaluate the availability of big data sets is a new problem. This paper proposes a big data usability evaluation model to provide a reference for the academic and data circulation fields. Based on the 4V (Volume, Variety, Velocity, Value) characteristics of big data, the statistical distribution of each 4V characteristic is segmented, which yields a probability model of big data based on the piecewise distribution and a weighted model for evaluating the availability of big data sets. An algorithm for big data block sampling and an algorithm for estimating the weighting coefficient of each characteristic in the evaluation model are also proposed. The application of the proposed models and algorithms is demonstrated using the data availability evaluation requirements of video big data analysis. The big data usability evaluation model can be used to evaluate data for data science experiments and to price data sets in big data trading markets. For practical evaluation work, the paper describes how to standardize (commercialize) data sets and gives the specific operational steps for determining evaluation benchmarks in the video field. The application case supports the proposed model and further tests its feasibility.

Key words: Big data availability evaluation, Probability model, Big data blocking algorithm, Video big data

CLC Number: 

  • TP391