Computer Science ›› 2024, Vol. 51 ›› Issue (4): 19-27. doi: 10.11896/jsjkx.230400138

• Compact Data Structure

Data Quality Measurement Framework Research and Field Measurement Framework Construction

SONG Jinyu, CHEN Lianyong, CHEN Gang   

  1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  • Received: 2023-04-20 Revised: 2023-07-25 Online: 2024-04-15 Published: 2024-04-10
  • Supported by:
    National Natural Science Foundation of China (62207031).

Abstract: To unlock the potential of data quality, a data quality measurement framework that accounts for both the information environment and technical implementation is constructed, with the aim of improving the effectiveness of data mining and command decision-making. First, existing general-purpose and industry-specific data quality measurement frameworks are studied at the macro and micro levels; data quality dimension clusters are obtained by clustering the data quality dimensions, and two types of characteristics of data quality dimensions are extracted. Next, construction guidelines for field-specific data quality measurement frameworks are proposed. Finally, based on the requirements of data quality measurement in the management field, a data quality measurement framework for the management field (DQMFM) is constructed by applying these guidelines, and its data quality dimensions, measurement metrics, and measurement methods are introduced.

Key words: Data quality, Data quality dimension, Data quality measurement, Data quality measurement framework

CLC Number: 

  • TP274