Computer Science, 2024, Vol. 51, Issue (4): 19-27. doi: 10.11896/jsjkx.230400138

• Compact Data Structures •

Data Quality Measurement Framework Research and Field Measurement Framework Construction

SONG Jinyu, CHEN Lianyong, CHEN Gang

  1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  • Received: 2023-04-20 Revised: 2023-07-25 Online: 2024-04-15 Published: 2024-04-10
  • Corresponding author: CHEN Lianyong (nature68c@163.com)
  • About the author: (sjyhello@163.com)
  • Supported by:
    National Natural Science Foundation of China (62207031).

Abstract: To unlock the potential of data quality, this paper constructs a data quality measurement framework that accounts for both the information environment and the technical implementation, with the aim of improving the effectiveness of data mining and command decision-making. First, existing general-purpose and industry-specific data quality measurement frameworks are reviewed at the macro and micro levels. The data quality dimensions are then "clustered" to obtain data quality dimension clusters, two types of characteristics of data quality dimensions are extracted, and construction guidelines for domain-specific data quality measurement frameworks are proposed. Finally, based on the data quality measurement requirements of work in the management field, a data quality measurement framework for the management field (DQMFM) is constructed in accordance with these guidelines, and its data quality dimensions, measurement metrics, and measurement methods are specified.

Key words: Data quality, Data quality dimension, Data quality measurement, Data quality measurement framework

CLC Number: 

  • TP274