Computer Science, 2024, Vol. 51, Issue (8): 124-132. doi: 10.11896/jsjkx.230900003
• Database & Big Data & Data Science •
QIAN Zekai, DING Xiaoou, SUN Zhe, WANG Hongzhi, ZHANG Yan