Computer Science ›› 2010, Vol. 37 ›› Issue (4): 224-.

• Artificial Intelligence •

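A Data Cleaning Method Based on a Bus Model

YANG Meng-ning, ZHAO Peng, ZHANG Xiao-hong, LI Peng

  1. (School of Software, Chongqing University, Chongqing 400044)
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60975015), the Science and Technology Research Program of the Chongqing Science and Technology Commission (2009AC2057), the Natural Science Foundation of the Chongqing Science and Technology Commission (2009BB2364), and the Chongqing University Innovation Capability Cultivation Fund for Young Backbone Teachers.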

Abstract: Data cleaning is an important step in ensuring data quality. Existing cleaning methods are often too dependent on a specific application and are therefore hard to reuse. To improve the reusability and extensibility of data cleaning methods, a reusable data cleaning framework based on a bus model is proposed. Cleaning tools with relatively independent functions are attached to the cleaning bus as components through adapters, and cleaning is carried out by the bus, which controls the registered components. Finally, a concrete application is used to describe the workflow of the method. Practical results show that the method performs well and has practical value.

Key words: Data cleaning, Bus model, Component, Reusable
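
The abstract gives no implementation details, so the following is only a minimal sketch of how a bus-based cleaning framework of this kind might be organized. All names here (`CleaningBus`, `FunctionAdapter`, `DeduplicateAdapter`, `strip_whitespace`) are hypothetical and are not taken from the paper. The sketch assumes that records are plain dictionaries and that every adapter exposes a single `clean` method to the bus.

```python
from typing import Callable, Dict, List, Protocol

Record = Dict[str, str]


class CleaningComponent(Protocol):
    """Interface the (hypothetical) bus expects from every attached component."""

    def clean(self, records: List[Record]) -> List[Record]: ...


class FunctionAdapter:
    """Adapter that wraps a per-record function so it fits the bus interface."""

    def __init__(self, func: Callable[[Record], Record]) -> None:
        self._func = func

    def clean(self, records: List[Record]) -> List[Record]:
        # Pass a copy of each record so the caller's data is never mutated.
        return [self._func(dict(record)) for record in records]


class DeduplicateAdapter:
    """Adapter for a whole-dataset tool: keeps the first record per key value."""

    def __init__(self, key: str) -> None:
        self._key = key

    def clean(self, records: List[Record]) -> List[Record]:
        seen = set()
        unique = []
        for record in records:
            value = record.get(self._key)
            if value not in seen:
                seen.add(value)
                unique.append(record)
        return unique


class CleaningBus:
    """Registry that controls which components run, and in what order."""

    def __init__(self) -> None:
        self._components: Dict[str, CleaningComponent] = {}

    def register(self, name: str, component: CleaningComponent) -> None:
        if name in self._components:
            raise ValueError(f"component already registered: {name}")
        self._components[name] = component

    def run(self, records: List[Record], pipeline: List[str]) -> List[Record]:
        # Apply the named components one after another.
        for name in pipeline:
            records = self._components[name].clean(records)
        return records


def strip_whitespace(record: Record) -> Record:
    """Example cleaning tool: trim surrounding whitespace from every field."""
    return {field: value.strip() for field, value in record.items()}


bus = CleaningBus()
bus.register("trim", FunctionAdapter(strip_whitespace))
bus.register("dedupe", DeduplicateAdapter("email"))

rows = [
    {"name": " Ann ", "email": "a@x.org "},
    {"name": "Ann", "email": "a@x.org"},
]
cleaned = bus.run(rows, ["trim", "dedupe"])
```

In this sketch a new cleaning tool is reused across applications by writing one adapter and registering it under a name. The pipeline itself is only a list of names, so it can be changed without touching the tools.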
