Computer Science, 2014, Vol. 41, Issue (8): 85-89. doi: 10.11896/j.issn.1002-137X.2014.08.018


Study on Schema Matching with Uncertain Column Names and Data Values

HUANG Dong-mei, FENG Kai, ZHAO Dan-feng and GUO Ying-xin

Online: 2018-11-14    Published: 2018-11-14

Abstract: Schema matching is an important research topic in data integration, and uncertainty in column names and data values is common in practice. Current methods for schema matching under this uncertainty are typically based on mutual information and Euclidean distance. However, these methods do not resolve the mismatches that arise when attributes are identical or highly similar. To address this problem, this paper proposes a multiple iterative screening method, which first fixes some of the correct attribute pairs between two relational schemas in a single pass and then selects the best attribute pair among them. The paper then presents a method based on conditional mutual information, which uses this best attribute pair to compute the conditional mutual information of the unmatched attributes and then the Euclidean distance between attributes. The final matching result is obtained from these distances, which resolves the mismatch problem. Experimental results indicate that the proposed algorithm is correct and effective.

Key words: Uncertainty, Schema matching, Conditional mutual information
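
The quantities named in the abstract (mutual information, conditional mutual information, and Euclidean distance) can be sketched for discrete columns as follows. This is a minimal illustration, not the authors' algorithm: the function names are hypothetical, the estimates are plain empirical frequencies, and the abstract does not specify how the paper combines these values into a matching score.

```python
from collections import Counter
from math import log2, sqrt


def entropy(*cols):
    """Empirical joint Shannon entropy (in bits) of equal-length discrete columns."""
    rows = list(zip(*cols))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


if __name__ == "__main__":
    # Toy columns (assumed data): x and y are independent, z = x XOR y.
    x = [0, 0, 1, 1]
    y = [0, 1, 0, 1]
    z = [a ^ b for a, b in zip(x, y)]

    print(mutual_information(x, y))                 # 0.0: x and y share no information
    print(conditional_mutual_information(x, y, z))  # 1.0: given z, x determines y
```

The toy columns show why conditioning can help: two attributes that look unrelated under plain mutual information become fully dependent once a third, already matched attribute is taken into account.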

