Computer Science ›› 2016, Vol. 43 ›› Issue (1): 61-63, 80. doi: 10.11896/j.issn.1002-137X.2016.01.014

• CRSSC-CWI-CGrC2015 •

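Measuring Consistency of Two Datasets Using Shannon Entropy

CHE Xiao-ya, MI Ju-sheng

  1. College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang 050024
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61170107, 3, 61300121, 7, 61502144), the Training Program for Leading Talents of Innovation Teams in Colleges and Universities of Hebei Province (LJRC022), and the Natural Science Foundation of Hebei Province (A2014205157, A2013208175).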

Abstract: The basic idea of rough set theory is to approximate a given concept with a pair of approximation operators derived from the indiscernibility relation inherent in the known data. This paper applies that idea to the classification consistency of one dataset with respect to another. It proposes a new approach for measuring the consistency of two datasets and defines classification consistency in terms of Shannon entropy. To take into account the neighborhood relations among different data, expert knowledge is introduced through a fuzzy inference system so that crisp classifications become fuzzy classifications, which yields a generalized consistency measure. The method produces robust and interpretable results and avoids the "black box" phenomenon encountered in many modeling techniques.

Key words: Consistency degree, Indiscernibility relation, Fuzzy partition, Shannon entropy
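
To make the idea of an entropy-based consistency measure concrete, here is a minimal Python sketch for the crisp (non-fuzzy) case. It is not the paper's definition, which this page does not give. It assumes that each dataset induces a partition of the same objects, encoded as a list of class labels, and it scores consistency with normalized conditional entropy, 1 - H(target | given) / H(target). The function names `shannon_entropy`, `conditional_entropy`, and `consistency` are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the partition induced by `labels`."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given): uncertainty left in `target` once `given` is known."""
    n = len(target)
    joint = Counter(zip(given, target))   # counts of (given class, target class)
    marginal = Counter(given)             # counts of each given class
    return -sum((c / n) * log2(c / marginal[g]) for (g, _), c in joint.items())


def consistency(target, given):
    """Illustrative score (an assumption, not the paper's measure).

    Returns 1.0 when the classes of `given` fully determine those of
    `target`, and 0.0 when knowing `given` removes no uncertainty.
    """
    h_target = shannon_entropy(target)
    if h_target == 0:
        return 1.0  # a single-class target is trivially determined
    return 1.0 - conditional_entropy(target, given) / h_target


# Two partitions of the same four objects.
classes_a = [0, 0, 1, 1]
print(consistency(["x", "x", "y", "y"], classes_a))  # identical grouping
print(consistency(["x", "y", "x", "y"], classes_a))  # unrelated grouping
```

The score is asymmetric by design: it measures how well the second argument's classes predict the first's, which matches the abstract's framing of the consistency of one dataset with respect to another. Extending the sketch to fuzzy partitions would require replacing the counts with membership degrees, which is left out here.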

