Computer Science, 2017, Vol. 44, Issue (9): 74-77. doi: 10.11896/j.issn.1002-137X.2017.09.015

• CRSSC-CWI-CGrC 2016 •

Method of Three-way Decision Spam Filtering Based on Head Information of E-mail

YUAN Guo-xin and YU Hong

  1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • Online: 2018-11-13  Published: 2018-11-13
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61379114, 61533020).

Abstract: This paper proposes a three-way decision method for spam filtering based on e-mail header information. The method introduces a new measure of attribute significance and uses it to rank the header attributes. Bayesian probabilities are then computed over the attributes in descending order of significance, and a three-way decision is made for each e-mail. When the information available is insufficient to reach a decision, further attributes are added in order of significance to support the next decision, until the e-mail is finally classified. Comparative experiments show that the method is reasonable and effective.

Key words: E-mail header information, Attribute significance, Three-way decisions, Spam filtering
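
The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the attribute significance measure, so the ranked attribute list is taken as a given input. The naive Bayes model with add-one smoothing, the thresholds `alpha` and `beta`, the header field names `sender` and `subject`, and the fallback rule used once every attribute has been consumed are all assumptions made for illustration.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")


def train(samples, labels):
    """Count attribute values per class for a naive Bayes model.

    Assumes both "spam" and "ham" occur at least once in `labels`.
    """
    class_counts = Counter(labels)
    value_counts = Counter()   # (attribute, value, label) -> count
    values_seen = {}           # attribute -> set of observed values
    for email, label in zip(samples, labels):
        for attr, value in email.items():
            value_counts[(attr, value, label)] += 1
            values_seen.setdefault(attr, set()).add(value)
    return class_counts, value_counts, values_seen


def spam_posterior(model, email, attrs):
    """P(spam | values of `attrs`) under naive Bayes with add-one smoothing."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    log_score = {}
    for label in LABELS:
        score = math.log(class_counts[label] / total)
        for attr in attrs:
            # One extra bucket so unseen values get non-zero probability.
            k = len(values_seen.get(attr, set())) + 1
            count = value_counts[(attr, email.get(attr), label)]
            score += math.log((count + 1) / (class_counts[label] + k))
        log_score[label] = score
    # Normalise in log space for numerical stability.
    top = max(log_score.values())
    weight = {label: math.exp(s - top) for label, s in log_score.items()}
    return weight["spam"] / (weight["spam"] + weight["ham"])


def three_way_filter(model, email, ranked_attrs, alpha=0.9, beta=0.1):
    """Sequential three-way decision over attributes ranked by significance.

    Accept as spam when the posterior is >= alpha, reject (ham) when it is
    <= beta, and otherwise defer and add the next attribute. The thresholds
    and the final fallback at 0.5 are assumptions, not taken from the paper.
    """
    used = []
    p = spam_posterior(model, email, used)  # prior only
    for attr in ranked_attrs:
        used.append(attr)
        p = spam_posterior(model, email, used)
        if p >= alpha:
            return "spam", used
        if p <= beta:
            return "ham", used
    # Still in the boundary region after all attributes: force a decision.
    return ("spam" if p >= 0.5 else "ham"), used


# Toy data with hypothetical header attributes.
toy_samples = (
    [{"sender": "bulk", "subject": "promo"}] * 3
    + [{"sender": "known", "subject": "plain"}] * 3
)
toy_labels = ["spam"] * 3 + ["ham"] * 3
model = train(toy_samples, toy_labels)

decision, used = three_way_filter(
    model, {"sender": "bulk", "subject": "promo"}, ["sender", "subject"]
)
print(decision, used)
```

On this toy data the first attribute alone leaves the posterior between the two thresholds, so the decision is deferred and the second attribute is added before the e-mail is classified. Looser thresholds would allow a decision after the first attribute.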

