Computer Science (计算机科学) ›› 2009, Vol. 36 ›› Issue (8): 217-219.

• Artificial Intelligence •

Spam Filtering Based on Covering Algorithm

DUAN Zhen (段震), WANG Qian-qian (王倩倩), ZHANG Yan-ping (张燕平), ZHANG Ling (张铃)

  1. (Key Laboratory of Intelligent Computing and Signal Processing, Anhui University, Hefei 230039)
  • Online: 2018-11-16 Published: 2018-11-16
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (60675031) and the 973 Program (2004CB318108, 2007BC311003).

Abstract: Classification accuracy and classification risk are key factors in evaluating an e-mail system, and spam filtering is a special application of text categorization. This paper introduces the covering algorithm (CA) from neural networks into spam filtering, runs e-mail classification experiments with several feature reduction methods, and compares the results with SVM. The experiments indicate that a classifier combining the covering algorithm with appropriate feature selection and feature reduction methods performs well. In addition, because practical spam filtering must minimize risk, the paper analyzes from a risk perspective how the covering algorithm classifies test samples. Based on that analysis, it proposes an improvement to the way the cross covering algorithm handles rejected samples: changing the area of influence of the covers that belong to non-spam (normal) mail reduces the risk of spam filtering.

Key words: Spam filtering, Covering algorithm, Feature selection, Feature reduction
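
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the idea of widening the influence of non-spam covers when rejected samples are handled. It assumes that covers are spherical caps on the unit hypersphere (a centre plus an inner-product threshold), that a rejected sample is one lying outside every cover, and that such a sample is assigned to the cover whose boundary is nearest. The names `Cover`, `classify`, and `ham_slack`, and the toy covers in the usage lines, are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Cover:
    center: Sequence[float]  # unit vector: centre of the spherical cap
    theta: float             # inner-product threshold; larger means a smaller cover
    label: str               # "spam" or "ham" (non-spam)


def normalize(x: Sequence[float]) -> List[float]:
    """Project a feature vector onto the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [v / norm for v in x]


def classify(x: Sequence[float], covers: List[Cover], ham_slack: float = 0.0) -> str:
    """Label x using the covers; ham_slack only affects rejected samples.

    Assumption (not from the paper): a rejected sample goes to the cover
    with the nearest boundary, and ham boundaries are treated as if they
    were pushed outward by ham_slack, so borderline mail is less likely
    to be filtered as spam.
    """
    x = normalize(x)
    # Signed margin per cover: >= 0 means x lies inside that cover.
    scored = [(sum(a * b for a, b in zip(c.center, x)) - c.theta, c) for c in covers]

    covered = [(m, c) for m, c in scored if m >= 0]
    if covered:
        # Covered sample: take the cover it sits deepest inside.
        return max(covered, key=lambda mc: mc[0])[1].label

    # Rejected sample: nearest boundary wins, with ham covers widened.
    def adjusted(mc):
        margin, cover = mc
        return margin + (ham_slack if cover.label == "ham" else 0.0)

    return max(scored, key=adjusted)[1].label


# Toy covers for illustration only.
covers = [Cover([1.0, 0.0], 0.9, "spam"), Cover([0.0, 1.0], 0.9, "ham")]
print(classify([0.8, 0.6], covers))                  # rejected sample -> spam
print(classify([0.8, 0.6], covers, ham_slack=0.25))  # same sample -> ham
print(classify([1.0, 0.1], covers, ham_slack=0.25))  # inside the spam cover -> spam
```

In this sketch a larger `ham_slack` trades some missed spam for fewer legitimate messages being filtered, which is the direction of risk reduction the abstract describes; how the paper actually adjusts the covers may differ.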
