Computer Science (计算机科学) ›› 2006, Vol. 33 ›› Issue (5): 107-109.


A Chinese Spam Filtering Method Based on Suffix Array Clustering (SAC)

  

  • Online: 2018-11-17 Published: 2018-11-17

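Abstract: Bayesian algorithms are widely used in spam filtering, but they perform poorly on Chinese spam. Drawing on the idea of clustering, this paper proposes a feature term extraction method for Chinese email based on suffix array clustering (SAC), and compares experimental results for Bayesian filtering of Chinese spam under different feature term extraction methods. The experiments show that the proposed method significantly improves filtering performance on Chinese spam.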

Key words: Naive Bayes, Spam filtering, Suffix array

Abstract: The naive Bayes algorithm has been widely applied to spam filtering. However, its performance on Chinese email filtering is unsatisfactory. Using clustering, this paper proposes a suffix array clustering based token extraction method for Chinese email, named

Key words: Naive Bayes, Spam filtering, Suffix array clustering
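
The abstract does not spell out the SAC procedure, so the following is only a minimal sketch of the general idea it points to: sorting the suffixes of an unsegmented Chinese text so that suffixes sharing a prefix become adjacent, then treating each group of adjacent suffixes as a cluster whose shared prefix is a candidate feature term. The function names, the length bounds, and the frequency threshold are all hypothetical choices made for illustration and are not taken from the paper.

```python
def build_suffix_array(text: str) -> list[int]:
    # Naive construction: sort suffix start positions by the suffix itself.
    # Adequate for short illustrative inputs, not for large corpora.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text: str, sa: list[int]) -> list[int]:
    # lcp[k] is the length of the common prefix of the suffixes at
    # sa[k - 1] and sa[k]; lcp[0] is 0 by convention.
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b, n = sa[k - 1], sa[k], 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[k] = n
    return lcp


def repeated_substrings(text: str, min_len: int = 2, max_len: int = 4,
                        min_count: int = 2) -> dict[str, int]:
    # Hypothetical feature extractor: for each length, group adjacent
    # suffixes that share a prefix of that length and keep the prefix
    # if the group has at least min_count members.
    sa = build_suffix_array(text)
    lcp = lcp_array(text, sa)
    counts: dict[str, int] = {}
    for length in range(min_len, max_len + 1):
        run_start = 0
        for k in range(1, len(sa) + 1):
            if k < len(sa) and lcp[k] >= length:
                continue  # suffix k still belongs to the current cluster
            run_size = k - run_start
            if run_size >= min_count:
                start = sa[run_start]
                counts[text[start:start + length]] = run_size
            run_start = k
    return counts


# Toy input (assumed, not from the paper's data set).
features = repeated_substrings("免费发票免费咨询免费发票")
# "免费" occurs 3 times and "免费发票" occurs twice in this string.
```

Terms extracted this way could serve as the tokens whose per-class frequencies a naive Bayes classifier estimates, without requiring a separate Chinese word segmenter. Whether the paper filters, weights, or merges clusters differently cannot be determined from the abstract alone.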
