Computer Science ›› 2011, Vol. 38 ›› Issue (12): 182-186.

Extracting Name Aliases of Mailbox Users from Email Bodies

  

  • Online: 2018-12-01 Published: 2018-12-01

Abstract: Mining user identity information from emails is an important research topic in data mining. Most approaches extract users' names only from email headers, but names appearing in email bodies are usually better suited to representing the sender's or recipient's identity. This paper focuses on extracting users' name aliases from the bodies of plain-text emails. First, to reliably locate the salutation and signature blocks in email bodies, an algorithm for locating these blocks is proposed that combines statistical methods with rule-based constraints. Then, to extract all valid aliases from the salutation and signature lines, a novel approach is proposed based on a name boundary word template built from the characteristics of the words neighboring an alias; the template can verify and amend aliases identified by named entity recognition or part-of-speech tagging tools. Results on the Enron corpus indicate that the proposed approaches can efficiently and automatically extract users' aliases from email bodies.

Key words: Entity resolution, Email body, Alias extraction, Salutation and signature blocks locating, Name boundary word template
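
The two-stage idea in the abstract (locate the salutation and signature lines, then use neighboring boundary words to pick out the alias) can be illustrated with a rough sketch. This is not the paper's algorithm: the abstract gives neither the statistical model nor the actual templates, so the word lists `SALUTATION_WORDS` and `CLOSING_WORDS`, the `NAME` pattern, the four-line signature window, and the function `extract_aliases` are all assumptions made up for illustration. A purely rule-based approach of this kind might look like:

```python
import re

# Hypothetical boundary words (assumptions, not the paper's templates).
SALUTATION_WORDS = ("hi", "hello", "dear", "hey")
CLOSING_WORDS = ("best regards", "kind regards", "thanks", "thank you",
                 "regards", "best", "cheers", "sincerely")

# A candidate alias: one or two capitalized words. This is a crude stand-in
# for what an NER or POS tagger would propose.
NAME = r"[A-Z][a-z]+(?:\s[A-Z][a-z]+)?"

# Left boundary: a greeting word. Right boundary: optional punctuation, end of line.
SALUTATION_RE = re.compile(
    r"(?i:%s)\s+(%s)\s*[,:!-]?$" % ("|".join(SALUTATION_WORDS), NAME))
# A closing word, optionally followed by a name on the same line.
CLOSING_RE = re.compile(
    r"(?i:%s)[,.!]?\s*(%s)?$" % ("|".join(CLOSING_WORDS), NAME))


def extract_aliases(body):
    """Return (recipient_aliases, sender_aliases) guessed from a plain-text body."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    recipients, senders = [], []
    if not lines:
        return recipients, senders

    # Salutation block: assume it is the first non-empty line.
    m = SALUTATION_RE.match(lines[0])
    if m:
        recipients.append(m.group(1))

    # Signature block: assume it lies within the last four non-empty lines.
    tail = lines[-4:]
    for i, ln in enumerate(tail):
        m = CLOSING_RE.match(ln)
        if not m:
            continue
        if m.group(1):
            # Name on the same line as the closing, e.g. "Regards, Tom".
            senders.append(m.group(1))
        elif i + 1 < len(tail) and re.fullmatch(NAME, tail[i + 1]):
            # Name on the line right after the closing, e.g. "Thanks," / "Bob".
            senders.append(tail[i + 1])
        break
    return recipients, senders


print(extract_aliases("Dear John,\n\nPlease send the report.\n\nThanks,\nBob\n"))
# prints (['John'], ['Bob'])
```

Fixed word lists like these would miss many real salutations and signatures, which is presumably why the paper combines statistics with rules and uses the boundary template to correct tagger output rather than to replace it.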
