Computer Science, 2015, Vol. 42, Issue (12): 23-25.

• The 13th National Conference on Software and Applications •

Mailing List Based QA Information Extraction Approach

LUO Yu-xiang, ZOU Yan-zhen, JIN Yong and XIE Bing

  1. Institute of Software, School of Electronics Engineering and Computer Science, Peking University, Beijing 100871; Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    National High Technology Research and Development Program of China (863) (2013AA01A605), National Basic Research Program of China (973) (2011CB302604), and National Natural Science Foundation of China (61103024)

Abstract: Open source projects often provide mailing lists to help users better understand and use the software. However, because the lists contain a huge number of emails with complex organization, unclear questions, and answers that are hard to locate, users spend a great deal of time and effort finding a specific piece of software question-and-answer information, often reading through many email conversations before reaching the right answer. To address this, we propose and implement a question and answer information extraction approach based on the mailing lists of open source software. Through simple classification and annotation of emails, the approach automatically extracts the question sentence and selects the corresponding answer email, which helps users search mailing lists and learn open source software more efficiently. Experiments verify the effectiveness and efficiency of the approach.

Key words: Software reuse, Data mining, Mailing list, Software question & answer

