计算机科学 (Computer Science), 2015, Vol. 42, Issue 9: 159-164. doi: 10.11896/j.issn.1002-137X.2015.09.031
LI Xiao-chen (李晓晨), JIANG He (江贺), and REN Zhi-lei (任志磊)
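Abstract: In mining software repositories, software engineering tasks are usually converted into data mining problems. The choice of domain features strongly affects how well these tasks are solved. However, how to extract valuable features from software repository data for a specific task has not yet been studied systematically in this field. Data-driven feature extraction is a new feature extraction method. For a given software engineering task, the method selects part of the task's dataset (e.g., source code or bug reports), recruits several volunteers to complete the task manually, and asks them to state the factors they considered while doing so. Analyzing these factors yields the required domain features. Experiments on the bug report summarization task show that the new method discovers effective domain features and achieves better prediction results than existing methods.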
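The workflow described in the abstract (sample data, collect the factors volunteers report, analyze those factors into features) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the volunteer answers, and the "named by at least two volunteers" threshold are all hypothetical and are not taken from the paper, which does not specify in the abstract how the factors are analyzed.

```python
import random
from collections import Counter


def sample_items(dataset, k, seed=0):
    """Pick up to k items from the task dataset for volunteers to handle by hand."""
    rng = random.Random(seed)
    items = list(dataset)
    return rng.sample(items, min(k, len(items)))


def candidate_features(volunteer_factors, min_volunteers=2):
    """Keep factors named by at least `min_volunteers` distinct volunteers.

    `volunteer_factors` maps a volunteer id to the factors that volunteer
    reported. The threshold rule is an illustrative assumption, not the
    paper's analysis procedure.
    """
    counts = Counter()
    for factors in volunteer_factors.values():
        counts.update(set(factors))  # count each factor once per volunteer
    return sorted(f for f, n in counts.items() if n >= min_volunteers)


# Hypothetical answers for a bug report summarization task.
answers = {
    "v1": ["sentence position", "contains stack trace", "written by reporter"],
    "v2": ["sentence position", "written by reporter"],
    "v3": ["contains stack trace", "sentence length"],
}

features = candidate_features(answers, min_volunteers=2)
print(features)
```

Factors that survive the threshold would then be turned into computable features for a classifier; how each factor is quantified depends on the task and is left out of this sketch.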