Computer Science ›› 2019, Vol. 46 ›› Issue (12): 201-207. doi: 10.11896/jsjkx.181001856

• Software & Database Technology •

Software Feature Extraction Method Based on Overlapping Community Detection

LIU Chun, ZHANG Guo-liang   

  1. (School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475001, China)
  • Received: 2018-10-08 Online: 2019-12-15 Published: 2019-12-17

Abstract: Extracting software features from natural language product descriptions has attracted considerable attention in recent years. Because the sentences in the descriptions can describe the semantics of software features more precisely, and because one sentence may concern more than one software feature, this paper proposes a feature identification method that detects overlapping clusters of sentences in the descriptions. Based on the overlapping community detection algorithm (LMF), the proposed method defines a metric to measure the similarity between each pair of sentences, builds a sentence similarity network accordingly, and then detects the overlapping sentence communities in that network. Each sentence community is a cluster that implies one software feature and contains all the sentences that potentially describe it. To help people better understand the sentence communities, the method further designs algorithms that select, in turn, the community with the lowest entropy among all sentence communities and then select from it the most representative sentences not already selected for other communities, which serve as descriptors of the feature the community implies. Natural language product descriptions crawled from Softpedia.com serve as experimental data. Experimental results show that the proposed method performs better in both accuracy and time consumption.
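
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the paper's similarity metric, entropy definition, or selection rule, so the bag-of-words cosine similarity, the 0.3 edge threshold, the greedy local-fitness expansion with a hypothetical resolution parameter alpha = 1.2, the word-distribution entropy, and the weighted-degree choice of representative sentence are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two bag-of-words Counters (assumed metric)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_network(sentences, threshold=0.3):
    """Link sentence pairs whose similarity reaches the (hypothetical) threshold."""
    bags = [Counter(s.lower().split()) for s in sentences]
    graph = {i: {} for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        sim = cosine_similarity(bags[i], bags[j])
        if sim >= threshold:
            graph[i][j] = graph[j][i] = sim
    return graph


def fitness(graph, community, alpha):
    """Local fitness: internal weight / (internal + external weight) ** alpha."""
    k_in = k_out = 0.0
    for node in community:
        for nbr, w in graph[node].items():
            if nbr in community:
                k_in += w
            else:
                k_out += w
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def detect_overlapping_communities(graph, alpha=1.2):
    """Grow a community from each uncovered seed; a node may join several."""
    communities, covered = [], set()
    for seed in sorted(graph):
        if seed in covered:
            continue
        community = {seed}
        while True:
            frontier = {n for c in community for n in graph[c]} - community
            current = fitness(graph, community, alpha)
            best, best_gain = None, 0.0
            for cand in sorted(frontier):
                gain = fitness(graph, community | {cand}, alpha) - current
                if gain > best_gain:
                    best, best_gain = cand, gain
            if best is None:  # no neighbour improves the fitness any further
                break
            community.add(best)
        communities.append(community)
        covered |= community
    return communities


def word_entropy(sentences, community):
    """Shannon entropy of the community's word distribution (assumed definition)."""
    counts = Counter(w for i in community for w in sentences[i].lower().split())
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def describe_features(sentences, graph, communities):
    """Visit communities by ascending entropy; pick an unused, best-connected sentence."""
    used, descriptors = set(), []
    for community in sorted(communities, key=lambda c: word_entropy(sentences, c)):
        candidates = [i for i in sorted(community) if i not in used]
        if not candidates:
            continue
        rep = max(candidates,
                  key=lambda i: sum(w for n, w in graph[i].items() if n in community))
        used.add(rep)
        descriptors.append((sorted(community), sentences[rep]))
    return descriptors


# Toy product-description sentences (invented for this illustration).
SENTENCES = [
    "convert video files to mp4",
    "convert video files to avi format",
    "convert video and audio files",
    "play audio files and music",
    "play music and audio playlists",
]
graph = build_network(SENTENCES)
communities = detect_overlapping_communities(graph)
descriptors = describe_features(SENTENCES, graph, communities)
for members, sentence in descriptors:
    print(members, "->", sentence)
```

In this toy input the sentence "convert video and audio files" ends up in both a conversion community and a playback community, which is the situation the abstract motivates: a hard clustering would have to assign it to only one feature.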

Key words: Natural language, Feature extraction, Overlapping community detection

CLC Number: TP311