Computer Science ›› 2019, Vol. 46 ›› Issue (12): 201-207. doi: 10.11896/jsjkx.181001856

• Software & Database Technology •

Software Feature Extraction Method Based on Overlapping Community Detection

LIU Chun, ZHANG Guo-liang   

  1. (School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475001, China)
  • Received: 2018-10-08 Online: 2019-12-15 Published: 2019-12-17

Abstract: Extracting software features from natural-language product descriptions has attracted considerable attention in recent years. Because the sentences in a description can capture the semantics of software features more precisely, and because one sentence may concern more than one software feature, this paper proposes a feature identification method that detects overlapping clusters of the sentences in natural-language descriptions. Based on the overlapping community detection algorithm (LMF), the proposed method defines a metric for the similarity between each pair of sentences in the descriptions, builds a sentence similarity network accordingly, and then detects the overlapping sentence communities in that network. Each sentence community is a cluster that implies one software feature and contains all the sentences potentially describing that feature. Further, to help people better understand the sentence communities, the method provides algorithms that repeatedly select the community with the lowest entropy from among all sentence communities, and then select from each chosen community the most representative sentences that have not already been selected for other communities, as descriptors of the feature the community contains. Natural-language product descriptions crawled from Softpedia.com serve as the experimental data. Experimental results show that the proposed method performs better in accuracy and time consumption.
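
The pipeline summarized in the abstract (sentence similarity, similarity network, overlapping communities) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: bag-of-words cosine similarity stands in for the paper's similarity metric, the `threshold` of 0.3 is a hypothetical cut-off, and `grow` is a simplified local-fitness expansion in the spirit of the LFM algorithm of Lancichinetti et al. (possibly what the abstract calls "LMF"), without that algorithm's node-removal step. The entropy-based community and sentence selection is not modeled.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(sentence):
    """Lowercased word counts (assumed stand-in for the paper's features)."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(sentences, threshold=0.3):
    """Link two sentences when their similarity reaches the assumed threshold."""
    vecs = [tokens(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if cosine(vecs[i], vecs[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def fitness(community, adj, alpha=1.0):
    """LFM-style fitness k_in / (k_in + k_out) ** alpha.

    k_in sums internal degrees, so each internal edge is counted twice.
    """
    k_in = sum(1 for u in community for v in adj[u] if v in community)
    k_out = sum(1 for u in community for v in adj[u] if v not in community)
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def grow(seed, adj, alpha=1.0):
    """Greedily add the neighbor with the largest positive fitness gain."""
    community = {seed}
    while True:
        frontier = {v for u in community for v in adj[u]} - community
        base = fitness(community, adj, alpha)
        best, best_gain = None, 0.0
        for v in sorted(frontier):
            gain = fitness(community | {v}, adj, alpha) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:
            return community
        community.add(best)


def overlapping_communities(adj, alpha=1.0):
    """Grow one community from every node not yet covered by an earlier one.

    A grown community may absorb nodes that already belong to other
    communities, which is where the overlap comes from.
    """
    covered, result = set(), []
    for seed in sorted(adj):
        if seed in covered:
            continue
        community = grow(seed, adj, alpha)
        covered |= community
        if community not in result:
            result.append(community)
    return result
```

Calling `overlapping_communities(build_network(sentences))` on a list of description sentences returns a list of index sets. A sentence whose index appears in more than one set is a candidate for describing more than one feature, which is the situation the abstract's overlapping clustering is meant to handle.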

Key words: Feature extraction, Natural language, Overlapping community detection

CLC Number: 

  • TP311