Computer Science ›› 2019, Vol. 46 ›› Issue (12): 201-207. doi: 10.11896/jsjkx.181001856

• Software & Database Technology •

Software Feature Extraction Method Based on Overlapping Community Detection

LIU Chun, ZHANG Guo-liang   

  1. (School of Computer and Information Engineering, Henan University, Kaifeng, Henan 475001, China)
  • Received: 2018-10-08 Online: 2019-12-15 Published: 2019-12-17

Abstract: Extracting software features from natural-language product descriptions has attracted considerable attention in recent years. Because the sentences in such descriptions can describe the semantics of software features more precisely, and because a single sentence may concern more than one software feature, this paper proposes a feature identification method that detects overlapping clusters of sentences in the descriptions. Based on the overlapping community detection algorithm LMF, the proposed method defines a metric for the similarity between each pair of sentences, builds a sentence similarity network accordingly, and then detects the overlapping sentence communities in that network. Each sentence community is a cluster that implies one software feature and contains all the sentences potentially describing that feature. Further, to help readers understand the sentence communities, the method provides algorithms that repeatedly select the community with the lowest entropy among all sentence communities and then select, from each selected community, the most representative sentences not already selected for other communities as descriptors of the feature the community contains. Natural-language product descriptions crawled from Soft serve as the experimental data. Experimental results show that the proposed method performs better in terms of accuracy and time consumption.
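
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity metric, the details of LMF, or the definition of "most representative", so the sketch assumes token-set Jaccard similarity, uses maximal cliques (Bron-Kerbosch) as a stand-in for overlapping community detection, measures entropy over each community's word distribution, and takes the sentence most similar to the rest of its community as the descriptor. All function names, the threshold, and the sample sentences are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tokens(sentence):
    """Lowercased word set of a sentence (a deliberately crude tokenizer)."""
    return {w.strip(".,;:!?()").lower() for w in sentence.split()} - {""}


def jaccard(a, b):
    """Assumed similarity metric: Jaccard overlap of the two token sets."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def build_network(sentences, threshold):
    """Adjacency sets linking sentence indices with similarity >= threshold."""
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if jaccard(sentences[i], sentences[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Stand-in for LMF: Bron-Kerbosch cliques, which may share sentences."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) > 1:
                found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def entropy(community, sentences):
    """Shannon entropy (bits) of the word distribution inside a community."""
    counts = Counter(w for i in community for w in tokens(sentences[i]))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def pick_descriptors(communities, sentences):
    """Visit communities by ascending entropy; pick one unused sentence each."""
    used, result = set(), []
    for com in sorted(communities, key=lambda c: entropy(c, sentences)):
        candidates = [i for i in com if i not in used]
        if not candidates:
            continue  # every sentence already describes another community
        best = max(candidates, key=lambda i: sum(
            jaccard(sentences[i], sentences[j]) for j in com if j != i))
        used.add(best)
        result.append((sorted(com), sentences[best]))
    return result


# Hypothetical product-description sentences.
sentences = [
    "Convert video files to MP4 format",
    "Convert audio files to MP3 format",
    "Convert video files and burn them to DVD",
    "Burn video files to DVD discs",
]
communities = maximal_cliques(build_network(sentences, threshold=0.35))
descriptors = pick_descriptors(communities, sentences)
for members, descriptor in descriptors:
    print(members, "->", descriptor)
```

In this toy input the third sentence mentions both converting and burning, so it falls into two communities at once, which is the situation that motivates overlapping rather than disjoint clustering.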

Key words: Natural language, Feature extraction, Overlapping community detection

CLC Number: 

  • TP311
