Computer Science ›› 2020, Vol. 47 ›› Issue (3): 48-53. doi: 10.11896/jsjkx.190700146

• Intelligent Software Engineering •

Software Requirements Clustering Algorithm Based on Self-attention Mechanism and Multi-channel Pyramid Convolution

KANG Yan, CUI Guo-rong, LI Hao, YANG Qi-yue, LI Jin-yuan, WANG Pei-yao

  1. (College of Software, Yunnan University, Kunming 650091, China)
  • Received: 2019-07-22 Online: 2020-03-15 Published: 2020-03-30
  • About author: KANG Yan, born in 1972, master's supervisor, is a member of the China Computer Federation (CCF). Her main research interests include machine learning, among others. CUI Guo-rong, born in 1995, master. His main research interests include natural language processing, among others.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61762092, 61762089) and Yunnan Provincial Key Laboratory of Software Engineering Open Fund Project (2017SE204).

Abstract: With the rapid growth in the number and variety of software products, mining the textual characteristics of software requirements and clustering them has become a major challenge in software engineering. Clustering requirements texts helps safeguard the software development process while reducing the potential risks and negative impacts of the requirements analysis phase. However, software requirements text is highly dispersed, noisy, and sparse. Existing clustering work is limited to a single type of text and rarely considers the functional semantics of software requirements. In view of the characteristics of requirements text and the limitations of traditional clustering methods, this paper proposes a software requirements clustering algorithm (SA-MPCN&SOM) that combines the self-attention mechanism with multi-channel pyramid convolution. The method first captures global features through self-attention and then extracts requirements text features in depth from different windows using multi-channel pyramid convolution, so that the text fragments perceived by the network are multiplied. Finally, the resulting multi-channel text features are clustered using a self-organizing map (SOM). Experimental results on software requirements data show that the proposed method mines and clusters requirements features more effectively and outperforms other feature extraction methods and clustering algorithms.
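The three stages named in the abstract (self-attention, multi-channel pyramid convolution, SOM clustering) can be illustrated with a small NumPy sketch. This is not the authors' implementation, and the abstract does not give the architecture details. Everything concrete here is an assumption made for illustration: the attention uses identity projections instead of learned weights, the convolution kernels are random and untrained, "pyramid" is read as stacking convolutions so that deeper layers cover wider text fragments, the two channels use window sizes 2 and 3 with depth 2, the SOM grid is 2x2, and random vectors stand in for embedded requirements texts. All function names are hypothetical.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention with identity projections (assumption)."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                  # (n, n) token-to-token scores
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                             # (n, d) globally mixed tokens

def conv1d(x, kernel):
    """Valid 1-D convolution over tokens with ReLU. x: (n, d_in), kernel: (w, d_in, d_out)."""
    w = kernel.shape[0]
    rows = [np.tensordot(x[i:i + w], kernel, axes=([0, 1], [0, 1]))
            for i in range(x.shape[0] - w + 1)]
    return np.maximum(np.stack(rows), 0.0)

def pyramid_channel(x, kernels):
    """One channel: stacked convolutions with a fixed window, then global max pooling."""
    for k in kernels:
        x = conv1d(x, k)   # each layer widens the text span a feature perceives
    return x.max(axis=0)

def extract_features(x, channels):
    """Self-attention followed by several pyramid channels, concatenated."""
    attended = self_attention(x)
    return np.concatenate([pyramid_channel(attended, ks) for ks in channels])

def train_som(data, grid=(2, 2), epochs=50, lr=0.5, sigma=1.0, seed=0):
    """Minimal self-organizing map with linearly decaying rate and neighbourhood."""
    rng = np.random.default_rng(seed)
    coords = np.array([(i, j) for i in range(grid[0]) for j in range(grid[1])], dtype=float)
    weights = data[rng.integers(0, len(data), size=len(coords))].astype(float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs
        for v in data[rng.permutation(len(data))]:
            bmu = np.argmin(((weights - v) ** 2).sum(axis=1))   # best matching unit
            dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)   # grid distance to BMU
            h = np.exp(-dist2 / (2 * (sigma * decay + 1e-3) ** 2))
            weights += (lr * decay) * h[:, None] * (v - weights)
    return weights

def assign_clusters(data, weights):
    """Label each feature vector with the index of its nearest SOM unit."""
    return np.argmin(((data[:, None, :] - weights[None]) ** 2).sum(axis=2), axis=1)

# Toy run: 6 "documents" of 12 tokens with 8-dimensional embeddings (all synthetic).
rng = np.random.default_rng(0)
d, d_out, depth = 8, 4, 2
channels = [[rng.normal(size=(w, d if layer == 0 else d_out, d_out))
             for layer in range(depth)] for w in (2, 3)]
docs = [rng.normal(size=(12, d)) for _ in range(6)]
features = np.stack([extract_features(doc, channels) for doc in docs])
som = train_som(features)
labels = assign_clusters(features, som)
```

In this sketch each document becomes an 8-dimensional feature vector (two channels of four features each), and `labels` holds the index of the SOM unit each document falls into. With real data, the random vectors would be replaced by word embeddings of the requirements text and the kernels would be trained rather than sampled.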

Key words: Demand analysis, Text clustering, Self-attention, Pyramid convolution, Text feature

CLC Number: 

  • TP309