Computer Science, 2022, Vol. 49, Issue (6A): 150-158. doi: 10.11896/jsjkx.210500065
KANG Yan, WU Zhi-wei, KOU Yong-qi, ZHANG Lan, XIE Si-yu, LI Hao
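Abstract: As the number and variety of software products grow rapidly, effectively mining the textual features of software requirements and classifying the textual features of functional requirements has become a major challenge in software engineering. Classifying functional requirements provides a reliable safeguard for the whole software development process and reduces potential risks and negative effects in the requirements analysis phase. However, software requirement texts are highly dispersed, noisy, and sparse, which limits the effectiveness of requirements analysis. This paper proposes a two-layer vocabulary graph convolutional network model that models software requirement texts as a graph and builds a graph neural network over the requirements, capturing the knowledge edges between words as well as the relationships between words and texts. It also proposes a deep ensemble learning model that integrates multiple deep learning classifiers to classify software requirement texts. In experiments on the Windows_a and Windows_b datasets, the deep ensemble learning model that fuses Bert and graph convolution reaches accuracies of 96.73% and 95.60%, respectively, clearly outperforming the other text classification models. The results show that the model can effectively identify the functional characteristics of software requirement texts and improve the accuracy of software requirement text classification.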
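The ensemble step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the base classifiers are combined, so soft voting (averaging class probabilities) is an assumption here, not the authors' confirmed method. The function name `soft_vote`, the three base-model names, the two-class labelling, and all probability values are hypothetical.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models and pick the argmax per sample.

    prob_sets[m][i][c] is model m's probability that sample i belongs to class c.
    Soft voting is an assumed combination rule, chosen only for illustration.
    """
    if not prob_sets:
        raise ValueError("need at least one model")
    n_models = len(prob_sets)
    n_samples = len(prob_sets[0])
    predictions = []
    for i in range(n_samples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all base models.
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        # Index of the class with the highest averaged probability.
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


# Made-up outputs of three hypothetical base classifiers on two requirement
# texts, with two hypothetical classes (0 and 1).
bert_probs = [[0.2, 0.8], [0.6, 0.4]]
gcn_probs = [[0.4, 0.6], [0.7, 0.3]]
cnn_probs = [[0.6, 0.4], [0.3, 0.7]]

labels = soft_vote([bert_probs, gcn_probs, cnn_probs])
print(labels)
```

Averaging probabilities lets a confident model outweigh two uncertain ones, which hard majority voting cannot do; either rule would fit the abstract's description of integrating several classifiers.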