Computer Science, 2024, Vol. 51, Issue (10): 178-186. doi: 10.11896/jsjkx.230800191
刘耀1, 秦迅2, 刘天吉2
LIU Yao1, QIN Xun2, LIU Tianji2
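Abstract: During project development, each new requirement forces natural language processing tools and resource-parsing plugins through another round of requirements analysis and repeated development. To address this, a business-oriented scheme for on-demand resource parsing is proposed. First, a requirement-to-code method for on-demand resource parsing is presented, in which a requirement concept indexing model is built on the requirement text itself. The model's precision, recall, and F1 score are all higher than those of the other classification models compared. Then, to link requirement text with code, a mapping mechanism from requirement text to code-library categories is established. The mapping results are evaluated with top-K precision (precision@K); the final precision reaches 60%, which gives the approach some practical value. In summary, the work explores a set of key techniques for on-demand resource parsing that can parse requirements and associate them with code. It covers the whole pipeline, from requirement text classification and requirement code-library classification through code-library retrieval to plugin generation, and so forms a complete "requirement-code-plugin-parsing" business loop. Experiments verify the effectiveness of the proposed method for on-demand resource parsing and offer ideas for business requirements analysis and software reuse. Compared with existing large language models used for business-requirement parsing and code generation, the proposed method focuses on implementing the full workflow of reusing plugin code that carries business-specific characteristics within a concrete business domain.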
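The abstract names precision@K as the metric for the requirement-to-code mapping but does not define it. The sketch below assumes the common information-retrieval definition: the fraction of the top K predicted code-library categories that are actually relevant to the requirement. The function names and the category labels are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked items that appear in the relevant set.

    Assumed definition: hits within the first k predictions, divided by k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def mean_precision_at_k(
    rankings: Sequence[Sequence[str]],
    relevants: Sequence[Set[str]],
    k: int,
) -> float:
    """Average precision@k over several requirement queries."""
    if len(rankings) != len(relevants) or not rankings:
        raise ValueError("need the same non-zero number of rankings and relevant sets")
    scores = [precision_at_k(r, rel, k) for r, rel in zip(rankings, relevants)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical ranked code-library categories for one requirement text.
    ranked = ["pdf_parser", "xml_parser", "tokenizer", "ner_tagger", "csv_parser"]
    # Hypothetical ground-truth categories for the same requirement.
    relevant = {"pdf_parser", "tokenizer", "csv_parser"}
    print(precision_at_k(ranked, relevant, 5))  # 3 hits in the top 5
```

If the paper instead uses a hit-rate style definition (a query counts as correct when any relevant category appears in the top K), the per-query score would become 0 or 1 rather than a fraction; the abstract alone does not settle which one is meant.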