Computer Science (计算机科学) ›› 2024, Vol. 51 ›› Issue (1): 50-59. doi: 10.11896/jsjkx.230600051
GE Huibin (葛慧斌)1, WANG Dexin (王德鑫)1, ZHENG Tao (郑涛)2, ZHANG Ting (张婷)3, XIONG Deyi (熊德意)1
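Abstract: Deep learning platforms play an important role in the development of the new generation of artificial intelligence. In recent years, domestic AI software and hardware systems, represented by the Ascend platform, have developed rapidly and opened a new path for domestic deep learning platforms. At the same time, to find and fix potential defects in the Ascend system, the Ascend platform has been actively migrating commonly used deep learning models. Taking natural language processing (NLP) algorithms as the entry point, this paper addresses four major NLP tasks: machine reading comprehension, neural machine translation, sequence labeling, and text classification. Building on the high-performance hardware chips of the Ascend platform, it studies the migration of four typical NLP models: ALBERT, RNNSearch, BERT-CRF, and TextING. Based on this migration work, the paper identifies and summarizes the main shortcomings of the Ascend platform's architecture design for NLP research and business applications, which fall into four areas: the dynamic space allocation characteristics of computational graph nodes, the sinking of resource operators to the device side, graph-kernel fusion, and mixed-precision training. It proposes a solution for each of these problems and validates the solutions experimentally. Finally, it suggests future optimization directions and recommendations for the development of domestic deep learning platforms.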