Computer Science ›› 2022, Vol. 49 ›› Issue (7): 148-163. doi: 10.11896/jsjkx.211200018
• Artificial Intelligence •
HOU Yu-tao, ABULIZI Abudukelimu, ABUDUKELIMU Halidanmu