Computer Science ›› 2023, Vol. 50 ›› Issue (6): 251-260. doi: 10.11896/jsjkx.220500100
• Artificial Intelligence •
GUO Wei, HUANG Jiahui, HOU Chenyu, CAO Bin