Computer Science ›› 2020, Vol. 47 ›› Issue (3): 162-173. doi: 10.11896/jsjkx.191000167
• Artificial Intelligence •
LI Zhou-jun, FAN Yu, WU Xian-jie
[1] HE K,ZHANG X,REN S,et al.Deep residual learning for image recognition[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2016:770-778.
[2] MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space[J].arXiv:1301.3781.
[3] MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality[C]∥Advances in Neural Information Processing Systems.2013:3111-3119.
[4] ABADI M,BARHAM P,CHEN J,et al.TensorFlow:a system for large-scale machine learning[J].arXiv:1605.08695.
[5] LE Q,MIKOLOV T.Distributed representations of sentences and documents[C]∥International Conference on Machine Learning.2014:1188-1196.
[6] DENG L,YU D.Deep learning:methods and applications[J].Foundations and Trends in Signal Processing,2014,7(3/4):197-387.
[7] PETERS M E,NEUMANN M,IYYER M,et al.Deep contextualized word representations[J].arXiv:1802.05365.
[8] RADFORD A,NARASIMHAN K,SALIMANS T,et al.Improving language understanding by generative pre-training[J/OL].https://s3-us-west-2.amazonaws.com/openai-assets/researchcovers/languageunsupervised/language_understanding_paper.pdf,2018.
[9] DEVLIN J,CHANG M W,LEE K,et al.BERT:pre-training of deep bidirectional transformers for language understanding[J].arXiv:1810.04805.
[10] YANG Z,DAI Z,YANG Y,et al.XLNet:generalized autoregressive pretraining for language understanding[J].arXiv:1906.08237.
[11] YOSINSKI J,CLUNE J,BENGIO Y,et al.How transferable are features in deep neural networks?[C]∥Advances in Neural Information Processing Systems.2014:3320-3328.
[12] OQUAB M,BOTTOU L,LAPTEV I,et al.Learning and transferring mid-level image representations using convolutional neural networks[C]∥Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2014:1717-1724.
[13] GLOROT X,BORDES A,BENGIO Y.Domain adaptation for large-scale sentiment classification:a deep learning approach[C]∥Proceedings of the 28th International Conference on Machine Learning (ICML-11).2011:513-520.
[14] CHEN M,XU Z,WEINBERGER K,et al.Marginalized denoising autoencoders for domain adaptation[J].arXiv:1206.4683.
[15] GANIN Y,USTINOVA E,AJAKAN H,et al.Domain-adversarial training of neural networks[J].The Journal of Machine Learning Research,2016,17(1):2096-2030.
[16] SZEGEDY C,IOFFE S,VANHOUCKE V,et al.Inception-v4,Inception-ResNet and the impact of residual connections on learning[C]∥AAAI.2017:12.
[17] WU Z,SHEN C,HENGEL A V D.Wider or deeper:revisiting the ResNet model for visual recognition[J].arXiv:1611.10080.
[18] SINGH S,HOIEM D,FORSYTH D.Swapout:learning an ensemble of deep architectures[C]∥Advances in Neural Information Processing Systems.2016:28-36.
[19] REN S,HE K,GIRSHICK R,et al.Faster R-CNN:towards real-time object detection with region proposal networks[C]∥Advances in Neural Information Processing Systems.2015:91-99.
[20] HUANG G,LIU Z,VAN DER MAATEN L,et al.Densely connected convolutional networks[J].arXiv:1608.06993.
[21] HE K,ZHANG X,REN S,et al.Identity mappings in deep residual networks[C]∥European Conference on Computer Vision.Cham:Springer,2016:630-645.
[22] LEDIG C,THEIS L,HUSZÁR F,et al.Photo-realistic single image super-resolution using a generative adversarial network[J].arXiv:1609.04802.
[23] PETERS M,AMMAR W,BHAGAVATULA C,et al.Semi-supervised sequence tagging with bidirectional language models[C]∥Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.2017:1756-1765.
[24] KIROS R,ZHU Y,SALAKHUTDINOV R R,et al.Skip-thought vectors[C]∥Advances in Neural Information Processing Systems.2015:3294-3302.
[25] VINCENT P,LAROCHELLE H,BENGIO Y,et al.Extracting and composing robust features with denoising autoencoders[C]∥Proceedings of the 25th International Conference on Machine Learning.ACM,2008:1096-1103.
[26] BENGIO Y,DUCHARME R,VINCENT P,et al.A neural probabilistic language model[J].Journal of Machine Learning Research,2003,3(6):1137-1155.
[27] PENNINGTON J,SOCHER R,MANNING C.GloVe:global vectors for word representation[C]∥Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).2014:1532-1543.
[28] JOULIN A,GRAVE E,BOJANOWSKI P,et al.Bag of tricks for efficient text classification[J].arXiv:1607.01759.
[29] CHEN D,MANNING C.A fast and accurate dependency parser using neural networks[C]∥Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).2014:740-750.
[30] BORDES A,USUNIER N,GARCIA-DURAN A,et al.Translating embeddings for modeling multi-relational data[C]∥Advances in Neural Information Processing Systems.2013:2787-2795.
[31] TAI K S,SOCHER R,MANNING C D.Improved semantic representations from tree-structured long short-term memory networks[J].arXiv:1503.00075.
[32] GROVER A,LESKOVEC J.node2vec:scalable feature learning for networks[C]∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2016:855-864.
[33] TANG J,QU M,WANG M,et al.LINE:large-scale information network embedding[C]∥Proceedings of the 24th International Conference on World Wide Web.International World Wide Web Conferences Steering Committee,2015:1067-1077.
[34] NICKEL M,KIELA D.Poincaré embeddings for learning hierarchical representations[C]∥Advances in Neural Information Processing Systems.2017:6338-6347.
[35] KAHNG M,ANDREWS P Y,KALRO A,et al.ActiVis:visual exploration of industry-scale deep neural network models[J].IEEE Transactions on Visualization and Computer Graphics,2018,24(1):88-97.
[36] YANG X,MACDONALD C,OUNIS I.Using word embeddings in Twitter election classification[J].Information Retrieval Journal,2018,21(2/3):183-207.
[37] MNIH A,HINTON G.Three new graphical models for statistical language modelling[C]∥Proceedings of the 24th International Conference on Machine Learning.ACM,2007:641-648.
[38] MNIH A,HINTON G E.A scalable hierarchical distributed language model[C]∥Advances in Neural Information Processing Systems.2009:1081-1088.
[39] COLLOBERT R,WESTON J,BOTTOU L,et al.Natural language processing (almost) from scratch[J].Journal of Machine Learning Research,2011,12(1):2493-2537.
[40] MIKOLOV T,KARAFIÁT M,BURGET L,et al.Recurrent neural network based language model[C]∥Eleventh Annual Conference of the International Speech Communication Association.2010.
[41] GUTMANN M U,HYVÄRINEN A.Noise-contrastive estimation of unnormalized statistical models,with applications to natural image statistics[J].Journal of Machine Learning Research,2012,13:307-361.
[42] DEERWESTER S,DUMAIS S T,FURNAS G W,et al.Indexing by latent semantic analysis[J].Journal of the American Society for Information Science,1990,41(6):391-407.
[43] GOLUB G H,REINSCH C.Singular value decomposition and least squares solutions[M]∥Linear Algebra.Berlin:Springer,1971:134-151.
[44] HARRIS Z S.Distributional structure[J].Word,1954,10(2/3):146-162.
[45] JOZEFOWICZ R,VINYALS O,SCHUSTER M,et al.Exploring the limits of language modeling[J].arXiv:1602.02410.
[46] HOWARD J,RUDER S.Universal language model fine-tuning for text classification[J].arXiv:1801.06146.
[47] VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[C]∥Advances in Neural Information Processing Systems.2017:5998-6008.
[48] LIU P J,SALEH M,POT E,et al.Generating Wikipedia by summarizing long sequences[J].arXiv:1801.10198.
[49] SONG K,TAN X,QIN T,et al.MASS:masked sequence to sequence pre-training for language generation[J].arXiv:1905.02450.
[50] DONG L,YANG N,WANG W,et al.Unified language model pre-training for natural language understanding and generation[J].arXiv:1905.03197.
[51] SUN Y,WANG S,LI Y,et al.ERNIE:enhanced representation through knowledge integration[J].arXiv:1904.09223.
[52] ZHANG Z,HAN X,LIU Z,et al.ERNIE:enhanced language representation with informative entities[J].arXiv:1905.07129.
[53] LIU X,HE P,CHEN W,et al.Multi-task deep neural networks for natural language understanding[J].arXiv:1901.11504.
[54] SUN Y,WANG S,LI Y,et al.ERNIE 2.0:a continual pre-training framework for language understanding[J].arXiv:1907.12412.
[55] HINTON G,VINYALS O,DEAN J.Distilling the knowledge in a neural network[J].arXiv:1503.02531.
[56] CUI Y,CHE W,LIU T,et al.Pre-training with whole word masking for Chinese BERT[J].arXiv:1906.08101.
[57] JOSHI M,CHEN D,LIU Y,et al.SpanBERT:improving pre-training by representing and predicting spans[J].arXiv:1907.10529.
[58] LIU Y,OTT M,GOYAL N,et al.RoBERTa:a robustly optimized BERT pretraining approach[J].arXiv:1907.11692.
[59] RADFORD A,WU J,CHILD R,et al.Language models are unsupervised multitask learners[J].OpenAI Blog,2019,1(8).
[60] DAI Z,YANG Z,YANG Y,et al.Transformer-XL:attentive language models beyond a fixed-length context[J].arXiv:1901.02860.
[61] NIVEN T,KAO H Y.Probing neural network comprehension of natural language arguments[J].arXiv:1907.07355.
[62] MCCOY R T,PAVLICK E,LINZEN T.Right for the wrong reasons:diagnosing syntactic heuristics in natural language inference[J].arXiv:1902.01007.
[63] WOLF T,DEBUT L,SANH V,et al.Transformers:state-of-the-art natural language processing[J].arXiv:1910.03771.
[64] Bright.GitHub repository[OL].https://github.com/brightmart/albert_zh.
[65] LAN Z,CHEN M,GOODMAN S,et al.ALBERT:a lite BERT for self-supervised learning of language representations[J].arXiv:1909.11942.
Cited By:
[1] YAN Jia-dan, JIA Cai-yan. Text Classification Method Based on Information Fusion of Dual-graph Neural Network [J]. Computer Science, 2022, 49(8): 230-236.
[2] HOU Yu-tao, ABULIZI Abudukelimu, ABUDUKELIMU Halidanmu. Advances in Chinese Pre-training Models [J]. Computer Science, 2022, 49(7): 148-163.
[3] LI Xiao-wei, SHU Hui, GUANG Yan, ZHAI Yi, YANG Zi-ji. Survey of the Application of Natural Language Processing for Resume Analysis [J]. Computer Science, 2022, 49(6A): 66-73.
[4] ZHAO Dan-dan, HUANG De-gen, MENG Jia-na, DONG Yu, ZHANG Pan. Chinese Entity Relations Classification Based on BERT-GRU-ATT [J]. Computer Science, 2022, 49(6): 319-325.
[5] HAN Hong-qi, RAN Ya-xin, ZHANG Yun-liang, GUI Jie, GAO Xiong, YI Meng-lin. Study on Cross-media Information Retrieval Based on Common Subspace Classification Learning [J]. Computer Science, 2022, 49(5): 33-42.
[6] LIU Shuo, WANG Geng-run, PENG Jian-hua, LI Ke. Chinese Short Text Classification Algorithm Based on Hybrid Features of Characters and Words [J]. Computer Science, 2022, 49(4): 282-287.
[7] LI Yu-qiang, ZHANG Wei-jiang, HUANG Yu, LI Lin, LIU Ai-hua. Improved Topic Sentiment Model with Word Embedding Based on Gaussian Distribution [J]. Computer Science, 2022, 49(2): 256-264.
[8] ZHANG Hu, BAI Ping. Graph Convolutional Networks with Long-distance Words Dependency in Sentences for Short Text Classification [J]. Computer Science, 2022, 49(2): 279-284.
[9] LIU Kai, ZHANG Hong-jun, CHEN Fei-qiong. Name Entity Recognition for Military Based on Domain Adaptive Embedding [J]. Computer Science, 2022, 49(1): 292-297.
[10] HOU Hong-xu, SUN Shuo, WU Nier. Survey of Mongolian-Chinese Neural Machine Translation [J]. Computer Science, 2022, 49(1): 31-40.
[11] LI Zhao-qi, LI Ta. Query-by-Example with Acoustic Word Embeddings Using wav2vec Pretraining [J]. Computer Science, 2022, 49(1): 59-64.
[12] LIU Chuang, XIONG De-yi. Survey of Multilingual Question Answering [J]. Computer Science, 2022, 49(1): 65-72.
[13] CHEN Zhi-yi, SUI Jie. DeepFM and Convolutional Neural Networks Ensembles for Multimodal Rumor Detection [J]. Computer Science, 2022, 49(1): 101-107.
[14] WANG Li-mei, ZHU Xu-guang, WANG De-jia, ZHANG Yong, XING Chun-xiao. Study on Judicial Data Classification Method Based on Natural Language Processing Technologies [J]. Computer Science, 2021, 48(8): 80-85.
[15] PAN Fang, ZHANG Hui-bing, DONG Jun-chao, SHOU Zhao-yu. Aspect Sentiment Analysis of Chinese Online Course Review Based on Efficient Transformer [J]. Computer Science, 2021, 48(6A): 264-269.