Computer Science, 2023, Vol. 50, Issue (2): 374-383. doi: 10.11896/jsjkx.220300147
• Interdiscipline & Frontier •
LIANG Jiali, HUA Baojian, SU Shaobo