Computer Science (计算机科学) ›› 2015, Vol. 42 ›› Issue (11): 32-36. doi: 10.11896/j.issn.1002-137X.2015.11.005
安小景,张云泉,贾海鹏
AN Xiao-jing, ZHANG Yun-quan and JIA Hai-peng
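Abstract: As the computing power and programmability of GPUs continue to improve, using the GPU as a general-purpose accelerator has become a primary way to speed up applications. Histogram generation is a common computer vision algorithm, widely used in image processing, pattern recognition, and image search. As image-processing workloads grow and real-time requirements tighten, the demand for GPU-accelerated histogram generation keeps increasing. Building on key optimization methods and techniques for GPU computing platforms, this work implements and optimizes histogram generation on the GPU. Experimental results show that, by applying histogram copies, memory access optimization, data localization, and reduction optimization, the algorithm on an AMD HD7850 GPU runs 1.8 to 13.3 times faster than the unoptimized version. Compared with the CPU version, it runs 7.2 to 210.8 times faster across different data sizes.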
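The abstract names histogram copies and reduction optimization without showing their structure, so the following is a minimal CPU-side sketch of the general idea, not the authors' GPU kernel. It assumes 8-bit pixel values (256 bins) and models each work-group as a strided slice of the input. The names `private_histograms`, `tree_reduce`, and `histogram` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence

NUM_BINS = 256  # assumption: 8-bit grayscale pixel values in 0..255


def private_histograms(pixels: Sequence[int], num_groups: int) -> List[List[int]]:
    """Give each (hypothetical) work-group its own histogram copy.

    On a GPU the copies avoid contention on shared bins; here the groups
    simply run one after another. num_groups must be at least 1.
    """
    copies = [[0] * NUM_BINS for _ in range(num_groups)]
    for group in range(num_groups):
        # Strided partition: group g handles pixels g, g + num_groups, ...
        for value in pixels[group::num_groups]:
            copies[group][value] += 1
    return copies


def tree_reduce(copies: List[List[int]]) -> List[int]:
    """Merge the private copies pairwise, roughly halving the count each round."""
    copies = [list(c) for c in copies]  # work on a copy of the input
    while len(copies) > 1:
        half = (len(copies) + 1) // 2
        for i in range(len(copies) - half):
            left, right = copies[i], copies[i + half]
            for b in range(NUM_BINS):
                left[b] += right[b]
        copies = copies[:half]
    return copies[0]


def histogram(pixels: Sequence[int], num_groups: int = 4) -> List[int]:
    """Build private copies, then reduce them into one final histogram."""
    return tree_reduce(private_histograms(pixels, num_groups))
```

The result does not depend on `num_groups`; on real hardware the number of copies trades memory footprint against update contention, which is the kind of tuning the abstract's speedup range suggests.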