Computer Science, 2020, Vol. 47, Issue (8): 105-111. doi: 10.11896/jsjkx.190700036

Topic: High Performance Computing


Real-time SIFT Algorithm Based on GPU

WANG Liang, ZHOU Xin-zhi, YAN Hua

  1. School of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
  • Online: 2020-08-15 Published: 2020-08-10
  • Corresponding author: YAN Hua (yanhua@scu.edu.cn)
  • About author: WANG Liang, born in 1993, postgraduate. His main research interests include computer vision and parallel computing.
    YAN Hua, born in 1971, Ph.D., professor. His main research interests include intelligent algorithms, storage systems and path planning.
  • Author email: lucienwl@163.com
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61403265).

Abstract: To address the complexity and poor real-time performance of the scale-invariant feature transform (SIFT) feature extraction algorithm, this paper proposes a GPU-based real-time SIFT algorithm called CoSift (CUDA Optimized SIFT). The algorithm first builds the SIFT scale space concurrently using CUDA streams. In this process, it makes full use of the high-speed memories in the CUDA memory model to speed up data access, and reduces the dimensionality of the two-dimensional Gaussian convolution kernel to cut the amount of computation. It then applies a warp-based histogram strategy that rebalances the workload of the feature description stage. Comparative experiments against a commonly used CPU implementation and improved GPU implementations show that CoSift greatly improves real-time performance without reducing feature extraction accuracy, and that the gain is relatively larger on large images. On a single GTX 1080Ti GPU, CoSift extracts keypoints within 7.7~8.6 ms (116.28~129.87 fps). CoSift effectively reduces the complexity of the traditional SIFT pipeline, improves real-time performance, and is well suited to scenarios with strict real-time requirements for SIFT.

Key words: CUDA, Feature extraction, Parallel acceleration, Real-time, Scale-invariant feature transform
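
The abstract mentions optimizing the two-dimensional Gaussian convolution kernel to reduce computation but does not say how. One common way to do this, assumed here and not confirmed by the abstract, is to exploit the separability of the Gaussian: a k×k kernel is the outer product of two length-k kernels, so one 2D pass (k² multiplies per pixel) can be replaced by a row pass and a column pass (2k multiplies per pixel). The sketch below is plain CPU Python rather than CUDA; the function names, the clamped border handling, and the values sigma = 1.6 and radius = 3 are illustrative choices, not taken from the paper.

```python
import math


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def clamp(v, lo, hi):
    """Clamp an index into [lo, hi] (replicates border pixels)."""
    return max(lo, min(hi, v))


def convolve_2d(image, kernel_1d):
    """Direct 2D pass with the outer-product kernel: k*k multiplies per pixel."""
    h, w = len(image), len(image[0])
    r = len(kernel_1d) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = clamp(y + dy, 0, h - 1)
                    xx = clamp(x + dx, 0, w - 1)
                    acc += kernel_1d[dy + r] * kernel_1d[dx + r] * image[yy][xx]
            out[y][x] = acc
    return out


def convolve_separable(image, kernel_1d):
    """Row pass followed by column pass: 2*k multiplies per pixel."""
    h, w = len(image), len(image[0])
    r = len(kernel_1d) // 2
    # Horizontal pass: blur each row independently.
    rows = [[sum(kernel_1d[dx + r] * image[y][clamp(x + dx, 0, w - 1)]
                 for dx in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]
    # Vertical pass: blur each column of the intermediate result.
    return [[sum(kernel_1d[dy + r] * rows[clamp(y + dy, 0, h - 1)][x]
                 for dy in range(-r, r + 1))
             for x in range(w)]
            for y in range(h)]


if __name__ == "__main__":
    # Small synthetic image; values are arbitrary.
    img = [[float((3 * x + 5 * y) % 7) for x in range(6)] for y in range(5)]
    k = gaussian_kernel_1d(sigma=1.6, radius=3)
    full = convolve_2d(img, k)
    sep = convolve_separable(img, k)
    max_diff = max(abs(full[y][x] - sep[y][x])
                   for y in range(5) for x in range(6))
    print("max abs difference:", max_diff)
    print("multiplies per pixel:", len(k) ** 2, "vs", 2 * len(k))
```

On a GPU the same two-pass structure would typically map to two kernels, which is where staging rows or columns in fast on-chip memory becomes relevant; that mapping is not shown here.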

CLC Number: 

  • TP391.4