Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 241100059-7. doi: 10.11896/jsjkx.241100059

• Artificial Intelligence •

An Efficient Post-training Balanced Quantization Strategy for BEVFormer

ZHANG Xiaoxuan, TANG Xiaoyong

  1. School of Computer and Communications Engineering, Changsha University of Science & Technology, Changsha 410114, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: TANG Xiaoyong (tangxy@csust.edu.cn)
  • About author: 1098206574@qq.com
  • Supported by:
    National Natural Science Foundation of China (61972146)

Balanced Quantization Strategy for Efficient Post-training Quantization of BEVFormer

ZHANG Xiaoxuan, TANG Xiaoyong   

  1. School of Computer and Communications Engineering, Changsha University of Science & Technology, Changsha 410114, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Supported by:
    National Natural Science Foundation of China (61972146)

Abstract: With its bird's-eye-view representation, BEVFormer achieves excellent performance in autonomous driving. However, its high memory footprint and computational complexity pose a severe challenge for real-time deployment on resource-constrained devices. The ReLU activations in BEVFormer range from zero to positive infinity and are unevenly distributed, a property that traditional quantization metrics such as cosine similarity and mean squared error (MSE) cannot adequately capture. To address this problem, a post-training balanced quantization strategy is proposed, optimized specifically for quantizing the linear layers and ReLU activations in BEVFormer. Predefined quantization intervals are used for the weights and outputs of the linear layers, while a dedicated interval-based quantization method is applied to the ReLU activations to ensure accurate representation of critical values. In addition, the method dynamically adjusts scaling factors via Hessian-based optimization, using the Hessian matrix to minimize quantization error and stabilize the quantization process. Experimental results show that the balanced quantization strategy significantly improves computational efficiency while preserving accuracy. On the nuScenes test set, 8-bit quantization causes an NDS drop of less than one percentage point, maintaining BEVFormer's performance.

Key words: BEVFormer, ReLU activation, Linear layer weights, Balanced quantization strategy, Hessian matrix

Abstract: BEVFormer's bird's-eye view (BEV) representation achieves strong results in autonomous driving applications. However, its high memory use and computational demands make real-time deployment difficult on resource-constrained devices. BEVFormer's ReLU activation values vary widely, creating an uneven distribution that traditional quantization metrics, such as cosine similarity and mean squared error (MSE), struggle to address effectively. To overcome these limitations, this paper introduces a new post-training quantization (PTQ) method, the Balanced Quantization Strategy. This method is specifically optimized for BEVFormer, focusing on quantizing linear layers and ReLU activations. For linear layers, it uses predefined quantization ranges, while ReLU activations are quantized with customized ranges to retain key value accuracy. Further, Hessian matrix optimization dynamically adjusts scaling factors, reducing quantization errors and stabilizing the quantization process. Results show that the Balanced Quantization Strategy improves computational efficiency with minimal accuracy loss. In testing on the nuScenes dataset, the proposed 8-bit quantization method achieves less than a 1% drop in NDS, maintaining BEVFormer's high performance.
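The abstract's two ingredients — uniform quantization over a chosen clipping range, and a Hessian-weighted search for the scaling factor — can be sketched as follows. The paper's exact predefined intervals and Hessian construction are not given on this page, so the diagonal-Hessian weighting, the grid search over clipping ranges, and all function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quantize(x, scale, bits=8):
    """Symmetric uniform quantization to signed integers, then dequantize
    (simulated quantization, as commonly used in PTQ calibration)."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    q = np.clip(np.round(x / scale), qmin, qmax)
    return q * scale

def hessian_weighted_scale(x, h_diag, bits=8, n_candidates=100):
    """Pick the scale minimizing the Hessian-weighted quantization error
    sum_i h_i * (x_i - Q(x_i))^2 over a grid of candidate clipping ranges.
    h_diag is an (assumed) diagonal approximation of the Hessian, which
    weights errors on loss-sensitive values more heavily than plain MSE."""
    x_max = np.abs(x).max()
    best_scale, best_err = x_max / (2 ** (bits - 1) - 1), np.inf
    for i in range(1, n_candidates + 1):
        clip = x_max * i / n_candidates          # candidate clipping range
        scale = clip / (2 ** (bits - 1) - 1)
        err = np.sum(h_diag * (x - quantize(x, scale, bits=bits)) ** 2)
        if err < best_err:
            best_scale, best_err = scale, err
    return best_scale
```

With uniform weights (`h_diag` all ones) this reduces to ordinary MSE-based range calibration; a non-uniform `h_diag` is what lets the search keep the "critical values" of a long-tailed ReLU distribution accurate even when plain MSE would favor a tighter clip.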

Key words: BEVFormer, ReLU activation, Linear layer outputs, Balanced quantization strategy, Hessian matrix

CLC number: TP391