Computer Science (计算机科学) ›› 2025, Vol. 52 ›› Issue (9): 249-258. doi: 10.11896/jsjkx.241000108

• Computer Graphics & Multimedia •

Panoramic Image Quality Assessment Method Integrating Salient Viewport Extraction and Cross-layer Attention

LIN Heng, JI Qingge

  1. School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
    Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China
  • Received: 2024-10-21 Revised: 2025-02-19 Online: 2025-09-15 Published: 2025-09-11
  • Corresponding author: JI Qingge (issjqg@mail.sysu.edu.cn)
  • About author: (linh265@mail2.sysu.edu.cn)
    LIN Heng, born in 2000, postgraduate. His main research interests include computer vision and image quality assessment.
    JI Qingge, born in 1966, Ph.D, associate professor, is a senior member of CCF (No.07014S). His main research interests include computer vision, computer graphics and virtual reality.
  • Supported by:
    Natural Science Foundation of Guangdong Province, China (2016A030313288).

Abstract: Panoramic images, an important content form for immersive multimedia, provide a 360-degree horizontal and 180-degree vertical field of view and directly influence the user’s sense of immersion in virtual reality (VR). To address the insufficient handling of projection distortion and the inadequate utilization of multi-scale features in panoramic image quality assessment, this paper proposes a Salient Viewport Attention Network (SVA-Net). The network consists of a saliency-guided viewport extraction module, a cross-layer attention dependency module, and a multi-channel fusion regression module. It aims to alleviate projection distortion, efficiently extract multi-scale features, and enhance feature representation. Experimental results on two public datasets show that SVA-Net predicts image quality significantly more accurately than existing methods and generalizes well. By combining salient viewport sampling with a cross-layer attention mechanism, the method strengthens feature representation and brings its predictions closer to human subjective evaluations.

Key words: Panoramic image, Objective image quality assessment, Cross attention, Saliency enhancement, Cross-layer attention
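
The abstract does not give implementation details, so the following is only a minimal sketch of what saliency-guided viewport extraction from an equirectangular panorama might look like. It is not the authors' SVA-Net code. The function names `pick_viewport_centers` and `extract_viewport`, the cos(latitude) weighting, the 40-degree minimum separation, the 90-degree field of view, and the nearest-neighbor sampling are all assumptions made for illustration. The saliency map is assumed to come from some external predictor.

```python
import numpy as np

def pick_viewport_centers(saliency, k=3, min_sep_deg=40.0):
    """Greedily pick k viewport centers (lon, lat in degrees) from an
    equirectangular saliency map of shape (H, W). Hypothetical helper."""
    h, w = saliency.shape
    # Pixel-center latitudes (+90 at the top row) and longitudes (-180 at the left).
    lat = 90.0 - (np.arange(h) + 0.5) * 180.0 / h
    lon = (np.arange(w) + 0.5) * 360.0 / w - 180.0
    # Assumed weighting: cos(latitude) offsets the oversampling of polar rows.
    score = saliency * np.cos(np.radians(lat))[:, None]
    lon_g, lat_g = np.meshgrid(np.radians(lon), np.radians(lat))
    centers = []
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(score), score.shape)
        centers.append((float(lon[j]), float(lat[i])))
        # Great-circle distance from the pick to every pixel, in degrees.
        cos_d = (np.sin(lat_g[i, j]) * np.sin(lat_g)
                 + np.cos(lat_g[i, j]) * np.cos(lat_g) * np.cos(lon_g - lon_g[i, j]))
        dist = np.degrees(np.arccos(np.clip(cos_d, -1.0, 1.0)))
        # Suppress the neighborhood so the next pick lands elsewhere.
        score = np.where(dist < min_sep_deg, -np.inf, score)
    return centers

def extract_viewport(erp, lon0_deg, lat0_deg, fov_deg=90.0, size=64):
    """Sample a rectilinear (gnomonic) viewport centered at (lon0, lat0)
    from an equirectangular image, using nearest-neighbor lookup."""
    h, w = erp.shape[:2]
    lon0, lat0 = np.radians(lon0_deg), np.radians(lat0_deg)
    half = np.tan(np.radians(fov_deg) / 2.0)
    # Tangent-plane coordinates at pixel centers: x to the right, y upward.
    t = ((np.arange(size) + 0.5) / size * 2.0 - 1.0) * half
    x, y = np.meshgrid(t, -t)
    # Inverse gnomonic projection: tangent-plane point -> spherical angles.
    sin_lat = (np.sin(lat0) + y * np.cos(lat0)) / np.sqrt(1.0 + x**2 + y**2)
    lat = np.arcsin(np.clip(sin_lat, -1.0, 1.0))
    lon = lon0 + np.arctan2(x, np.cos(lat0) - y * np.sin(lat0))
    # Spherical angles -> equirectangular pixel indices (longitude wraps around).
    col = np.floor((lon + np.pi) / (2.0 * np.pi) * w).astype(int) % w
    row = np.clip(np.floor((np.pi / 2.0 - lat) / np.pi * h).astype(int), 0, h - 1)
    return erp[row, col]
```

In this sketch, `extract_viewport(image, *center)` would be called once per center returned by `pick_viewport_centers`. Each resulting patch is a locally undistorted view that a 2D backbone can process, which is one common way to sidestep the stretching that equirectangular projection introduces near the poles.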

CLC Number: 

  • TP391