Computer Science (计算机科学) ›› 2025, Vol. 52 ›› Issue (9): 249-258. doi: 10.11896/jsjkx.241000108
林恒, 纪庆革
LIN Heng, JI Qingge
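Abstract: Panoramic (omnidirectional) images are an important form of immersive multimedia content. They offer a 360-degree horizontal and 180-degree vertical field of view, and their quality directly affects the user's sense of immersion in virtual reality (VR). To address projection distortion and the insufficient use of multi-scale features in panoramic image quality assessment, this paper proposes a Salient Viewport Attention Network (SVA-Net). The network consists of a saliency-guided viewport extraction module, a cross-layer attention dependency module, and a multi-channel fusion regression module. It aims to mitigate projection distortion while efficiently extracting multi-scale features and strengthening feature representation. Experimental results on two public datasets show that SVA-Net significantly improves quality prediction accuracy over existing methods and generalizes well. By combining salient viewport sampling with a cross-layer attention mechanism, the method strengthens feature representation and improves the accuracy of panoramic image quality assessment, producing predictions that are closer to human subjective ratings.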
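The abstract does not say how salient viewports are chosen, so the following is only a minimal sketch of one plausible way to turn an equirectangular saliency map into viewport centres. It is not the authors' implementation. The function names, the cos(latitude) weighting (used here to offset the oversampling of polar rows in an equirectangular map), and the minimum angular separation between viewports are all assumptions made for illustration.

```python
import math


def pixel_to_lonlat(row, col, height, width):
    """Map an equirectangular pixel centre to (longitude, latitude) in radians."""
    lon = ((col + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((row + 0.5) / height) * math.pi
    return lon, lat


def angular_distance(p, q):
    """Great-circle angle in radians between two (lon, lat) points."""
    (lon1, lat1), (lon2, lat2) = p, q
    cos_angle = (math.sin(lat1) * math.sin(lat2)
                 + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def select_viewport_centres(saliency, k, min_sep_deg=30.0):
    """Greedily pick up to k viewport centres from a saliency map.

    saliency is a list of rows (height x width) in equirectangular layout.
    Assumption: each score is weighted by cos(latitude), and chosen centres
    must be at least min_sep_deg apart on the sphere.
    """
    height, width = len(saliency), len(saliency[0])
    candidates = []
    for r in range(height):
        for c in range(width):
            lon, lat = pixel_to_lonlat(r, c, height, width)
            candidates.append((saliency[r][c] * math.cos(lat), (lon, lat)))
    # Highest weighted saliency first.
    candidates.sort(key=lambda item: item[0], reverse=True)

    min_sep = math.radians(min_sep_deg)
    chosen = []
    for _score, centre in candidates:
        if all(angular_distance(centre, other) >= min_sep for other in chosen):
            chosen.append(centre)
            if len(chosen) == k:
                break
    return chosen
```

In a pipeline of the kind the abstract describes, each returned centre would then be rendered as a rectilinear viewport and passed to the feature extractor. The separation threshold trades coverage of the sphere against concentration on the most salient regions.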