Panoramic Image Quality Assessment Method Integrating Salient Viewport Extraction and Cross-layer Attention

doi:10.11896/jsjkx.241000108

Abstract

Abstract: Panoramic images, as an important content form for immersive multimedia, provide a 360-degree horizontal and 180-degree vertical field of view, directly influencing the user's sense of immersion in VR. To address the challenges of insufficient handling of projection distortion and inadequate utilization of multi-scale features in panoramic image quality assessment, this paper proposes a Salient Viewport Attention Network (SVA-Net). The network is composed of a saliency-guided viewport extraction module, a cross-layer attention dependency module, and a multi-channel fusion regression module. It aims to alleviate projection distortion, efficiently extract multi-scale features, and enhance feature representation. Experimental results demonstrate that SVA-Net significantly improves the accuracy of image quality prediction compared to existing methods across two public datasets and shows strong generalization ability. By combining salient viewport sampling and cross-layer attention mechanisms, this method enhances feature representation and improves the accuracy of panoramic image quality assessment, making the prediction results more aligned with human subjective evaluations.
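
The abstract does not describe how the saliency-guided viewport extraction module chooses its viewports, so the following is only a minimal sketch of one plausible approach, not the authors' procedure. It assumes a saliency map given on the equirectangular grid, weights each pixel by the cosine of its latitude to offset polar oversampling, and greedily keeps the strongest centres that are far enough apart on the sphere. The names `select_viewports`, `k`, and `min_sep_deg` are hypothetical.

```python
import math

def pixel_to_sphere(row, col, height, width):
    """Map an equirectangular pixel centre to (latitude, longitude) in radians."""
    lat = math.pi / 2 - (row + 0.5) / height * math.pi
    lon = (col + 0.5) / width * 2 * math.pi - math.pi
    return lat, lon

def great_circle(a, b):
    """Angular distance in radians between two (lat, lon) points."""
    (lat1, lon1), (lat2, lon2) = a, b
    cos_d = (math.sin(lat1) * math.sin(lat2)
             + math.cos(lat1) * math.cos(lat2) * math.cos(lon1 - lon2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_d)))

def select_viewports(saliency, k=6, min_sep_deg=45.0):
    """Greedily pick up to k viewport centres at least min_sep_deg apart.

    Assumption: `saliency` is a 2D list of non-negative scores on the
    equirectangular grid. The selection rule is illustrative only.
    """
    height, width = len(saliency), len(saliency[0])
    candidates = []
    for r in range(height):
        for c in range(width):
            lat, lon = pixel_to_sphere(r, c, height, width)
            # cos(lat) down-weights rows oversampled near the poles.
            candidates.append((saliency[r][c] * math.cos(lat), lat, lon))
    candidates.sort(key=lambda t: t[0], reverse=True)

    min_sep = math.radians(min_sep_deg)
    centres = []
    for _score, lat, lon in candidates:
        if all(great_circle((lat, lon), p) >= min_sep for p in centres):
            centres.append((lat, lon))
            if len(centres) == k:
                break
    return centres
```

Each returned centre would then be rendered as a rectilinear viewport before feature extraction, which is the step that sidesteps the stretching of the equirectangular projection. That rendering step is omitted here.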

Key words: Panoramic image, Objective image quality assessment, Cross attention, Saliency enhancement, Cross-layer attention

CLC Number: TP391

LIN Heng, JI Qingge. Panoramic Image Quality Assessment Method Integrating Salient Viewport Extraction and Cross-layer Attention[J]. Computer Science, 2025, 52(9): 249-258.

