Computer Science ›› 2021, Vol. 48 ›› Issue (10): 212-219.doi: 10.11896/jsjkx.200900005

• Computer Graphics & Multimedia •

Light Field Depth Estimation Method Based on Encoder-decoder Architecture

YAN Xu1,2,3, MA Shuai1,2,3, ZENG Feng-jiao1,2,3, GUO Zheng-hua1,2,3, WU Jun-long1,2,3, YANG Ping1,2, XU Bing1,2   

  1 Key Laboratory on Adaptive Optics,Institute of Optics and Electronics,Chinese Academy of Sciences,Chengdu 610209,China
    2 Institute of Optics and Electronics,Chinese Academy of Sciences,Chengdu 610209,China
    3 University of Chinese Academy of Sciences,Beijing 100049,China
  • Received:2020-09-01 Revised:2021-02-02 Online:2021-10-15 Published:2021-10-18
  • About author:YAN Xu,born in 1995,postgraduate.His main research interests include computer vision and deep learning.
    XU Bing,born in 1960,senior research scientist and Ph.D. supervisor.His research interests include the application of adaptive optics to improving laser beam quality,wavefront detector development,and applications of light field cameras.
  • Supported by:
    National Natural Science Foundation of China(J19K004).

Abstract: To address the long run time and low accuracy of existing light field depth estimation methods, a depth estimation method that incorporates the contextual information of the scene is proposed. The method is based on an end-to-end convolutional neural network that obtains a depth map from a single light field image, which reduces the computational cost and therefore the run time. To improve accuracy, multi-orientation epipolar plane image (EPI) volumes of the light field image are fed into the network: a multi-stream encoding module extracts features from each orientation, and an encoder-decoder architecture with skip connections aggregates them, so that the contextual information of the neighborhood of each target pixel is fused during per-pixel disparity estimation. In addition, the model uses convolutional blocks of different depths to extract structural features of the scene from the central-view image; introducing these structural features into the corresponding skip connections provides additional cues for edge features and further improves accuracy. Experiments on the HCI 4D Light Field Benchmark show that the BadPix and MSE of the proposed method are 31.2% and 54.6% lower, respectively, than those of the comparison methods, and the average depth estimation time is 1.2 seconds, much faster than the comparison methods.

Key words: Context information, Depth estimation, Encoder-decoder, Epipolar plane image, Light field
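The abstract's pipeline starts from multi-orientation EPI volumes sliced out of a 4D light field and is evaluated with the benchmark's BadPix and MSE metrics. As an illustrative sketch only (not the paper's code), the following NumPy snippet shows one conventional way the horizontal and vertical EPI volumes can be extracted, and how BadPix (the fraction of pixels whose disparity error exceeds a threshold, 0.07 in the HCI benchmark) is typically computed; the array layout `(U, V, H, W)` is an assumption for illustration:

```python
import numpy as np

def epi_volumes(lf):
    """Slice horizontal and vertical EPI volumes from a 4D light field.

    lf has shape (U, V, H, W): a U x V grid of angular views, each an
    H x W grayscale image (a colour channel would add one more axis).
    """
    U, V, H, W = lf.shape
    cu, cv = U // 2, V // 2
    # Horizontal EPI volume: fix the vertical view index at the centre row;
    # each image row y then yields one (V, W) epipolar plane image.
    horizontal = lf[cu, :, :, :].transpose(1, 0, 2)  # shape (H, V, W)
    # Vertical EPI volume: fix the horizontal view index at the centre column;
    # each image column x then yields one (U, H) epipolar plane image.
    vertical = lf[:, cv, :, :].transpose(2, 0, 1)    # shape (W, U, H)
    return horizontal, vertical

def badpix(pred, gt, thresh=0.07):
    """Fraction of pixels whose absolute disparity error exceeds thresh."""
    return float(np.mean(np.abs(pred - gt) > thresh))
```

In a learning-based method such as the one described, each EPI volume would then be fed to its own encoding stream before the decoder fuses them.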

CLC Number: TP391