Computer Science ›› 2024, Vol. 51 ›› Issue (3): 155-164. doi: 10.11896/jsjkx.221200153

• Computer Graphics & Multimedia •

Appearance Fusion Based Motion-aware Architecture for Moving Object Segmentation

XU Bangwu1, WU Qin1,2, ZHOU Haojie1   

  1 School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, Jiangsu 214122, China
  2 Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Wuxi, Jiangsu 214122, China
  • Received:2022-12-27 Revised:2023-06-05 Online:2024-03-15 Published:2024-03-13
  • About author: XU Bangwu, born in 1998, postgraduate, is a member of CCF (No.N9250G). His main research interests include computer vision and deep learning. ZHOU Haojie, born in 1981, Ph.D, associate professor, is a member of CCF (No.19225S). His main research interests include system architecture, intelligent systems and distributed computing.
  • Supported by:
    National Natural Science Foundation of China(61972180).

Abstract: Moving object segmentation aims to segment all moving objects in the current scene and is critical for many computer vision applications. Many existing methods segment moving objects using motion information from 2D optical flow maps, an approach with several drawbacks. Objects that move within the epipolar plane, or whose 3D motion direction is consistent with that of the background, are difficult to identify from 2D optical flow maps. In addition, incorrect 2D optical flow degrades the segmentation result. To solve these problems, this paper proposes several motion costs to improve the performance of moving object segmentation. To detect moving objects with coplanar and collinear motion, it proposes a balanced reprojection cost and a multi-angle optical flow contrast cost, which measures the difference between the 2D optical flow of moving objects and that of the background. For ego-motion degeneracy, it designs a differential homography cost. To segment moving objects in complex scenes, it proposes an appearance fusion based motion-aware architecture. In this architecture, the multi-modality co-attention gate is adapted to fuse the appearance features and motion features of objects effectively, achieving better interaction between appearance and motion cues. Furthermore, to emphasize moving objects, a multi-level motion based attention module is introduced to suppress redundant and misleading information. Extensive experiments are conducted on the KITTI, JNU-UISEE, KittiMoSeg and DAVIS-2016 datasets, and the proposed method achieves excellent performance.
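The difficulty with objects moving in the epipolar plane can be illustrated with the standard two-view epipolar constraint. The sketch below is not the paper's motion cost; it assumes normalized image coordinates (identity intrinsics), a camera that only translates (no rotation), and hypothetical helper names such as `epipolar_residual`. Under these assumptions, a point that moves parallel to the camera translation produces the same zero residual as a static point, whereas a point that moves sideways does not.

```python
# Illustrative only: standard epipolar residual |x2^T [t]_x x1| for a camera
# with pure translation t (rotation = identity, intrinsics = identity).
# All function names here are hypothetical and not taken from the paper.

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def project(X):
    """Project a 3D point to normalized homogeneous image coordinates."""
    return (X[0] / X[2], X[1] / X[2], 1.0)

def epipolar_residual(X1, motion, t):
    """Residual of the epipolar constraint for a point at X1 in frame 1.

    The point moves by `motion` between frames, and `t` maps frame-1
    coordinates to frame-2 coordinates (X2 = X1 + motion + t).
    """
    X2 = tuple(p + m + c for p, m, c in zip(X1, motion, t))
    x1, x2 = project(X1), project(X2)
    # With R = I the essential matrix is [t]_x, so E @ x1 equals t x x1.
    return abs(dot(x2, cross(t, x1)))

t = (0.0, 0.0, -1.0)        # camera drives forward by one unit
point = (1.0, 0.5, 10.0)    # a 3D point in front of the camera

static = epipolar_residual(point, (0.0, 0.0, 0.0), t)
forward = epipolar_residual(point, (0.0, 0.0, 0.5), t)   # moves along t
sideways = epipolar_residual(point, (0.5, 0.0, 0.0), t)  # leaves the epipolar plane

print(static, forward, sideways)
```

In this toy setting the forward-moving point is indistinguishable from the background by the epipolar residual alone, which is the kind of case the abstract says 2D optical flow handles poorly.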

Key words: Moving object segmentation, Balanced reprojection cost, Multi-angle optical flow contrast cost, Multi-modality co-attention gate, Multi-level motion based attention module

CLC Number: 

  • TP391.413