Computer Science ›› 2020, Vol. 47 ›› Issue (4): 125-130. doi: 10.11896/jsjkx.190700163

• Computer Graphics & Multimedia •

3D Shape Recognition Based on Multi-task Learning with Limited Multi-view Data

ZHOU Zi-qin, YAN Hua   

  1. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
  • Received:2019-07-23 Online:2020-04-15 Published:2020-04-15
  • Contact: YAN Hua, born in 1971, professor. His main research interests include pattern recognition and intelligent systems.
  • About author: ZHOU Zi-qin, born in 1996, postgraduate. Her main research interests include machine learning, deep learning, computer vision, 3D shape analysis, active learning and neural architecture search.
  • Supported by:
    This work was supported by the Key Research and Development Program of Sichuan Province (2019YFG0409)

Abstract: With the rapid development of 3D scanning technology, 3D shape analysis has attracted wide attention from researchers. In particular, following the significant success of deep learning in computer vision, multi-view approaches have become the dominant methods for 3D shape recognition. From previous work, we observe that the number of available 3D shapes is essential to recognition accuracy. However, because professional 3D scanning equipment is required, 3D shape data is hard to collect. In fact, existing 3D benchmark datasets are far smaller than 2D datasets, which impedes the development of 3D shape analysis. To address this problem, we develop a strategy for 3D shape recognition with limited data. Inspired by multi-task learning, we propose a novel multi-branch network and construct an auxiliary comparison module based on metric learning to exploit the intra-class similarity and inter-class discrepancy between samples. The proposed network consists mainly of a primary branch and an auxiliary branch, which are trained on different tasks with different loss functions; a hyper-parameter balances the loss terms. The primary branch produces the classification prediction and is trained with the cross-entropy loss, while the auxiliary module computes similarity scores between samples and is updated with the mean squared error. The two branches share the same feature extractor, which projects all samples into the same representation space, and are trained jointly in the training phase; only the primary branch is used in the testing phase to compute accuracy. Extensive experimental results on two public 3D shape benchmark datasets demonstrate that the proposed architecture enhances discriminative power and achieves better performance than traditional methods, especially when only limited multi-view data is available.
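
The joint objective described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the similarity targets or the exact way the loss terms are combined, so this sketch assumes a target of 1 for same-class pairs and 0 for different-class pairs, and a weighted sum `ce + lam * aux`. All function and parameter names (`joint_loss`, `similarity_targets`, `lam`) are hypothetical, and the logits and pair scores stand in for the outputs of the primary and auxiliary branches.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Primary-branch loss for one sample: -log p(label)."""
    return -math.log(softmax(logits)[label])


def similarity_targets(labels):
    """ASSUMPTION: pair target is 1.0 for the same class, 0.0 otherwise."""
    n = len(labels)
    return {
        (i, j): 1.0 if labels[i] == labels[j] else 0.0
        for i in range(n)
        for j in range(i + 1, n)
    }


def mse(scores, targets):
    """Auxiliary-branch loss: mean squared error over all sample pairs."""
    return sum((scores[k] - targets[k]) ** 2 for k in targets) / len(targets)


def joint_loss(batch_logits, pair_scores, labels, lam=0.5):
    """ASSUMPTION: total = cross-entropy + lam * MSE (weighted sum).

    batch_logits: per-sample class logits from the primary branch.
    pair_scores:  {(i, j): score} from the auxiliary branch, for i < j.
    Needs at least two samples so that one pair exists.
    """
    ce = sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / len(labels)
    aux = mse(pair_scores, similarity_targets(labels))
    return ce + lam * aux, ce, aux


if __name__ == "__main__":
    # Toy batch: three samples, two classes (values are illustrative only).
    logits = [[2.0, 0.1], [0.3, 1.5], [1.2, 0.4]]
    labels = [0, 1, 0]
    scores = {(0, 1): 0.2, (0, 2): 0.9, (1, 2): 0.1}
    total, ce, aux = joint_loss(logits, scores, labels, lam=0.5)
    print(f"total={total:.4f} ce={ce:.4f} aux={aux:.4f}")
```

At test time only the primary branch is used, which in this sketch corresponds to taking the argmax of each sample's logits and ignoring `pair_scores`.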

Key words: Multi-view 3D shape, Limited data, Auxiliary branch, Multi-task learning, 3D shape recognition

CLC Number: 

  • TP391.413