Computer Science ›› 2020, Vol. 47 ›› Issue (4): 125-130. doi: 10.11896/jsjkx.190700163

• Computer Graphics & Multimedia •

3D Shape Recognition Based on Multi-task Learning with Limited Multi-view Data

ZHOU Zi-qin, YAN Hua   

  1. College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China
  • Received: 2019-07-23 Online: 2020-04-15 Published: 2020-04-15
  • Contact: YAN Hua, born in 1971, professor. His main research interests include pattern recognition and intelligent systems.
  • About author: ZHOU Zi-qin, born in 1996, postgraduate. Her main research interests include machine learning, deep learning, computer vision, 3D shape analysis, active learning and neural architecture search.
  • Supported by:
    This work was supported by the Key Research and Development Program of Sichuan Province (2019YFG0409).

Abstract: With the rapid development of 3D scanning technology, 3D shape analysis has attracted wide attention from researchers. In particular, following the significant success of deep learning in computer vision, multi-view approaches have become the dominant methods for 3D shape recognition. From previous work, we observe that the number of available 3D shapes is essential to recognition accuracy. However, 3D shape data is hard to collect because it requires professional 3D scanning equipment, and existing 3D benchmark datasets are far smaller than 2D datasets, which impedes the development of 3D shape analysis. To address this problem, we develop a strategy for 3D shape recognition with limited data. Inspired by multi-task learning, we propose a novel multi-branch network and construct an auxiliary comparison module based on metric learning to exploit the intra-class similarity and inter-class discrepancy between samples. The proposed network consists mainly of a primary branch and an auxiliary branch, which are trained on different tasks with different loss functions; a hyper-parameter balances the loss terms. The primary branch predicts the class label and is trained with the cross-entropy loss, while the auxiliary module computes similarity scores between samples and is trained with the mean squared error. The two branches share one feature extractor, which projects all samples into the same representation space, and are trained jointly in the training phase; only the primary branch is used in the testing phase to compute accuracy. Extensive experiments on two public 3D shape benchmark datasets demonstrate that the proposed architecture enhances discriminative power and achieves better performance than traditional methods, especially when only limited multi-view data is available.
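
The joint objective described in the abstract (a cross-entropy term for the primary branch plus a weighted mean-squared-error term for the auxiliary comparison branch) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the comparison module, the similarity targets, or the weighting scheme, so the `similarity` function (a Gaussian of the squared distance), the 1/0 pair targets, and the names `joint_loss` and `lam` are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Primary-branch loss for one sample: negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def similarity(feat_a, feat_b):
    """Assumed comparison module: maps squared distance to a score in (0, 1]."""
    d2 = sum((a - b) ** 2 for a, b in zip(feat_a, feat_b))
    return math.exp(-d2)


def joint_loss(features, logits, labels, lam=0.5):
    """Return (total, ce, mse) for a batch.

    features: shared-extractor outputs, one vector per sample
    logits:   primary-branch class scores, one vector per sample
    labels:   integer class labels
    lam:      assumed hyper-parameter weighting the auxiliary term
    """
    n = len(labels)
    ce = sum(cross_entropy(logits[i], labels[i]) for i in range(n)) / n

    # Auxiliary task: every pair of samples gets a similarity score, which is
    # regressed toward 1 for same-class pairs and 0 for different-class pairs.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    mse = 0.0
    if pairs:
        for i, j in pairs:
            target = 1.0 if labels[i] == labels[j] else 0.0
            mse += (similarity(features[i], features[j]) - target) ** 2
        mse /= len(pairs)

    return ce + lam * mse, ce, mse


# Toy batch: two samples of class 0 that share a feature vector, one of class 1 far away.
demo_features = [[0.0, 0.0], [0.0, 0.0], [10.0, 10.0]]
demo_logits = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0]]
demo_labels = [0, 0, 1]
print(joint_loss(demo_features, demo_logits, demo_labels))
```

In this sketch both terms depend on the same `features`, which mirrors the shared feature extractor: the auxiliary term pushes same-class samples together and different-class samples apart in the representation that the classifier also uses. At test time only the `logits` path would be evaluated.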

Key words: 3D shape recognition, Auxiliary branch, Limited data, Multi-task learning, Multi-view 3D shape

CLC Number: 

  • TP391.413