Computer Science (计算机科学), 2020, Vol. 47, Issue (4): 125-130. doi: 10.11896/jsjkx.190700163
ZHOU Zi-qin (周子钦), YAN Hua (严华)
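Abstract: With the rapid development of 3D scanning technology, 3D shape analysis has attracted wide attention in the research community. In particular, the notable success of deep learning in computer vision has made multi-view-based methods the mainstream approach to 3D model recognition. Previous studies have shown that the size of a 3D dataset is a very important factor in the final classification accuracy. However, because professional 3D scanning equipment is required, 3D shape data are difficult to collect. Indeed, existing public benchmark 3D datasets are far smaller than their 2D counterparts, which has hindered progress in 3D shape analysis. To address this problem, this paper studies optimization strategies for 3D shape recognition when only very few data samples are available. Inspired by multi-task learning, a multi-branch network architecture is built, and an auxiliary comparison module based on metric learning is introduced to mine information about intra-class and inter-class similarities and differences. The network consists of a main branch and an auxiliary branch, which use different loss functions for different training tasks, with a weight hyperparameter balancing the loss terms. The main branch outputs the predicted class and is updated with a cross-entropy loss; the auxiliary branch outputs similarity scores between different samples and is updated with a mean squared error loss. To ensure that feature vectors are projected into the same space, the main and auxiliary branches share the same feature extraction module. Their parameters are updated jointly during training, and only the classification result of the main branch is used at test time. Extensive experiments on two public 3D shape benchmark datasets show that, compared with traditional methods, the proposed network structure and training strategy significantly improve the feature module's ability to discriminate between classes in the few-sample setting and yield better recognition results.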
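The two-branch objective described in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the exact formulation, so several details here are assumptions: the pairwise target (1 for same-class pairs, 0 otherwise), the averaging over all ordered pairs, the additive combination, and the names `cross_entropy`, `similarity_mse`, `joint_loss`, and `weight` are all hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sample's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits_batch, labels):
    """Mean cross-entropy loss of the main (classification) branch."""
    losses = [-math.log(softmax(z)[y]) for z, y in zip(logits_batch, labels)]
    return sum(losses) / len(losses)

def similarity_mse(scores, labels):
    """Mean squared error loss of the auxiliary (comparison) branch.

    scores[i][j] is the predicted similarity of samples i and j.
    Assumption: the target is 1.0 when both samples share a class, else 0.0.
    """
    n = len(labels)
    squared_errors = [
        (scores[i][j] - (1.0 if labels[i] == labels[j] else 0.0)) ** 2
        for i in range(n) for j in range(n) if i != j
    ]
    return sum(squared_errors) / len(squared_errors)

def joint_loss(logits_batch, scores, labels, weight=0.5):
    """Weighted sum of both branch losses; `weight` is the balancing hyperparameter."""
    return cross_entropy(logits_batch, labels) + weight * similarity_mse(scores, labels)

def predict(logits_batch):
    """Test-time prediction uses only the main branch: argmax over class logits."""
    return [max(range(len(z)), key=z.__getitem__) for z in logits_batch]
```

In a real network both inputs would be computed from features of the shared extraction module, so gradients from both terms update it jointly; at test time only `predict` on the main-branch logits would be needed, matching the abstract's description.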