Computer Science ›› 2022, Vol. 49 ›› Issue (1): 212-218. doi: 10.11896/jsjkx.201100143
FANG Zhong-li, WANG Zhe, CHI Zi-qiu
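Abstract: Multi-label image classification is one of the important problems in computer vision; it requires predicting every label present in an image. An image usually carries more than one label to be classified, and variations in the size, position and pose of the objects in it all affect a model's classification performance. How to effectively improve the accuracy of image feature representations is therefore a problem in urgent need of a solution. To address it, this paper proposes a novel two-stream reconstruction network for image feature extraction. Specifically, the model first applies a two-stream attention network to extract features based on channel information and on spatial information, and then concatenates them so that the image feature captures both channel-wise and spatial detail. Second, the model introduces a reconstruction loss that constrains the two streams, forcing the two divergent features to have the same representational capacity and thereby driving both streams toward the ground-truth features. Experiments on the VOC 2007 and MS COCO multi-label image datasets show that the proposed two-stream reconstruction network extracts salient features accurately and effectively and yields better classification accuracy. In addition, because the reconstruction loss helps counter overfitting, the method is also applied to small-sample scenarios, where experiments show that the proposed model likewise achieves good classification accuracy.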
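The pipeline described in the abstract (a channel stream, a spatial stream, feature concatenation, and a reconstruction constraint between the two) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the sigmoid gating, the global average pooling, the mean-squared-error form of the reconstruction term, and all function names here are assumptions chosen only to make the idea concrete.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_stream(fmap):
    """Assumed channel attention: gate each channel by a sigmoid of its
    global average, then average-pool to one value per channel."""
    feats = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        feats.append(sigmoid(mean) * mean)
    return feats

def spatial_stream(fmap):
    """Assumed spatial attention: gate each location by a sigmoid of its
    cross-channel average, then average-pool to one value per channel."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[sigmoid(sum(fmap[c][i][j] for c in range(C)) / C)
             for j in range(W)] for i in range(H)]
    return [sum(fmap[c][i][j] * gate[i][j]
                for i in range(H) for j in range(W)) / (H * W)
            for c in range(C)]

def reconstruction_loss(f_channel, f_spatial):
    """Assumed form of the constraint: mean squared error that pulls the
    two stream features toward each other."""
    return sum((a - b) ** 2 for a, b in zip(f_channel, f_spatial)) / len(f_channel)

def two_stream_features(fmap):
    """Return the concatenated two-stream feature and the reconstruction term."""
    f_c = channel_stream(fmap)
    f_s = spatial_stream(fmap)
    return f_c + f_s, reconstruction_loss(f_c, f_s)

# Toy feature map with 2 channels of size 2x2 (hypothetical values).
fmap = [[[1.0, 0.0], [0.0, 0.0]],
        [[0.0, 0.0], [0.0, 1.0]]]
features, recon = two_stream_features(fmap)
print(len(features), recon > 0.0)
```

In training, a term like `recon` would presumably be added to a multi-label classification loss with some weighting factor; neither the classification loss nor the weight is specified in the abstract.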