Computer Science ›› 2022, Vol. 49 ›› Issue (1): 212-218. doi: 10.11896/jsjkx.201100143

• Computer Graphics & Multimedia •

Dual-stream Reconstruction Network for Multi-label and Few-shot Learning

FANG Zhong-li, WANG Zhe, CHI Zi-qiu   

  1. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2020-11-23 Revised: 2021-03-27 Online: 2022-01-15 Published: 2022-01-18
  • About author: FANG Zhong-li, born in 1996, postgraduate, is a member of China Computer Federation. His main research interests include multi-label learning and deep learning.
    WANG Zhe, born in 1981, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include pattern recognition and image processing.
  • Supported by:
    National Social Science Fund of China (15BGL048).

Abstract: Multi-label image classification is one of the most important problems in computer vision: the model must predict every label present in an image. Because an image usually carries more than one label, and the objects in it vary in size, pose, and position, classification becomes harder. Improving how accurately image features are represented is therefore a pressing problem. To address it, a novel dual-stream reconstruction network is proposed for extracting features from images. Specifically, the model first uses a dual-stream attention network to extract features based on channel information and spatial information, and concatenates the two so that the image features carry both channel detail and spatial detail. Secondly, a reconstruction loss function is introduced to constrain the features of the dual-stream network, forcing the two divergent features to have the same feature expression ability and thereby pushing the extracted dual-stream features toward the ground-truth features. Experimental results on multi-label image datasets based on VOC 2007 and MS COCO show that the proposed dual-stream reconstruction network extracts salient features accurately and effectively and achieves better classification accuracy. In addition, given the sparsifying effect of the reconstruction loss on model features, the proposed method is also applied to few-shot learning. The experimental results show that the proposed model also achieves good classification accuracy for few-shot learning.

Key words: Deep learning, Feature reconstruction, Few-shot learning, Image attention mechanism, Multi-label image recognition

CLC Number: 

  • TP183
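
The dual-stream idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the attention modules or the exact form of the reconstruction loss. The sketch assumes parameter-free sigmoid gates for both streams, global average pooling, and a mean-squared-error term between the two stream descriptors. All function names (`channel_stream`, `spatial_stream`, `dual_stream_features`) are hypothetical, and a feature map is represented as a list of channels, each a flat list of spatial values.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mean(values):
    return sum(values) / len(values)


def channel_stream(fmap):
    """Reweight each channel by a gate derived from its global average.

    Assumption: a parameter-free gate stands in for a learned channel attention.
    """
    gates = [sigmoid(mean(channel)) for channel in fmap]
    return [[g * v for v in channel] for g, channel in zip(gates, fmap)]


def spatial_stream(fmap):
    """Reweight each spatial position by a gate averaged across channels.

    Assumption: a parameter-free gate stands in for a learned spatial attention.
    """
    n_positions = len(fmap[0])
    gates = [sigmoid(mean([channel[i] for channel in fmap]))
             for i in range(n_positions)]
    return [[g * v for g, v in zip(gates, channel)] for channel in fmap]


def pool(fmap):
    """Global average pooling: one value per channel."""
    return [mean(channel) for channel in fmap]


def dual_stream_features(fmap):
    """Return the concatenated descriptor and an assumed reconstruction loss."""
    chan_vec = pool(channel_stream(fmap))
    spat_vec = pool(spatial_stream(fmap))
    # Feature concatenation: the fused descriptor carries both streams.
    fused = chan_vec + spat_vec
    # Assumed loss: mean squared error pulling the two descriptors together.
    recon_loss = mean([(a - b) ** 2 for a, b in zip(chan_vec, spat_vec)])
    return fused, recon_loss


# Toy feature map: 2 channels, 4 spatial positions each.
fmap = [[0.2, 1.5, -0.3, 0.8], [1.0, 0.1, 0.4, -0.6]]
fused, recon_loss = dual_stream_features(fmap)
```

In a trained model, a term like `recon_loss` would presumably be added to the multi-label classification loss with some weighting; that weighting is not given in the abstract and is left out here.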