Computer Science ›› 2020, Vol. 47 ›› Issue (2): 83-87. doi: 10.11896/jsjkx.190500077

• Computer Graphics & Multimedia •

Classification Net Based on Angular Feature

WANG Li-hua, DU Ming-hui, LIANG Ya-ling

  1. (School of Electronics and Information, South China University of Technology, Guangzhou 510641, China)
  • Received: 2019-05-17 Online: 2020-02-15 Published: 2020-03-18
  • Corresponding author: DU Ming-hui (ecmhdu@scut.edu.cn)
  • About author: WANG Li-hua, born in 1995, postgraduate. His main research interests include computer vision and deep learning; DU Ming-hui, born in 1964, professor, Ph.D. supervisor. His main research interests include signal processing and image processing.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61701181), the Natural Science Foundation of Guangdong Province, China (2017A030325430) and the Science and Technology Program of Guangzhou, China (201707010070).

Abstract: The excellent performance of Convolutional Neural Networks (CNN) in image classification has led to their wide use across computer vision. Beyond changes in network structure, much of the improvement in the accuracy and efficiency of image classification models comes from normalization techniques and improved classification loss functions. In face recognition, as accuracy has risen, the classification loss function has moved from Softmax Loss to Triplet Loss and from L-Softmax Loss to Arcface Loss, and the measurement method has developed from geometric measurement to angular measurement. This change of measurement is in effect a change of feature form, from general features to angular features. On the Mnist dataset, feature points trained with an angular-metric loss function are angularly distributed, and accuracy is higher than with a geometric metric. When the angular metric is expressed through more direct angular features, the trained feature points of each class are distributed along a line, and accuracy is higher than with the general angular metric. This raises the question of whether angular features can replace general features in CNN classification models. The main architecture of a CNN classification model usually consists of multiple convolutional layers and one or more fully connected layers. Unifying the normalization operations of the convolutional and fully connected layers yields angular convolutional layers and angular fully connected layers. Starting from an ordinary classification network, replacing each convolutional layer with an angular convolutional layer and each fully connected layer with an angular fully connected layer gives an angular classification network composed of angular features. On the Cifar-100 dataset, the angular classification network built on ResNet-32 improves classification accuracy by 2% over the original network, which demonstrates the effectiveness of angular features in classification networks.

Key words: Angular feature, Convolutional neural networks, Image classification, Loss function, Normalization
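
As a rough illustration of what an "angular" layer could look like, the sketch below uses cosine normalization: both the input (or each input patch) and each weight vector are scaled to unit L2 norm before the dot product, so every output is the cosine of an angle. This is an assumption made for illustration only; the abstract does not state the exact normalization the authors use, and the names `l2_normalize`, `angular_fc` and `angular_conv2d` are hypothetical.

```python
import numpy as np


def l2_normalize(v, axis, eps=1e-12):
    """Scale v to unit L2 norm along `axis` (eps guards against division by zero)."""
    norm = np.sqrt(np.sum(v * v, axis=axis, keepdims=True))
    return v / np.maximum(norm, eps)


def angular_fc(x, weight):
    """Cosine of the angle between each input row and each weight row.

    x: (batch, in_features), weight: (out_features, in_features).
    Returns (batch, out_features) with values in [-1, 1].
    """
    return l2_normalize(x, axis=1) @ l2_normalize(weight, axis=1).T


def angular_conv2d(x, weight):
    """Cosine between every k-by-k input patch and every filter.

    x: (channels, height, width), weight: (filters, channels, k, k).
    Stride 1, no padding. Returns (filters, height - k + 1, width - k + 1).
    """
    filters, _, k, _ = weight.shape
    # Shape (channels, out_h, out_w, k, k): one k-by-k window per output position.
    patches = np.lib.stride_tricks.sliding_window_view(x, (k, k), axis=(1, 2))
    out_h, out_w = patches.shape[1], patches.shape[2]
    # Flatten each window across channels into one row: (out_h * out_w, channels * k * k).
    rows = patches.transpose(1, 2, 0, 3, 4).reshape(out_h * out_w, -1)
    cos = angular_fc(rows, weight.reshape(filters, -1))
    return cos.T.reshape(filters, out_h, out_w)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w = rng.normal(size=(3, 8))
out = angular_fc(x, w)
# Positive rescaling of inputs or weights leaves the output unchanged:
# only the direction of each vector matters.
assert np.allclose(out, angular_fc(5.0 * x, 0.1 * w))
```

Under this assumption the layer output depends only on the direction of the features, which is one way to read the abstract's statement that same-class feature points end up distributed along a line.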

CLC Number: 

  • TP183