Computer Science ›› 2018, Vol. 45 ›› Issue (5): 250-254. doi: 10.11896/j.issn.1002-137X.2018.05.043


Improved Faster RCNN Training Method Based on Hard Negative Mining

AI Tuo, LIANG Ya-ling and DU Ming-hui   

  • Online:2018-05-15 Published:2018-07-25

Abstract: Training the object detection method Faster RCNN (Faster Region-based Convolutional Neural Network) suffers from a data imbalance problem: the training data contains an overwhelming number of negative examples. To address this problem, a discriminant function that combines location loss and classification loss is proposed to identify hard positive examples. Based on this function and on hard negative mining, an improved bootstrap sampling method is proposed. Introducing this bootstrap sampling into traditional Faster RCNN training yields a five-step training method. Compared with traditional training, this method improves the network's generalization ability, reduces the false positive rate, and learns hard examples better. The experimental results show that the model trained with the five-step method attains a 2.4% higher mAP (mean Average Precision) on the Pascal VOC 2007 dataset, reduces false positives by 3.2% on FDDB (Face Detection Data Set and Benchmark) at the same true positive rate, and fits bounding boxes more closely.

Key words: Faster RCNN, Object detection, Hard negative mining, Bootstrap sampling
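
The sampling idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the discriminant function, so the weighted sum below, the function names `hardness_scores` and `select_hard_examples`, and the parameters `loc_weight` and `pos_fraction` are all assumptions made for illustration. The sketch scores each candidate region by its combined loss and keeps the highest-scoring ones, which is the general pattern of hard example mining.

```python
from typing import List, Sequence


def hardness_scores(cls_loss: Sequence[float],
                    loc_loss: Sequence[float],
                    loc_weight: float = 1.0) -> List[float]:
    """Combine per-example classification and location loss into one score.

    Assumption: a weighted sum. The abstract only states that the two
    losses are combined, not how.
    """
    if len(cls_loss) != len(loc_loss):
        raise ValueError("cls_loss and loc_loss must have the same length")
    return [c + loc_weight * l for c, l in zip(cls_loss, loc_loss)]


def select_hard_examples(scores: Sequence[float],
                         is_positive: Sequence[bool],
                         batch_size: int,
                         pos_fraction: float = 0.25) -> List[int]:
    """Pick the hardest examples while capping the share of positives.

    Returns indices into `scores`: hardest positives first (at most
    batch_size * pos_fraction of them), then the hardest negatives
    that fill the rest of the batch.
    """
    # Sort indices from highest (hardest) to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    max_pos = int(batch_size * pos_fraction)
    pos = [i for i in order if is_positive[i]][:max_pos]
    neg = [i for i in order if not is_positive[i]][:batch_size - len(pos)]
    return pos + neg


# Hypothetical per-region losses; negatives carry no location loss here.
cls_loss = [0.1, 2.0, 0.05, 1.5, 0.7, 0.02]
loc_loss = [0.3, 0.0, 0.0, 0.0, 0.9, 0.0]
is_positive = [True, False, False, False, True, False]

scores = hardness_scores(cls_loss, loc_loss)
hard_batch = select_hard_examples(scores, is_positive, batch_size=4)
print(hard_batch)
```

In this toy input, the low-loss negatives at indices 2 and 5 are easy examples; a larger pool of candidates would push such regions out of the batch entirely, which is how this kind of selection counters the imbalance toward easy negatives.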

