Authors: Ross B. Girshick (widely known as RBG), Jeff Donahue, Trevor Darrell, Jitendra Malik

Published: 2014

Venue: CVPR (conference)

Full title: Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Paper: https://openaccess.thecvf.com/content_cvpr_2014/html/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.html

MATLAB code: https://github.com/rbgirshick/rcnn

Slides: https://dl.dropboxusercontent.com/s/bpi3vd7gia9f6ul/rcnn-cvpr14-slides.pdf?dl=0

Poster: https://dl.dropboxusercontent.com/s/tzefwijlstpapl1/rcnn-poster.pdf?dl=0

Supplementary material: https://dl.dropboxusercontent.com/s/1yisyl5cuxo7g9y/r-cnn-cvpr-supp.pdf?dl=0

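Significance: R-CNN is the founding work of two-stage deep-learning object detectors. It was the first to apply deep learning and convolutional neural networks to object detection with a significant performance gain.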

1. Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: *Regions with CNN features*. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn
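The "more than 30% relative" figure in the abstract is easy to misread as 30 percentage points. The short sketch below works through the arithmetic using only the two numbers the abstract gives (53.3% mAP and a relative gain above 30%). The function names `relative_gain` and `max_previous_map` are illustrative helpers, not part of the R-CNN code base, and the bound it prints is derived from the abstract, not a figure quoted from the paper.

```python
def relative_gain(new_map: float, old_map: float) -> float:
    """Relative improvement of new_map over old_map, as a fraction."""
    return (new_map - old_map) / old_map


def max_previous_map(new_map: float, min_relative_gain: float) -> float:
    """Largest baseline mAP consistent with at least the given relative gain.

    Solves (new - old) / old >= g for old, which gives old <= new / (1 + g).
    """
    return new_map / (1.0 + min_relative_gain)


# Numbers taken from the abstract: 53.3% mAP, more than 30% relative gain.
rcnn_map = 53.3
bound = max_previous_map(rcnn_map, 0.30)

# The previous best must sit below this bound for the claim to hold.
print(f"implied previous best mAP: below {bound:.1f}%")

# Sanity check: a baseline exactly at the bound gives exactly a 30% relative gain.
print(f"relative gain at the bound: {relative_gain(rcnn_map, bound):.2f}")
```

Under this reading, the claim implies an absolute improvement of more than about 12 mAP points over the earlier state of the art, which is a large jump but far less than 30 points.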