Authors: Jianqi Ma, Weiyuan Shao, Hao Ye, Li Wang, Hong Wang, Yingbin Zheng, Xiangyang Xue
Published: 2018
Venue: arXiv
**Full title:** Arbitrary-Oriented Scene Text Detection via Rotation Proposals
Paper URL: https://arxiv.org/pdf/1703.01086.pdf
Code:
Significance:
Abstract: This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks (RRPN), which are designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression to make the proposals more accurately fit into the text region in terms of the orientation. The Rotation Region-of-Interest (RRoI) pooling layer is proposed to project arbitrary-oriented proposals to a feature map for a text region classifier. The whole framework is built upon a region proposal-based architecture, which ensures the computational efficiency of the arbitrary-oriented text detection compared with previous text detection systems. We conduct experiments using the rotation-based framework on three real-world scene text detection datasets and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.
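**Challenges in detecting text in natural scene images:** uneven lighting, blur, perspective distortion, varying text orientation, and other complex conditions.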
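**HRoI:** Most existing methods use horizontal or near-horizontal annotations and return detections of horizontal regions (the main strategies are ① sliding windows, ② connected components, and ③ bottom-up approaches). In real applications, however, a large share of text regions are not horizontal.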
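**Sliding-window-based:** A fixed-size window is slid across the image to find the regions most likely to contain text. To make the results more precise, the window is applied at multiple scales and aspect ratios, which makes this approach computationally expensive and inefficient.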
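To make the cost argument concrete, here is a minimal sketch that enumerates sliding windows over every scale and aspect-ratio combination. The function names, the base size, the stride, and the default scale and ratio sets are all illustrative assumptions, not values taken from the paper.

```python
from itertools import product


def sliding_windows(img_w, img_h, base=32, scales=(1, 2, 4),
                    ratios=(1, 2, 4), stride=16):
    """Yield (x, y, w, h) windows for every scale/ratio combination.

    All parameter defaults are hypothetical and chosen only for illustration.
    """
    for scale, ratio in product(scales, ratios):
        h = base * scale   # window height grows with the scale
        w = h * ratio      # ratio = width / height (text lines are wide)
        if w > img_w or h > img_h:
            continue       # skip windows that do not fit in the image
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


def count_windows(img_w, img_h, **kwargs):
    """Count how many windows a classifier would have to score."""
    return sum(1 for _ in sliding_windows(img_w, img_h, **kwargs))


if __name__ == "__main__":
    single = count_windows(640, 480, scales=(1,), ratios=(1,))
    multi = count_windows(640, 480)
    print(f"one scale, one ratio: {single} windows")
    print(f"3 scales x 3 ratios:  {multi} windows")
```

Every window has to be scored by a text/non-text classifier, so the number of evaluations grows with each scale and ratio that is added.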
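**Connected-component-based:** Characters are detected mainly through edge detection or extremal region extraction, so the method focuses on the edges and pixels of the image; the sub-MSER components are then grouped into word or text-line regions. These methods are limited in difficult cases such as multiple connected characters, characters with segmented strokes, and non-uniform illumination.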
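Whichever strategy is used, the HRoI methods above return horizontal boxes. The sketch below gives a rough sense of why that is a poor fit for inclined text: it computes what fraction of the enclosing horizontal box is actually covered by a rotated text rectangle. The helper names and the (cx, cy, w, h, angle) parameterisation are assumptions made for this illustration.

```python
import math


def rotated_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h rectangle rotated about its centre."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def horizontal_box_fill(w, h, angle_deg):
    """Fraction of the enclosing axis-aligned box covered by the rotated text box."""
    pts = rotated_corners(0.0, 0.0, w, h, angle_deg)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    bbox_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return (w * h) / bbox_area


if __name__ == "__main__":
    # A hypothetical 200 x 40 text line at several inclinations.
    for angle in (0, 15, 30, 45):
        print(angle, round(horizontal_box_fill(200, 40, angle), 3))
```

For a long, thin text line the fill ratio drops quickly as the angle grows, so a horizontal box ends up containing mostly background.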
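**RRoI:** These methods mainly take two steps: a segmentation network such as a fully convolutional network (FCN) generates a text prediction map (this step is very time-consuming), and geometric methods are then used to propose inclined candidate boxes **(some networks need several post-processing steps, which makes the pipeline cumbersome)**.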
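The geometric second step can be sketched as fitting an oriented box to the pixels that the prediction map marks as text. The example below uses a principal-axis (PCA) fit, which is only one possible geometric method and is not claimed to be the one used by the networks referred to above. The function name and the (cx, cy, w, h, angle) output layout are assumptions.

```python
import math


def inclined_box(points):
    """Fit an oriented box (cx, cy, w, h, angle_deg) to (x, y) pixel coordinates.

    The angle is that of the principal axis, measured in the same
    coordinate frame as the input points.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Second-order central moments of the pixel cloud.
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # principal-axis angle
    c, s = math.cos(theta), math.sin(theta)
    # Project every pixel onto the principal axis (u) and its normal (v).
    us = [(p[0] - mx) * c + (p[1] - my) * s for p in points]
    vs = [-(p[0] - mx) * s + (p[1] - my) * c for p in points]
    w = max(us) - min(us)
    h = max(vs) - min(vs)
    # Move the centre from the pixel mean to the middle of the box.
    mu = (max(us) + min(us)) / 2
    mv = (max(vs) + min(vs)) / 2
    cx = mx + mu * c - mv * s
    cy = my + mu * s + mv * c
    return cx, cy, w, h, math.degrees(theta)


if __name__ == "__main__":
    # Hypothetical "text" pixels lying along a diagonal stroke.
    diagonal = [(i, i) for i in range(10)]
    print(inclined_box(diagonal))
```

In a real pipeline this fit would run once per connected region of the prediction map, after the time-consuming segmentation pass, which is part of why such two-step methods are slow.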