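Authors: Jiaming Han, Jian Ding, Jie Li, Gui-Song Xia

**Affiliations:** Wuhan University; Shanghai Aerospace Electronic Communication Equipment Research Institute (上海航天电子通讯设备研究所)

**Paper title:** Align Deep Features for Oriented Object Detection

Published: 2020

Venue: arXiv

Paper: https://arxiv.org/abs/2008.09397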

Code: *[https://github.com/csuhan/s2anet](https://github.com/csuhan/s2anet)*

I. Abstract

The past decade has witnessed significant progress on detecting objects in aerial images that are often distributed with large scale variations and arbitrary orientations. However most of existing methods rely on heuristically defined anchors with different scales, angles and aspect ratios and usually suffer from severe misalignment between anchor boxes and axis-aligned convolutional features, which leads to the common inconsistency between the classification score and localization accuracy. To address this issue, we propose a Single-shot Alignment Network (S2A-Net) consisting of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM). The FAM can generate high-quality anchors with an Anchor Refinement Network and adaptively align the convolutional features according to the anchor boxes with a novel Alignment Convolution. The ODM first adopts active rotating filters to encode the orientation information and then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy. Besides, we further explore the approach to detect objects in large-size images, which leads to a better trade-off between speed and accuracy. Extensive experiments demonstrate that our method can achieve state-of-the-art performance on two commonly used aerial objects datasets (i.e., DOTA and HRSC2016) while keeping high efficiency.

II. Research Background

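  1. In recent years, object detection has also made progress on remote sensing imagery. Most existing methods focus on the difficulties that objects in aerial images present: large scale variations, arbitrary orientations, and crowded arrangements.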

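  2. Two-stage detectors:

    1. To achieve better detection performance, networks such as RoI Transformer, CAD-Net, SCRDet, and Gliding Vertex are built on the R-CNN framework, which has two parts: an RPN that generates horizontal RoIs from anchors, and an R-CNN detection head that performs bbox regression and classification.
      • **Drawback:** horizontal RoIs often cause severe misalignment between the bbox and the oriented object to be detected. Because objects in remote sensing images are oriented and densely packed, a single horizontal bbox may, for example, contain several objects.
    2. To address the problem in item 1, oriented bboxes are used as anchors.
      • A common practice is to design anchors with different angles, scales, and aspect ratios.
      • **Drawback:** the larger number of anchors costs substantial computation and memory. For example, if every location on the feature map generates anchors with 5 angles, 3 scales, and 3 aspect ratios, there are 5 × 3 × 3 = 45 anchors per location, whereas horizontal anchors, which have no angle, need only 3 × 3 = 9.
    3. To address the problem in item 2, RoI Transformer converts horizontal RoIs into rotated RoIs, which avoids generating a large number of anchors.
      • **Drawback:** it still requires heuristically defined anchors and complex RoI operations.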
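  3. Single-stage detectors:

    1. Single-stage detectors regress bboxes and classify them directly from regularly and densely sampled anchors. This architecture is computationally efficient but often lags behind in accuracy.
    2. The main drawbacks of single-stage detectors are:
      1. **The sampled anchors may fail to cover the objects, so objects and anchors are misaligned. This misalignment usually aggravates the foreground-background class imbalance and hurts performance.** For example, bridges typically have aspect ratios from 1:3 to 1:30, so few or even no anchors can be assigned to them.
      2. Convolutional features from the backbone are usually axis-aligned (parallel to the coordinate axes) with a fixed receptive field, whereas objects in aerial images appear with arbitrary orientations and varied appearances. Even when an anchor is assigned to an object with high confidence, the anchor and the convolutional features are still misaligned: the features corresponding to the anchor can hardly represent the whole object (as shown in the figure below). As a result, the final classification score does not accurately reflect localization accuracy, which also hurts detection performance in post-processing steps such as NMS.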
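  4. To address the problems of the detectors described above, the authors propose the following approach, as shown in the figure below:

    The authors first refine the initial anchor (blue bounding box) into a rotated anchor (orange bounding box). Then, guided by the rotated anchor (orange bounding box), they adjust the feature sampling locations (orange points) to extract aligned deep features.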
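
The idea of adjusting sampling locations to a rotated anchor can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper name `aligned_sampling_points`, the 3 × 3 grid, and the anchor parameterisation `(cx, cy, w, h, theta)` in feature-map coordinates are all hypothetical. The sketch lays a regular grid over the anchor, scales it by the anchor size, rotates it by the anchor angle, and shifts it to the anchor center.

```python
import math


def aligned_sampling_points(cx, cy, w, h, theta, k=3):
    """Return k*k (x, y) sampling points laid out over a rotated anchor.

    Hypothetical helper for illustration: (cx, cy) is the anchor center,
    (w, h) its size, and theta its rotation in radians, all expressed in
    feature-map coordinates.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = (k - 1) / 2
    points = []
    for row in range(k):
        for col in range(k):
            # Regular grid offset, scaled in proportion to the anchor size.
            dx = (col - half) * w / k
            dy = (row - half) * h / k
            # Rotate the offset by theta, then shift to the anchor center.
            x = cx + dx * cos_t - dy * sin_t
            y = cy + dx * sin_t + dy * cos_t
            points.append((x, y))
    return points


# Axis-aligned anchor: the points form an ordinary regular grid.
grid = aligned_sampling_points(0.0, 0.0, 3.0, 3.0, 0.0)

# The same anchor rotated by 90 degrees: the grid turns with the anchor.
rotated = aligned_sampling_points(0.0, 0.0, 3.0, 3.0, math.pi / 2)
```

With `theta = 0` the points reduce to an axis-aligned grid, which is the sampling pattern of a standard convolution; a non-zero angle turns the same grid so that it follows the anchor's orientation.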

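[Figure: an initial anchor (blue box) refined into a rotated anchor (orange box), with the feature sampling locations (orange points) adjusted to follow the rotated anchor]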
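
The anchor-count comparison from the discussion of two-stage detectors can be checked with a few lines of Python. The specific angle, scale, and ratio values and the 128 × 128 feature-map size are hypothetical, chosen only to match the counts in the example (5 angles, 3 scales, 3 aspect ratios); they are not taken from the paper.

```python
from itertools import product

# Hypothetical anchor settings, chosen only to match the counts in the text.
angles = [0, 36, 72, 108, 144]  # 5 angles, in degrees
scales = [32, 64, 128]          # 3 scales, in pixels
ratios = [0.5, 1.0, 2.0]        # 3 aspect ratios

# Every combination of angle, scale, and ratio is one anchor per location.
rotated_anchors = list(product(angles, scales, ratios))
horizontal_anchors = list(product(scales, ratios))


def anchors_per_level(height, width, per_location):
    """Total anchors on one feature map of the given size."""
    return height * width * per_location


# Assumed 128 x 128 feature map, for illustration only.
rotated_total = anchors_per_level(128, 128, len(rotated_anchors))
horizontal_total = anchors_per_level(128, 128, len(horizontal_anchors))

print(len(rotated_anchors), len(horizontal_anchors))  # 45 9
print(rotated_total, horizontal_total)                # 737280 147456
```

The fivefold gap per location carries over to every feature-map level, which is why adding angles to the anchor set is costly in both computation and memory.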