FRED: Towards a Full Rotation-Equivariance in Aerial Image Object
Detection
- URL: http://arxiv.org/abs/2401.06159v1
- Date: Fri, 22 Dec 2023 09:31:43 GMT
- Title: FRED: Towards a Full Rotation-Equivariance in Aerial Image Object
Detection
- Authors: Chanho Lee, Jinsu Son, Hyounguk Shon, Yunho Jeon, Junmo Kim
- Abstract summary: We introduce a Fully Rotation-Equivariant Oriented Object Detector (FRED).
Our proposed method delivers comparable performance on DOTA-v1.0 and outperforms by 1.5 mAP on DOTA-v1.5, all while significantly reducing the model parameters to 16%.
- Score: 28.47314201641291
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Rotation-equivariance is an essential yet challenging property in oriented
object detection. While general object detectors are naturally robust to
spatial shifts thanks to the translation-equivariance of conventional CNNs,
achieving rotation-equivariance remains an elusive goal. Current detectors
deploy various alignment techniques to derive rotation-invariant features, but
still rely on high capacity models and heavy data augmentation with all
possible rotations. In this paper, we introduce a Fully Rotation-Equivariant
Oriented Object Detector (FRED), whose entire process from the image to the
bounding box prediction is strictly equivariant. Specifically, we decouple the
invariant task (object classification) and the equivariant task (object
localization) to achieve end-to-end equivariance. We represent the bounding box
as a set of rotation-equivariant vectors to implement rotation-equivariant
localization. Moreover, we utilize these rotation-equivariant vectors as
offsets in the deformable convolution, thereby enhancing the existing
advantages of spatial adaptation. Leveraging full rotation-equivariance, our
FRED demonstrates higher robustness to image-level rotation compared to
existing methods. Furthermore, we show that FRED is one step closer to
non-axis-aligned learning through our experiments. Compared to state-of-the-art methods,
our proposed method delivers comparable performance on DOTA-v1.0 and
outperforms by 1.5 mAP on DOTA-v1.5, all while significantly reducing the model
parameters to 16%.
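The decoupling above can be illustrated with a small sketch. The encoding below is hypothetical (FRED's exact box parameterization may differ): a box is represented by its center and two half-axis vectors, and because every quantity transforms with the same rotation matrix, decoding the box commutes with rotating the scene.

```python
import numpy as np

def rot(theta):
    """2-D rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def decode_box(center, v_w, v_h):
    """Hypothetical decoding: corners from a center and two
    rotation-equivariant half-axis vectors."""
    return np.stack([center + v_w + v_h,
                     center - v_w + v_h,
                     center - v_w - v_h,
                     center + v_w - v_h])

center = np.array([10.0, 5.0])
v_w, v_h = np.array([4.0, 0.0]), np.array([0.0, 2.0])
R = rot(np.deg2rad(30))

# Rotating the equivariant quantities, then decoding ...
corners_rot_first = decode_box(R @ center, R @ v_w, R @ v_h)
# ... equals decoding, then rotating every corner.
corners_rot_last = decode_box(center, v_w, v_h) @ R.T

assert np.allclose(corners_rot_first, corners_rot_last)
```

Because no step ever maps the vectors to rotation-sensitive scalars (such as a wrapped angle), the whole pipeline stays equivariant.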
Related papers
- Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection [37.142470149311904]
We propose a spatio-temporal equivariant learning framework by considering both spatial and temporal augmentations jointly.
We show that our pre-training method for 3D object detection outperforms existing equivariant and invariant approaches in many settings.
arXiv Detail & Related papers (2024-04-17T20:41:49Z) - Structuring Representation Geometry with Rotationally Equivariant
Contrastive Learning [42.20218717636608]
Self-supervised learning converts raw perceptual data such as images to a compact space where simple Euclidean distances measure meaningful variations in data.
We extend this formulation by adding additional geometric structure to the embedding space by enforcing transformations of input space to correspond to simple transformations of embedding space.
We show that merely combining our equivariant loss with a non-collapse term results in non-trivial representations.
arXiv Detail & Related papers (2023-06-24T10:07:52Z) - Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present an Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
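A rough illustration of the kernel-rotation idea (a from-scratch numpy sketch, not the authors' implementation; ARC additionally predicts the angle per image and combines multiple kernels): rotate a kernel's sampling grid by an angle and resample it bilinearly.

```python
import numpy as np

def rotate_kernel(w, theta):
    """Resample a square conv kernel on a grid rotated by theta
    (bilinear interpolation about the kernel center)."""
    k = w.shape[0]
    c = (k - 1) / 2.0
    ys, xs = np.mgrid[0:k, 0:k] - c
    cos, sin = np.cos(theta), np.sin(theta)
    # Inverse-rotate the output coordinates back into the source kernel.
    src_y = cos * ys + sin * xs + c
    src_x = -sin * ys + cos * xs + c
    y0, x0 = np.floor(src_y).astype(int), np.floor(src_x).astype(int)
    out = np.zeros_like(w, dtype=float)
    for dy in (0, 1):
        for dx in (0, 1):
            yy, xx = y0 + dy, x0 + dx
            valid = (yy >= 0) & (yy < k) & (xx >= 0) & (xx < k)
            wy = 1.0 - np.abs(src_y - yy)   # bilinear weights
            wx = 1.0 - np.abs(src_x - xx)
            out[valid] += (wy * wx)[valid] * w[yy[valid], xx[valid]]
    return out

# A spike to the right of center moves "up" after a 90-degree rotation.
w = np.zeros((5, 5)); w[2, 4] = 1.0
expected = np.zeros((5, 5)); expected[0, 2] = 1.0
assert np.allclose(rotate_kernel(w, np.pi / 2), expected)
```

In a detector, the rotated kernel would then be used in an ordinary convolution, letting the same weights respond to objects at the predicted orientation.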
arXiv Detail & Related papers (2023-03-14T11:53:12Z) - Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z) - PaRot: Patch-Wise Rotation-Invariant Network via Feature Disentanglement
and Pose Restoration [16.75367717130046]
State-of-the-art models are not robust to rotations, which often remain unknown prior to real-world applications.
We introduce a novel Patch-wise Rotation-invariant network (PaRot).
Our disentanglement module extracts high-quality rotation-robust features and the proposed lightweight model achieves competitive results.
arXiv Detail & Related papers (2023-02-06T02:13:51Z) - Detecting Rotated Objects as Gaussian Distributions and Its 3-D
Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects.
We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection.
We propose to model the rotated objects as Gaussian distributions.
We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
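As a rough sketch of this modeling step (the covariance convention below follows the common R diag(w²/4, h²/4) Rᵀ form used in this line of work; exact scaling may differ): a rotated box maps to a 2-D Gaussian whose mean is the box center and whose covariance encodes size and orientation jointly.

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h, theta):
    """Model a rotated box (cx, cy, w, h, theta) as a 2-D Gaussian:
    mean = center, covariance = R diag(w^2/4, h^2/4) R^T."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    S = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    return np.array([cx, cy]), R @ S @ R.T

# The exchanged parameterization (w <-> h, theta + 90 deg) describes the
# same physical box; the Gaussian representation maps both to an identical
# (mean, covariance), removing the boundary discontinuity that plagues
# direct angle regression.
m1, S1 = box_to_gaussian(0, 0, 6, 2, np.deg2rad(10))
m2, S2 = box_to_gaussian(0, 0, 2, 6, np.deg2rad(100))
assert np.allclose(m1, m2) and np.allclose(S1, S2)
```

A distance between two such Gaussians can then serve as a regression loss that varies smoothly with the angle.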
arXiv Detail & Related papers (2022-09-22T07:50:48Z) - Improving the Sample-Complexity of Deep Classification Networks with
Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - RSDet++: Point-based Modulated Loss for More Accurate Rotated Object
Detection [53.57176614020894]
We classify the discontinuity of loss in both five-param and eight-param rotated object detection methods as rotation sensitivity error (RSE).
We introduce a novel modulated rotation loss to alleviate the problem and propose a rotation sensitivity detection network (RSDet).
To further improve accuracy on objects smaller than 10 pixels, we introduce RSDet++.
arXiv Detail & Related papers (2021-09-24T11:57:53Z) - Learning High-Precision Bounding Box for Rotated Object Detection via
Kullback-Leibler Divergence [100.6913091147422]
Existing rotated object detectors are mostly inherited from the horizontal detection paradigm.
In this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology.
arXiv Detail & Related papers (2021-06-03T14:29:19Z) - Optimization for Oriented Object Detection via Representation Invariance
Loss [2.501282372971187]
Mainstream rotation detectors use oriented bounding boxes (OBB) or quadrilateral bounding boxes (QBB) to represent the rotating objects.
We propose a Representation Invariance Loss (RIL) to optimize the bounding box regression for the rotating objects.
Our method achieves consistent and substantial improvement in experiments on remote sensing datasets and scene text datasets.
arXiv Detail & Related papers (2021-03-22T07:55:33Z) - ReDet: A Rotation-equivariant Detector for Aerial Object Detection [27.419045245853706]
We propose a Rotation-equivariant Detector (ReDet) to address these issues.
We incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features.
Our method can achieve state-of-the-art performance on the task of aerial object detection.
arXiv Detail & Related papers (2021-03-13T15:37:36Z)
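The rotation-equivariant feature extraction used by detectors like ReDet can be sketched with a minimal C4 "lifting" convolution (pure numpy, hypothetical toy version; real steerable CNNs support finer groups and stacked layers): the same kernel is applied at all four 90-degree rotations, and rotating the input provably rotates each response map while cyclically permuting the orientation channels.

```python
import numpy as np

def corr2d(x, w):
    """Plain 'valid' cross-correlation."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

def lift_conv_c4(x, w):
    """C4 lifting convolution: one orientation channel per
    90-degree rotation of the kernel."""
    return np.stack([corr2d(x, np.rot90(w, k)) for k in range(4)])

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 7))
w = rng.normal(size=(3, 3))

y = lift_conv_c4(x, w)
y_rot = lift_conv_c4(np.rot90(x), w)

# Equivariance: rotating the image rotates each response map spatially
# and cyclically shifts the orientation channels.
expected = np.stack([np.rot90(y[(k - 1) % 4]) for k in range(4)])
assert np.allclose(y_rot, expected)
```

Because the orientation information is carried in the channel index rather than destroyed, downstream heads can read out rotation-invariant (classification) or rotation-equivariant (localization) quantities as needed.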
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.