Related papers: Eff-GRot: Efficient and Generalizable Rotation Estimation with Transformers

URL: http://arxiv.org/abs/2512.18784v1
Date: Sun, 21 Dec 2025 15:57:13 GMT
Title: Eff-GRot: Efficient and Generalizable Rotation Estimation with Transformers
Authors: Fanis Mathioulakis, Gorjan Radevski, Tinne Tuytelaars
Abstract summary: We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass.
Score: 35.57122848273358
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce Eff-GRot, an approach for efficient and generalizable rotation estimation from RGB images. Given a query image and a set of reference images with known orientations, our method directly predicts the object's rotation in a single forward pass, without requiring object- or category-specific training. At the core of our framework is a transformer that performs a comparison in the latent space, jointly processing rotation-aware representations from multiple references alongside a query. This design enables a favorable balance between accuracy and computational efficiency while remaining simple, scalable, and fully end-to-end. Experimental results show that Eff-GRot offers a promising direction toward more efficient rotation estimation, particularly in latency-sensitive applications.

Related papers

Computing a Characteristic Orientation for Rotation-Independent Image Analysis [0.0]
General Intensity Direction (GID) is a preprocessing method that improves rotation robustness without modifying the network architecture. It transforms the image while preserving spatial structure, making it compatible with convolutional networks. Experimental evaluation on the rotated MNIST dataset shows that the proposed method achieves higher accuracy than state-of-the-art rotation-invariant architectures.
arXiv Detail & Related papers (2026-02-24T14:08:12Z)
Accelerated Rotation-Invariant Convolution for UAV Image Segmentation [36.23556720064733]
In this paper, we introduce a GPU-optimized rotation-invariant convolution framework. By exploiting structured data sharing among symmetrically rotated filters, our method achieves multi-orientation convolution with greatly reduced memory traffic and computational redundancy. Across extensive benchmarks, the proposed convolution achieves 20-55% faster training and 15-45% lower energy consumption than cuDNN.
arXiv Detail & Related papers (2025-12-09T18:30:00Z)
Rotated Mean-Field Variational Inference and Iterative Gaussianization [11.954133194037858]
We propose to perform mean-field variational inference (MFVI) in a rotated coordinate system. MFVI in a rotated coordinate system defines a rotation and a coordinatewise map that together move the target closer to Gaussian. Iterating this procedure yields a sequence of transformations that progressively transforms the target toward Gaussian.
arXiv Detail & Related papers (2025-10-09T03:13:44Z)
Exploring Kernel Transformations for Implicit Neural Representations [57.2225355625268]
Implicit neural representations (INRs) leverage neural networks to represent signals by mapping coordinates to their corresponding attributes. This work pioneers the exploration of the effect of kernel transformation of input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR with negligible overhead.
arXiv Detail & Related papers (2025-04-07T04:43:50Z)
Relaxed Rotational Equivariance via $G$-Biases in Vision [19.814324876189772]
Group Equivariant Convolution (GConv) can capture rotational equivariance from original data. However, the presentation or distribution of real-world data rarely conforms to strict rotational equivariance. We propose a simple but highly effective method to address this problem, which utilizes a set of learnable biases called $G$-Biases. Experiments demonstrate that the proposed RREConv-based methods achieve excellent performance compared to existing GConv-based methods in both classification and 2D object detection tasks.
arXiv Detail & Related papers (2024-08-22T14:52:53Z)
Toward Efficient Visual Gyroscopes: Spherical Moments, Harmonics Filtering, and Masking Techniques for Spherical Camera Applications [83.8743080143778]
A visual gyroscope estimates camera rotation through images. The integration of omnidirectional cameras, offering a larger field of view compared to traditional RGB cameras, has proven to yield more accurate and robust results. Here, we address these challenges by introducing a novel visual gyroscope, which combines an Efficient Multi-Mask-Filter Rotation Estimator and a learning-based optimization.
arXiv Detail & Related papers (2024-04-02T13:19:06Z)
Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem. In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images. The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
RAGO: Recurrent Graph Optimizer For Multiple Rotation Averaging [62.315673415889314]
This paper proposes a deep recurrent Rotation Averaging Graph Optimizer (RAGO) for Multiple Rotation Averaging (MRA). Our framework is a real-time learning-to-optimize rotation averaging graph with a tiny size deployed for real-world applications.
arXiv Detail & Related papers (2022-12-14T13:19:40Z)
Category-Level 6D Object Pose Estimation with Flexible Vector-Based Rotation Representation [51.67545893892129]
We propose a novel 3D graph convolution based pipeline for category-level 6D pose and size estimation from monocular RGB-D images. We first design an orientation-aware autoencoder with 3D graph convolution for latent feature learning. Then, to efficiently decode the rotation information from the latent feature, we design a novel flexible vector-based decomposable rotation representation.
arXiv Detail & Related papers (2022-12-09T02:13:43Z)
DRKF: Distilled Rotated Kernel Fusion for Efficient Rotation Invariant Descriptors in Local Feature Matching [9.68840174997957]
Rotated Kernel Fusion (RKF) imposes rotations on the convolution kernel to improve the inherent nature of CNN. MOFA aggregates features extracted from multiple rotated versions of the input image. Our method can outperform other state-of-the-art techniques when exposed to large rotation variations.
arXiv Detail & Related papers (2022-09-22T10:29:17Z)

This list is automatically generated from the titles and abstracts of the papers on this site.