Orienting Novel 3D Objects Using Self-Supervised Learning of Rotation
Transforms
- URL: http://arxiv.org/abs/2105.14246v1
- Date: Sat, 29 May 2021 08:22:55 GMT
- Title: Orienting Novel 3D Objects Using Self-Supervised Learning of Rotation
Transforms
- Authors: Shivin Devgon, Jeffrey Ichnowski, Ashwin Balakrishna, Harry Zhang, Ken
Goldberg
- Abstract summary: Orienting objects is a critical component in the automation of many packing and assembly tasks.
We train a deep neural network to estimate the 3D rotation as parameterized by a quaternion.
We then use the trained network in a proportional controller to re-orient objects based on the estimated rotation between the two depth images.
- Score: 22.91890127146324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Orienting objects is a critical component in the automation of many packing
and assembly tasks. We present an algorithm to orient novel objects given a
depth image of the object in its current and desired orientation. We formulate
a self-supervised objective for this problem and train a deep neural network to
estimate the 3D rotation, parameterized by a quaternion, between these
current and desired depth images. We then use the trained network in a
proportional controller to re-orient objects based on the estimated rotation
between the two depth images. Results suggest that in simulation we can rotate
unseen objects with unknown geometries by up to 30° with a median angle
error of 1.47° over 100 random initial/desired orientations each for 22
novel objects. Experiments on physical objects suggest that the controller can
achieve a median angle error of 4.2° over 10 random initial/desired
orientations each for 5 objects.
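The abstract describes using the network's estimated quaternion inside a proportional controller: at each step, the object is rotated by a fraction of the estimated current-to-desired rotation. A minimal sketch of that control step, assuming a [w, x, y, z] quaternion convention (the helper names and gain value are illustrative, not from the paper):

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of quaternions [w, x, y, z]."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return np.array([
        w0*w1 - x0*x1 - y0*y1 - z0*z1,
        w0*x1 + x0*w1 + y0*z1 - z0*y1,
        w0*y1 - x0*z1 + y0*w1 + z0*x1,
        w0*z1 + x0*y1 - y0*x1 + z0*w1,
    ])

def scale_rotation(q, gain):
    """Return the rotation q raised to the fractional power `gain`
    (axis unchanged, angle multiplied by gain)."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)
    if q[0] < 0:                      # take the shorter arc
        q = -q
    angle = 2.0 * np.arccos(np.clip(q[0], -1.0, 1.0))
    axis = q[1:]
    n = np.linalg.norm(axis)
    if n < 1e-9:                      # identity rotation
        return np.array([1.0, 0.0, 0.0, 0.0])
    axis = axis / n
    half = 0.5 * gain * angle
    return np.concatenate(([np.cos(half)], np.sin(half) * axis))

def proportional_step(q_object, q_estimated, gain=0.5):
    """Apply a fraction of the network-estimated current-to-desired
    rotation to the object's current orientation."""
    dq = scale_rotation(q_estimated, gain)
    return quat_mul(dq, np.asarray(q_object, dtype=float))
```

In practice the network would re-estimate `q_estimated` from a fresh depth image after each step, so the loop converges even when individual estimates carry error.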
Related papers
- OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images [26.37802649901314]
Oriented object detection in remote sensing images is a challenging task because objects appear in arbitrary orientations.
We propose an end-to-end transformer-based oriented object detector consisting of three dedicated modules to address these issues.
Compared with previous end-to-end detectors, OrientedFormer gains 1.16 and 1.21 AP50 on DIOR-R and DOTA-v1.0 respectively, while reducing training epochs from 3× to 1×.
arXiv Detail & Related papers (2024-09-29T10:36:33Z)
- Uncertainty-aware Active Learning of NeRF-based Object Models for Robot Manipulators using Visual and Re-orientation Actions [8.059133373836913]
This paper presents an approach that enables a robot to rapidly learn the complete 3D model of a given object for manipulation in unfamiliar orientations.
We use an ensemble of partially constructed NeRF models to quantify model uncertainty to determine the next action.
Our approach determines when and how to grasp and re-orient an object given its partial NeRF model and re-estimates the object pose to rectify misalignments introduced during the interaction.
arXiv Detail & Related papers (2024-04-02T10:15:06Z)
- LocaliseBot: Multi-view 3D object localisation with differentiable rendering for robot grasping [9.690844449175948]
We focus on object pose estimation.
Our approach relies on three pieces of information: multiple views of the object, the camera's parameters at those viewpoints, and 3D CAD models of objects.
We show that the estimated object pose results in 99.65% grasp accuracy with the ground truth grasp candidates.
arXiv Detail & Related papers (2023-11-14T14:27:53Z)
- RelPose++: Recovering 6D Poses from Sparse-view Observations [66.6922660401558]
We address the task of estimating 6D camera poses from sparse-view image sets (2-8 images).
We build on the recent RelPose framework which learns a network that infers distributions over relative rotations over image pairs.
Our final system results in large improvements in 6D pose prediction over prior art on both seen and unseen object categories.
arXiv Detail & Related papers (2023-05-08T17:59:58Z)
- Adaptive Rotated Convolution for Rotated Object Detection [96.94590550217718]
We present the Adaptive Rotated Convolution (ARC) module to handle the rotated object detection problem.
In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images.
The proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.
arXiv Detail & Related papers (2023-03-14T11:53:12Z)
- Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect the salient objects in 360° omnidirectional images.
MPFR-Net uses the equirectangular projection image and four corresponding cube-unfolding images as inputs.
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-12-23T14:50:40Z)
- RSDet++: Point-based Modulated Loss for More Accurate Rotated Object Detection [53.57176614020894]
We classify the discontinuity of loss in both five-parameter and eight-parameter rotated object detection methods as rotation sensitivity error (RSE).
We introduce a novel modulated rotation loss to alleviate the problem and propose a rotation sensitivity detection network (RSDet).
To further improve accuracy on objects smaller than 10 pixels, we introduce RSDet++.
arXiv Detail & Related papers (2021-09-24T11:57:53Z)
- Oriented Object Detection in Aerial Images Based on Area Ratio of Parallelogram [0.0]
Rotated object detection is a challenging task in aerial images.
Existing regression-based rotation detectors suffer the problem of discontinuous boundaries.
We propose a simple yet effective framework to address the above challenges.
arXiv Detail & Related papers (2021-09-21T14:13:36Z)
- Category-Level Metric Scale Object Shape and Pose Estimation [73.92460712829188]
We propose a framework that jointly estimates a metric scale shape and pose from a single RGB image.
We validate our method on both synthetic and real-world datasets to evaluate category-level object pose and shape.
arXiv Detail & Related papers (2021-09-01T12:16:46Z)
- Robust 6D Object Pose Estimation by Learning RGB-D Features [59.580366107770764]
We propose a novel discrete-continuous formulation for rotation regression to resolve this local-optimum problem.
We uniformly sample rotation anchors in SO(3), and predict a constrained deviation from each anchor to the target, as well as uncertainty scores for selecting the best prediction.
Experiments on two benchmarks: LINEMOD and YCB-Video, show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-02-29T06:24:55Z)
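The RGB-D pose paper above describes a discrete-continuous formulation: uniformly sampled rotation anchors in SO(3), a predicted deviation from each anchor toward the target, and per-anchor uncertainty scores used to pick the best candidate. A minimal sketch of the selection step under an assumed [w, x, y, z] quaternion parameterization (the paper's own representation and network outputs may differ; function names are illustrative):

```python
import numpy as np

def quat_mul(q, r):
    """Hamilton product of quaternions [w, x, y, z]."""
    w0, x0, y0, z0 = q
    w1, x1, y1, z1 = r
    return np.array([
        w0*w1 - x0*x1 - y0*y1 - z0*z1,
        w0*x1 + x0*w1 + y0*z1 - z0*y1,
        w0*y1 - x0*z1 + y0*w1 + z0*x1,
        w0*z1 + x0*y1 - y0*x1 + z0*w1,
    ])

def select_rotation(anchors, deviations, scores):
    """Compose the highest-confidence anchor with its predicted
    deviation and return the resulting unit quaternion.

    anchors, deviations: (N, 4) arrays of unit quaternions [w, x, y, z]
    scores: (N,) array of per-anchor confidence scores
    """
    best = int(np.argmax(scores))
    q = quat_mul(anchors[best], deviations[best])
    return q / np.linalg.norm(q)
```

Because each deviation is constrained to stay near its anchor, the regression target is always a small rotation, which is what avoids the local-optimum problem of regressing one unconstrained global rotation.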
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.