OSRE: Object-to-Spot Rotation Estimation for Bike Parking Assessment
- URL: http://arxiv.org/abs/2303.00725v1
- Date: Wed, 1 Mar 2023 18:34:10 GMT
- Title: OSRE: Object-to-Spot Rotation Estimation for Bike Parking Assessment
- Authors: Saghir Alfasly, Zaid Al-huda, Saifullah Bello, Ahmed Elazab, Jian Lu, Chen Xu
- Abstract summary: This paper builds a camera-agnostic, well-annotated synthetic bike rotation dataset.
We then propose an object-to-spot rotation estimator (OSRE) that extends the object detection task to also regress bike rotation in two axes.
The proposed OSRE is evaluated on synthetic and real-world data providing promising results.
- Score: 10.489021696058632
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current deep models provide remarkable object detection in terms of object classification and localization. However, estimating an object's rotation with respect to other objects in an image remains largely unstudied, owing to the unavailability of object datasets with rotation annotations.
This paper tackles these two challenges to estimate the rotation of a parked bike with respect to its parking area. First, we leverage the power of 3D graphics to build a camera-agnostic, well-annotated Synthetic Bike Rotation Dataset (SynthBRSet). Then, we propose an object-to-spot rotation estimator (OSRE) that extends the object detection task to also regress the bike's rotation in two axes. Since our model is trained purely on synthetic data, we apply image smoothing techniques when deploying it on real-world images. The proposed OSRE is evaluated on synthetic and real-world data, yielding promising results. Our data and code are available at https://github.com/saghiralfasly/OSRE-Project.
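To make the approach concrete, below is a minimal sketch of the two ideas the abstract describes: a detection head extended with a two-axis rotation-regression output, and Gaussian smoothing applied to real-world inputs at inference time. This is not the authors' released code (see the repository above); the class name, angle parameterization, feature dimensions, and smoothing parameters are all illustrative assumptions, written in PyTorch.

```python
import torch
import torch.nn as nn
from torchvision.transforms import GaussianBlur

class RotationAwareHead(nn.Module):
    """Per-proposal head: class logits, box offsets, and two rotation angles.

    Hypothetical stand-in for a detection head extended with rotation
    regression; not the actual OSRE architecture.
    """
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.cls = nn.Linear(in_channels, num_classes)  # object class logits
        self.box = nn.Linear(in_channels, 4)            # box regression targets
        self.rot = nn.Linear(in_channels, 2)            # rotation in two axes

    def forward(self, feats: torch.Tensor):
        # feats: (N, in_channels) pooled features, one row per proposal
        angles = torch.tanh(self.rot(feats)) * 180.0    # bound angles to [-180, 180] degrees
        return self.cls(feats), self.box(feats), angles

# Sim-to-real preprocessing: smooth real photos before inference, since the
# model only ever saw clean synthetic renders during training.
smooth = GaussianBlur(kernel_size=5, sigma=1.0)

head = RotationAwareHead(in_channels=256, num_classes=2)  # e.g. {bike, background}
img = smooth(torch.rand(3, 480, 640))                     # dummy real-world image
feats = torch.randn(8, 256)                               # 8 dummy proposal features
cls_logits, boxes, angles = head(feats)
print(angles.shape)                                       # torch.Size([8, 2])
```

During training, the angle outputs would be supervised against SynthBRSet's rotation annotations with a standard regression loss (e.g. smooth L1), alongside the usual classification and box losses.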
Related papers
- CarPatch: A Synthetic Benchmark for Radiance Field Evaluation on Vehicle Components [77.33782775860028]
We introduce CarPatch, a novel synthetic benchmark of vehicles.
In addition to a set of images annotated with their intrinsic and extrinsic camera parameters, the corresponding depth maps and semantic segmentation masks have been generated for each view.
Global and part-based metrics have been defined and used to evaluate, compare, and better characterize some state-of-the-art techniques.
arXiv Detail & Related papers (2023-07-24T11:59:07Z)
- DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking [67.34803048690428]
We propose to model Dynamic Objects in RecurrenT (DORT) to tackle this problem.
DORT extracts object-wise local volumes for motion estimation that also alleviates the heavy computational burden.
It is flexible and practical, and can be plugged into most camera-based 3D object detectors.
arXiv Detail & Related papers (2023-03-29T12:33:55Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
We present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
We also introduce a novel coarse pose estimation approach that leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
- HM3D-ABO: A Photo-realistic Dataset for Object-centric Multi-view 3D Reconstruction [37.29140654256627]
We present a photo-realistic object-centric dataset HM3D-ABO.
It is constructed by composing realistic indoor scenes with realistic objects.
The dataset could also be useful for tasks such as camera pose estimation and novel-view synthesis.
arXiv Detail & Related papers (2022-06-24T16:02:01Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Robust 2D/3D Vehicle Parsing in CVIS [54.825777404511605]
We present a novel approach to robustly detect and perceive vehicles in different camera views as part of a cooperative vehicle-infrastructure system (CVIS).
Our formulation is designed for arbitrary camera views and makes no assumptions about intrinsic or extrinsic parameters.
In practice, our approach outperforms SOTA methods on 2D detection, instance segmentation, and 6-DoF pose estimation.
arXiv Detail & Related papers (2021-03-11T03:35:05Z)
- Single-Shot 3D Detection of Vehicles from Monocular RGB Images via Geometry Constrained Keypoints in Real-Time [6.82446891805815]
We propose a novel 3D single-shot object detection method for detecting vehicles in monocular RGB images.
Our approach lifts 2D detections to 3D space by predicting additional regression and classification parameters.
We test our approach on different datasets for autonomous driving and evaluate it using the challenging KITTI 3D Object Detection and the novel nuScenes Object Detection benchmarks.
arXiv Detail & Related papers (2020-06-23T15:10:19Z)