Tiny-YOLO object detection supplemented with geometrical data
- URL: http://arxiv.org/abs/2008.02170v2
- Date: Thu, 15 Oct 2020 19:15:01 GMT
- Title: Tiny-YOLO object detection supplemented with geometrical data
- Authors: Ivan Khokhlov, Egor Davydenko, Ilya Osokin, Ilya Ryakin, Azer Babaev,
Vladimir Litvinenko, Roman Gorbachev
- Abstract summary: We propose a method of improving detection precision (mAP) with the help of the prior knowledge about the scene geometry.
We focus our attention on autonomous robots, so given the robot's dimensions and the inclination angles of the camera, it is possible to predict the spatial scale for each pixel of the input frame.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a method of improving detection precision (mAP) with the help of
the prior knowledge about the scene geometry: we assume the scene to be a plane
with objects placed on it. We focus our attention on autonomous robots, so
given the robot's dimensions and the inclination angles of the camera, it is
possible to predict the spatial scale for each pixel of the input frame. With
slightly modified YOLOv3-tiny we demonstrate that the detection supplemented by
the scale channel, further referred to as S, outperforms standard RGB-based
detection with small computational overhead.
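The per-pixel spatial scale the abstract describes can be sketched with a flat-ground pinhole model: given the camera's mounting height and downward tilt, each image row maps to a ground distance, and differentiating that mapping gives meters-per-pixel for the row. The function name and parameters below (`fy`, `cy`, `tilt_rad`) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def ground_scale_map(height_m, tilt_rad, fy, cy, rows):
    """Meters of ground covered per pixel for each image row,
    assuming a flat ground plane and a pinhole camera mounted
    height_m above it, tilted down by tilt_rad."""
    v = np.arange(rows)
    # Angle of each pixel row's ray below the horizontal.
    ray = tilt_rad + np.arctan((v - cy) / fy)
    scale = np.full(rows, np.nan)
    valid = ray > 0  # only rays that actually hit the ground
    u = (v[valid] - cy) / fy
    # Ground distance is d(v) = h / tan(ray(v)); its derivative
    # w.r.t. the pixel coordinate gives meters per pixel:
    # |dd/dv| = h / sin^2(ray) * 1 / (fy * (1 + u^2)).
    scale[valid] = height_m / np.sin(ray[valid]) ** 2 / (fy * (1 + u ** 2))
    return scale
```

Rows lower in the image see nearby ground and get a finer scale; rows near or above the horizon get no valid scale, which matches the intuition that the S channel encodes expected object size per pixel.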
Related papers
- Using a Distance Sensor to Detect Deviations in a Planar Surface [20.15053198469424]
We investigate methods for determining if a planar surface contains geometric deviations using only an instantaneous measurement from a miniature optical time-of-flight sensor.
Key to our method is to utilize the entirety of information encoded in raw time-of-flight data captured by off-the-shelf distance sensors.
We build an example application in which our method enables mobile robot obstacle avoidance over a wide field-of-view.
arXiv Detail & Related papers (2024-08-07T15:24:25Z)
- Anyview: Generalizable Indoor 3D Object Detection with Variable Frames [63.51422844333147]
We present a novel 3D detection framework named AnyView for our practical applications.
Our method achieves both great generalizability and high detection accuracy with a simple and clean architecture.
arXiv Detail & Related papers (2023-10-09T02:15:45Z)
- Point Anywhere: Directed Object Estimation from Omnidirectional Images [10.152838128195468]
We propose a method using an omnidirectional camera to eliminate the user/object position constraint and the left/right constraint of the pointing arm.
The proposed method enables highly accurate estimation by repeatedly extracting regions of interest from the equirectangular image.
arXiv Detail & Related papers (2023-08-02T08:32:43Z)
- RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control [71.51781695764872]
We present a system for collision-free control of a robot manipulator that uses only RGB views of the world.
Perceptual input of a tabletop scene is provided by multiple images of an RGB camera that is either handheld or mounted on the robot end effector.
A NeRF-like process is used to reconstruct the 3D geometry of the scene, from which the Euclidean full signed distance function (ESDF) is computed.
A model predictive control algorithm is then used to control the manipulator to reach a desired pose while avoiding obstacles in the ESDF.
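The pipeline above (NeRF-like reconstruction, then an ESDF, then MPC) can be illustrated with a minimal collision cost over a dense ESDF voxel grid; this is a hedged sketch under the assumption that the field stores positive distances in free space, and all names and parameters here are illustrative, not the paper's API.

```python
import numpy as np

def collision_cost(esdf, origin, voxel, points, margin=0.05):
    """Hinge penalty for sample points on the manipulator that come
    within `margin` meters of an obstacle, using a dense Euclidean
    signed distance field (positive = free space)."""
    # Map world coordinates to voxel indices (nearest-lower voxel).
    idx = np.floor((points - origin) / voxel).astype(int)
    idx = np.clip(idx, 0, np.array(esdf.shape) - 1)
    d = esdf[idx[:, 0], idx[:, 1], idx[:, 2]]
    # Zero cost when every point clears obstacles by at least `margin`.
    return np.sum(np.maximum(margin - d, 0.0) ** 2)
```

A model predictive controller would add a cost like this to its objective so that trajectories bending toward obstacles are penalized before contact occurs.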
arXiv Detail & Related papers (2022-10-21T01:45:08Z)
- Neural Scene Representation for Locomotion on Structured Terrain [56.48607865960868]
We propose a learning-based method to reconstruct the local terrain for a mobile robot traversing urban environments.
Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the method estimates the topography in the robot's vicinity.
We propose a 3D reconstruction model that faithfully reconstructs the scene, despite the noisy measurements and large amounts of missing data coming from the blind spots of the camera arrangement.
arXiv Detail & Related papers (2022-06-16T10:45:17Z)
- A Fast Location Algorithm for Very Sparse Point Clouds Based on Object Detection [0.0]
We propose an algorithm which can quickly locate the target object through image object detection in the circumstances of having very sparse feature points.
We conduct the experiment in a manually designed scene captured with a handheld smartphone, and the results demonstrate the high positioning speed and accuracy of our method.
arXiv Detail & Related papers (2021-10-21T05:17:48Z)
- Nothing But Geometric Constraints: A Model-Free Method for Articulated Object Pose Estimation [89.82169646672872]
We propose an unsupervised vision-based system to estimate the joint configurations of the robot arm from a sequence of RGB or RGB-D images without knowing the model a priori.
We combine a classical geometric formulation with deep learning and extend the use of epipolar multi-rigid-body constraints to solve this task.
arXiv Detail & Related papers (2020-11-30T20:46:48Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
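The one-parameter-per-step refinement described above can be sketched with a greedy hill-climb in place of the learned RL policy: at each step, try nudging a single box parameter and keep the move that most improves a scoring function. The parameter names and step size are illustrative assumptions, not the paper's actual policy.

```python
def refine(box, score, params=("x", "y", "z", "w", "h", "l", "yaw"),
           step=0.1, iters=50):
    """Greedy stand-in for an RL refinement policy: change one 3D
    box parameter at a time, keeping only moves that raise score."""
    box = dict(box)
    best = score(box)
    for _ in range(iters):
        moved = False
        for p in params:
            for delta in (+step, -step):
                cand = dict(box)
                cand[p] += delta
                s = score(cand)
                if s > best:  # accept only strict improvements
                    best, box, moved = s, cand, True
        if not moved:  # local optimum reached for this step size
            break
    return box, best
```

In the paper the move selection is learned with reinforcement learning because the reward arrives only after several steps; the greedy loop here just makes the search structure concrete.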
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- Single View Metrology in the Wild [94.7005246862618]
We present a novel approach to single view metrology that can recover the absolute scale of a scene represented by 3D heights of objects or camera height above the ground.
Our method relies on data-driven priors learned by a deep network specifically designed to imbibe weakly supervised constraints from the interplay of the unknown camera with 3D entities such as object heights.
We demonstrate state-of-the-art qualitative and quantitative results on several datasets as well as applications including virtual object insertion.
arXiv Detail & Related papers (2020-07-18T22:31:33Z)
- Object-oriented SLAM using Quadrics and Symmetry Properties for Indoor Environments [11.069661312755034]
This paper proposes a sparse object-level SLAM algorithm based on an RGB-D camera.
A quadric representation is used as a landmark to compactly model objects, including their position, orientation, and occupied space.
Experiments have shown that, compared with the state-of-the-art algorithm, especially on the forward trajectory of mobile robots, the proposed algorithm significantly improves the accuracy and convergence speed of quadric reconstruction.
arXiv Detail & Related papers (2020-04-11T04:15:25Z)
- Real-Time Object Detection and Recognition on Low-Compute Humanoid Robots using Deep Learning [0.12599533416395764]
We describe a novel architecture that enables multiple low-compute NAO robots to perform real-time detection, recognition and localization of objects in its camera view.
The proposed algorithm for object detection and localization is an empirical modification of YOLOv3, based on indoor experiments in multiple scenarios.
The architecture also comprises an effective end-to-end pipeline that feeds real-time frames from the camera to the neural network and uses its results to guide the robot.
arXiv Detail & Related papers (2020-01-20T05:24:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.