Long Range Object-Level Monocular Depth Estimation for UAVs
- URL: http://arxiv.org/abs/2302.08943v1
- Date: Fri, 17 Feb 2023 15:26:04 GMT
- Title: Long Range Object-Level Monocular Depth Estimation for UAVs
- Authors: David Silva, Nicolas Jourdan, Nils Gählert
- Abstract summary: We propose several novel extensions to state-of-the-art methods for monocular object detection from images at long range.
Firstly, we propose Sigmoid and ReLU-like encodings when modeling depth estimation as a regression task.
Secondly, we frame the depth estimation as a classification problem and introduce a Soft-Argmax function in the calculation of the training loss.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computer vision-based object detection is a key modality for advanced
Detect-And-Avoid systems that allow for autonomous flight missions of UAVs.
While standard object detection frameworks do not predict the actual depth of
an object, this information is crucial to avoid collisions. In this paper, we
propose several novel extensions to state-of-the-art methods for monocular
object detection from images at long range. Firstly, we propose Sigmoid and
ReLU-like encodings when modeling depth estimation as a regression task.
Secondly, we frame the depth estimation as a classification problem and
introduce a Soft-Argmax function in the calculation of the training loss. The
extensions are exemplarily applied to the YOLOX object detection framework. We
evaluate the performance using the Amazon Airborne Object Tracking dataset. In
addition, we introduce the Fitness score as a new metric that jointly assesses
both object detection and depth estimation performance. Our results show that
the proposed methods outperform state-of-the-art approaches w.r.t. existing, as
well as the proposed metrics.
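The abstract names two ideas without giving formulas: a bounded (Sigmoid-like) encoding for depth regression, and a Soft-Argmax over depth classes. The following is a minimal, generic sketch of both, not the authors' implementation. The depth range, the number of bins, the bin centers, and the function names `sigmoid_depth` and `soft_argmax_depth` are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid_depth(raw, d_min=0.0, d_max=1000.0):
    """Map an unbounded network output to a depth in (d_min, d_max).

    d_min and d_max are hypothetical range limits in meters.
    """
    return d_min + (d_max - d_min) / (1.0 + np.exp(-raw))


def soft_argmax_depth(logits, bin_centers):
    """Differentiable depth estimate from per-bin classification logits.

    Computes the expected bin center under the softmax distribution,
    which is the usual definition of a soft-argmax.
    """
    # Subtract the maximum for numerical stability before exponentiating.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    return (probs * bin_centers).sum(axis=-1)


# Hypothetical setup: 10 depth bins with centers from 50 m to 950 m.
bin_centers = np.linspace(50.0, 950.0, 10)

# Logits that strongly favor the bin centered at 350 m.
logits = np.zeros(10)
logits[3] = 5.0

depth_from_classes = soft_argmax_depth(logits, bin_centers)
depth_from_regression = sigmoid_depth(0.0)
```

Because the soft-argmax is a weighted average rather than a hard selection, it yields a continuous depth value that a regression-style loss can be applied to, while the network still outputs class scores.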
Related papers
- On the Inherent Robustness of One-Stage Object Detection against Out-of-Distribution Data [6.7236795813629]
We propose a novel detection algorithm for detecting unknown objects in image data.
It exploits supervised dimensionality reduction techniques to mitigate the effects of the curse of dimensionality on the features extracted by the model.
It utilizes high-resolution feature maps to identify potential unknown objects in an unsupervised fashion.
arXiv Detail & Related papers (2024-11-07T10:15:25Z) - TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs [5.6168844664788855]
This work presents TanDepth, a practical, online scale recovery method for obtaining metric depth results from relative estimations at inference-time.
Tailored for Unmanned Aerial Vehicle (UAV) applications, our method leverages sparse measurements from Global Digital Elevation Models (GDEM) by projecting them to the camera view.
An adaptation to the Cloth Simulation Filter is presented, which allows selecting ground points from the estimated depth map to then correlate with the projected reference points.
arXiv Detail & Related papers (2024-09-08T15:54:43Z) - Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation [16.671696289301625]
This paper presents a deep-learning framework that utilizes optical sensors for the detection, tracking, and distance estimation of non-cooperative aerial vehicles.
In this work, we propose a method for estimating the distance information of a detected aerial object in real time using only the input of a monocular camera.
arXiv Detail & Related papers (2024-05-10T18:06:41Z) - Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z) - Incremental Object-Based Novelty Detection with Feedback Loop [18.453867533201308]
Object-based Novelty Detection (ND) aims to identify unknown objects that do not belong to classes seen during training.
Traditional approaches to ND focus on one-time, offline post-processing of the pretrained object detection output.
We propose a novel framework for object-based ND, assuming that human feedback can be requested on the predicted output.
arXiv Detail & Related papers (2023-11-15T14:46:20Z) - Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z) - SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes [58.89295356901823]
Self-supervised monocular depth estimation has shown impressive results in static scenes.
However, it relies on the multi-view consistency assumption for training networks, which is violated in dynamic object regions.
We introduce an external pretrained monocular depth estimation model for generating single-image depth prior.
Our model can predict sharp and accurate depth maps, even when training from monocular videos of highly-dynamic scenes.
arXiv Detail & Related papers (2022-11-07T16:17:47Z) - Detecting Invisible People [58.49425715635312]
We re-purpose tracking benchmarks and propose new metrics for the task of detecting invisible objects.
We demonstrate that current detection and tracking systems perform dramatically worse on this task.
Second, we build dynamic models that explicitly reason in 3D, making use of observations produced by state-of-the-art monocular depth estimation networks.
arXiv Detail & Related papers (2020-12-15T16:54:45Z) - Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)