Learning Monocular Depth from Events via Egomotion Compensation
- URL: http://arxiv.org/abs/2412.19067v1
- Date: Thu, 26 Dec 2024 05:41:18 GMT
- Title: Learning Monocular Depth from Events via Egomotion Compensation
- Authors: Haitao Meng, Chonghao Zhong, Sheng Tang, Lian JunJia, Wenwei Lin, Zhenshan Bing, Yi Chang, Gang Chen, Alois Knoll,
- Abstract summary: Event cameras are neuromorphically inspired sensors that sparsely and asynchronously report brightness changes.
We propose an interpretable monocular depth estimation framework, where the likelihood of various depth hypotheses is explicitly determined by the effect of motion compensation.
Our proposed framework outperforms cutting-edge methods by up to 10% in terms of the absolute relative error metric.
- Score: 20.388521240421948
- License:
- Abstract: Event cameras are neuromorphically inspired sensors that sparsely and asynchronously report brightness changes. Their unique characteristics of high temporal resolution, high dynamic range, and low power consumption make them well-suited for addressing challenges in monocular depth estimation (e.g., high-speed or low-lighting conditions). However, current existing methods primarily treat event streams as black-box learning systems without incorporating prior physical principles, thus becoming over-parameterized and failing to fully exploit the rich temporal information inherent in event camera data. To address this limitation, we incorporate physical motion principles to propose an interpretable monocular depth estimation framework, where the likelihood of various depth hypotheses is explicitly determined by the effect of motion compensation. To achieve this, we propose a Focus Cost Discrimination (FCD) module that measures the clarity of edges as an essential indicator of focus level and integrates spatial surroundings to facilitate cost estimation. Furthermore, we analyze the noise patterns within our framework and improve it with the newly introduced Inter-Hypotheses Cost Aggregation (IHCA) module, where the cost volume is refined through cost trend prediction and multi-scale cost consistency constraints. Extensive experiments on real-world and synthetic datasets demonstrate that our proposed framework outperforms cutting-edge methods by up to 10\% in terms of the absolute relative error metric, revealing superior performance in predicting accuracy.
Related papers
- Towards Resource-Efficient Federated Learning in Industrial IoT for Multivariate Time Series Analysis [50.18156030818883]
Anomaly and missing data constitute a thorny problem in industrial applications.
Deep learning enabled anomaly detection has emerged as a critical direction.
The data collected in edge devices contain user privacy.
arXiv Detail & Related papers (2024-11-06T15:38:31Z) - Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation [34.529280562470746]
We introduce a novel self-supervised loss combining the Contrast Maximization framework with a non-linear motion prior in the form of pixel-level trajectories.
Their effectiveness is demonstrated in two scenarios: In dense continuous-time motion estimation, our method improves the zero-shot performance of a synthetically trained model by 29%.
arXiv Detail & Related papers (2024-07-15T15:18:28Z) - Self-supervised Event-based Monocular Depth Estimation using Cross-modal
Consistency [18.288912105820167]
We propose a self-supervised event-based monocular depth estimation framework named EMoDepth.
EMoDepth constrains the training process using the cross-modal consistency from intensity frames that are aligned with events in the pixel coordinate.
In inference, only events are used for monocular depth prediction.
arXiv Detail & Related papers (2024-01-14T07:16:52Z) - DS-Depth: Dynamic and Static Depth Estimation via a Fusion Cost Volume [26.990400985745786]
We propose a novel dynamic cost volume that exploits residual optical flow to describe moving objects.
The results demonstrate that our model outperforms previously published baselines for self-supervised monocular depth estimation.
arXiv Detail & Related papers (2023-08-14T15:57:42Z) - FEDORA: Flying Event Dataset fOr Reactive behAvior [9.470870778715689]
Event-based sensors have emerged as low latency and low energy alternatives to standard frame-based cameras for capturing high-speed motion.
We present Flying Event dataset fOr Reactive behAviour (FEDORA) - a fully synthetic dataset for perception tasks.
arXiv Detail & Related papers (2023-05-22T22:59:05Z) - Practical Exposure Correction: Great Truths Are Always Simple [65.82019845544869]
We establish a Practical Exposure Corrector (PEC) that assembles the characteristics of efficiency and performance.
We introduce an exposure adversarial function as the key engine to fully extract valuable information from the observation.
Our experiments fully reveal the superiority of our proposed PEC.
arXiv Detail & Related papers (2022-12-29T09:52:13Z) - Rethinking Cost-sensitive Classification in Deep Learning via
Adversarial Data Augmentation [4.479834103607382]
Cost-sensitive classification is critical in applications where misclassification errors widely vary in cost.
This paper proposes a cost-sensitive adversarial data augmentation framework to make over- parameterized models cost-sensitive.
Our method can effectively minimize the overall cost and reduce critical errors, while achieving comparable performance in terms of overall accuracy.
arXiv Detail & Related papers (2022-08-24T19:00:30Z) - Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular
Depth Estimation by Integrating IMU Motion Dynamics [74.1720528573331]
Unsupervised monocular depth and ego-motion estimation has drawn extensive research attention in recent years.
We propose DynaDepth, a novel scale-aware framework that integrates information from vision and IMU motion dynamics.
We validate the effectiveness of DynaDepth by conducting extensive experiments and simulations on the KITTI and Make3D datasets.
arXiv Detail & Related papers (2022-07-11T07:50:22Z) - DeepRM: Deep Recurrent Matching for 6D Pose Refinement [77.34726150561087]
DeepRM is a novel recurrent network architecture for 6D pose refinement.
The architecture incorporates LSTM units to propagate information through each refinement step.
DeepRM achieves state-of-the-art performance on two widely accepted challenging datasets.
arXiv Detail & Related papers (2022-05-28T16:18:08Z) - Learning Monocular Dense Depth from Events [53.078665310545745]
Event cameras produce brightness changes in the form of a stream of asynchronous events instead of intensity frames.
Recent learning-based approaches have been applied to event-based data, such as monocular depth prediction.
We propose a recurrent architecture to solve this task and show significant improvement over standard feed-forward methods.
arXiv Detail & Related papers (2020-10-16T12:36:23Z) - Object-based Illumination Estimation with Rendering-aware Neural
Networks [56.01734918693844]
We present a scheme for fast environment light estimation from the RGBD appearance of individual objects and their local image areas.
With the estimated lighting, virtual objects can be rendered in AR scenarios with shading that is consistent to the real scene.
arXiv Detail & Related papers (2020-08-06T08:23:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.