DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
- URL: http://arxiv.org/abs/2508.11591v1
- Date: Fri, 15 Aug 2025 16:55:12 GMT
- Title: DashCam Video: A complementary low-cost data stream for on-demand forest-infrastructure system monitoring
- Authors: Durga Joshi, Chandi Witharana, Robert Fahey, Thomas Worthley, Zhe Zhu, Diego Cerrai,
- Abstract summary: This study introduces a novel, low-cost, and reproducible framework for real-time, object-level structural assessment and geolocation of roadside vegetation and infrastructure.<n>We developed an end-to-end pipeline that combines monocular depth estimation, depth error correction, and geometric triangulation to generate accurate spatial and structural data from vehicle-mounted dashcams.<n>Our approach complements conventional RS methods, such as LiDAR and image by offering a fast, real-time, and cost-effective solution for object-level monitoring of vegetation risks and infrastructure exposure.
- Score: 1.6064410860203764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Our study introduces a novel, low-cost, and reproducible framework for real-time, object-level structural assessment and geolocation of roadside vegetation and infrastructure with commonly available but underutilized dashboard camera (dashcam) video data. We developed an end-to-end pipeline that combines monocular depth estimation, depth error correction, and geometric triangulation to generate accurate spatial and structural data from street-level video streams from vehicle-mounted dashcams. Depth maps were first estimated using a state-of-the-art monocular depth model, then refined via a gradient-boosted regression framework to correct underestimations, particularly for distant objects. The depth correction model achieved strong predictive performance (R2 = 0.92, MAE = 0.31 on transformed scale), significantly reducing bias beyond 15 m. Further, object locations were estimated using GPS-based triangulation, while object heights were calculated using pin hole camera geometry. Our method was evaluated under varying conditions of camera placement and vehicle speed. Low-speed vehicle with inside camera gave the highest accuracy, with mean geolocation error of 2.83 m, and mean absolute error (MAE) in height estimation of 2.09 m for trees and 0.88 m for poles. To the best of our knowledge, it is the first framework to combine monocular depth modeling, triangulated GPS-based geolocation, and real-time structural assessment for urban vegetation and infrastructure using consumer-grade video data. Our approach complements conventional RS methods, such as LiDAR and image by offering a fast, real-time, and cost-effective solution for object-level monitoring of vegetation risks and infrastructure exposure, making it especially valuable for utility companies, and urban planners aiming for scalable and frequent assessments in dynamic urban environments.
Related papers
- Gamma-from-Mono: Road-Relative, Metric, Self-Supervised Monocular Geometry for Vehicular Applications [2.9457242478147503]
We introduce Gamma-from-Mono (GfM), a lightweight monocular geometry estimation method.<n>GfM predicts a dominant road surface plane together with residual variations expressed by gamma.<n>With only the camera's height above ground, GfM deterministically recovers metric depth via a closed form.
arXiv Detail & Related papers (2025-12-03T22:37:38Z) - 2.5D Object Detection for Intelligent Roadside Infrastructure [37.07785188366053]
We introduce a 2.5D object detection framework for infrastructure roadside-mounted cameras.<n>We employ a prediction approach to detect ground planes of vehicles as parallelograms in the image frame.<n>Our results show high detection accuracy, strong cross-viewpoint generalization, and robustness to diverse lighting and weather conditions.
arXiv Detail & Related papers (2025-07-04T13:16:59Z) - NOVA: Navigation via Object-Centric Visual Autonomy for High-Speed Target Tracking in Unstructured GPS-Denied Environments [56.35569661650558]
We introduce NOVA, a fully onboard, object-centric framework that enables robust target tracking and collision-aware navigation.<n>Rather than constructing a global map, NOVA formulates perception, estimation, and control entirely in the target's reference frame.<n>We validate NOVA across challenging real-world scenarios, including urban mazes, forest trails, and repeated transitions through buildings with intermittent GPS loss.
arXiv Detail & Related papers (2025-06-23T14:28:30Z) - Semi-SD: Semi-Supervised Metric Depth Estimation via Surrounding Cameras for Autonomous Driving [20.19617659712535]
Semi-SD is a novel metric depth estimation framework tailored for surrounding cameras equipment in autonomous driving.<n>We propose a unified spatial-temporal-semantic fusion module to construct the visual fused features.<n>We evaluate our algorithm on DDAD and nuScenes datasets, and the results demonstrate that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2025-03-25T14:39:04Z) - Boosting 3-DoF Ground-to-Satellite Camera Localization Accuracy via
Geometry-Guided Cross-View Transformer [66.82008165644892]
We propose a method to increase the accuracy of a ground camera's location and orientation by estimating the relative rotation and translation between the ground-level image and its matched/retrieved satellite image.
Experimental results demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-07-16T11:52:27Z) - Vision Transformers, a new approach for high-resolution and large-scale
mapping of canopy heights [50.52704854147297]
We present a new vision transformer (ViT) model optimized with a classification (discrete) and a continuous loss function.
This model achieves better accuracy than previously used convolutional based approaches (ConvNets) optimized with only a continuous loss function.
arXiv Detail & Related papers (2023-04-22T22:39:03Z) - Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, which is efficient and applicable for real-time panorama HD map reconstruction.
arXiv Detail & Related papers (2022-11-15T13:52:41Z) - Real-Time Monocular Human Depth Estimation and Segmentation on Embedded
Systems [13.490605853268837]
Estimating a scene's depth to achieve collision avoidance against moving pedestrians is a crucial and fundamental problem in the robotic field.
This paper proposes a novel, low complexity network architecture for fast and accurate human depth estimation and segmentation in indoor environments.
arXiv Detail & Related papers (2021-08-24T03:26:08Z) - Multi-view Depth Estimation using Epipolar Spatio-Temporal Networks [87.50632573601283]
We present a novel method for multi-view depth estimation from a single video.
Our method achieves temporally coherent depth estimation results by using a novel Epipolar Spatio-Temporal (EST) transformer.
To reduce the computational cost, inspired by recent Mixture-of-Experts models, we design a compact hybrid network.
arXiv Detail & Related papers (2020-11-26T04:04:21Z) - Road Curb Detection and Localization with Monocular Forward-view Vehicle
Camera [74.45649274085447]
We propose a robust method for estimating road curb 3D parameters using a calibrated monocular camera equipped with a fisheye lens.
Our approach is able to estimate the vehicle to curb distance in real time with mean accuracy of more than 90%.
arXiv Detail & Related papers (2020-02-28T00:24:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.