Stable Yaw Estimation of Boats from the Viewpoint of UAVs and USVs
- URL: http://arxiv.org/abs/2306.14056v1
- Date: Sat, 24 Jun 2023 20:47:37 GMT
- Title: Stable Yaw Estimation of Boats from the Viewpoint of UAVs and USVs
- Authors: Benjamin Kiefer, Timon Höfer, Andreas Zell
- Abstract summary: We propose a method based on HyperPosePDF for predicting the orientation of boats in the 6D space.
We extend HyperPosePDF to work in video-based scenarios, such that it yields robust orientation predictions across time.
- Score: 14.573513188682183
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Yaw estimation of boats from the viewpoint of unmanned aerial vehicles (UAVs)
and unmanned surface vehicles (USVs) or boats is a crucial task in various
applications such as 3D scene rendering, trajectory prediction, and navigation.
However, the lack of literature on yaw estimation of objects from the viewpoint
of UAVs has motivated us to address this domain. In this paper, we propose a
method based on HyperPosePDF for predicting the orientation of boats in the 6D
space. For that, we use existing datasets, such as PASCAL3D+ and our own
datasets, SeaDronesSee-3D and BOArienT, which we annotated manually. We extend
HyperPosePDF to work in video-based scenarios, such that it yields robust
orientation predictions across time. Naively applying HyperPosePDF frame by
frame yields independent single-point predictions, which can be far off and
often flip to an incorrect symmetric orientation on unseen or visually
different data. To alleviate this issue, we propose aggregating the probability
distributions of pose predictions, resulting in significantly improved
performance, as shown in our experimental evaluation. Our proposed method could
significantly benefit downstream tasks in marine robotics.
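The temporal aggregation idea from the abstract can be sketched as follows. This is a minimal, hypothetical illustration that discretizes orientation into yaw bins and fuses per-frame distributions over a sliding window; the actual method operates on the full SO(3) densities predicted by HyperPosePDF, and all function names here are invented:

```python
import numpy as np

def aggregate_yaw_distributions(frame_probs, window=5):
    """Fuse per-frame yaw distributions over a sliding temporal window.

    frame_probs: (T, B) array; row t is a probability distribution over
    B discretized yaw bins for frame t (a simplification of the SO(3)
    densities HyperPosePDF predicts).
    Returns the fused distributions and the argmax yaw (radians) per frame.
    """
    T, B = frame_probs.shape
    yaw_bins = np.linspace(-np.pi, np.pi, B, endpoint=False)
    fused = np.empty_like(frame_probs)
    for t in range(T):
        lo = max(0, t - window + 1)
        # Average the distributions in the window; a normalized product
        # (Bayesian fusion) would be a natural alternative.
        w = frame_probs[lo:t + 1].mean(axis=0)
        fused[t] = w / w.sum()
    return fused, yaw_bins[np.argmax(fused, axis=1)]
```

A single frame whose distribution peaks at a spurious symmetric orientation is outvoted by its temporal neighbors, which is the failure mode the abstract describes for naive per-frame prediction.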
Related papers
- RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception [98.76525636842177]
RoScenes is the largest multi-view roadside perception dataset.
Our dataset contains 21.13M 3D annotations within 64,000 m².
arXiv Detail & Related papers (2024-05-16T08:06:52Z)
- RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z)
- OccNeRF: Advancing 3D Occupancy Prediction in LiDAR-Free Environments [77.0399450848749]
We propose an OccNeRF method for training occupancy networks without 3D supervision.
We parameterize the reconstructed occupancy fields and reorganize the sampling strategy to align with the cameras' infinite perceptive range.
For semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
arXiv Detail & Related papers (2023-12-14T18:58:52Z)
- Metrically Scaled Monocular Depth Estimation through Sparse Priors for Underwater Robots [0.0]
We formulate a deep learning model that fuses sparse depth measurements from triangulated features to improve the depth predictions.
The network is trained in a supervised fashion on the forward-looking underwater dataset, FLSea.
The method achieves real-time performance, running at 160 FPS on a laptop GPU and 7 FPS on a single CPU core.
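The paper above fuses sparse depth inside the network, but the underlying idea — anchoring a relative depth prediction to a few metric measurements — can be illustrated with a post-hoc least-squares scale-and-shift alignment. This is a simplified, hypothetical stand-in for the learned fusion, not the paper's architecture:

```python
import numpy as np

def align_depth_to_sparse(pred_depth, sparse_depth, mask):
    """Fit scale s and shift t so that s * pred + t matches the sparse
    metric depths at the masked (triangulated-feature) pixels, then apply
    the correction to the whole prediction."""
    x = pred_depth[mask]          # predicted (relative) depths at sparse points
    y = sparse_depth[mask]        # metric depths from triangulated features
    A = np.stack([x, np.ones_like(x)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, y, rcond=None)
    return s * pred_depth + t
```

With at least two reliable sparse points, this recovers a globally consistent metric scale; the learned fusion in the paper can additionally correct spatially varying errors.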
arXiv Detail & Related papers (2023-10-25T16:32:31Z)
- A Novel Deep Neural Network for Trajectory Prediction in Automated Vehicles Using Velocity Vector Field [12.067838086415833]
This paper proposes a novel technique for trajectory prediction that combines a data-driven learning-based method with a velocity vector field (VVF) generated from a nature-inspired concept.
The accuracy remains consistent with decreasing observation windows which alleviates the requirement of a long history of past observations for accurate trajectory prediction.
arXiv Detail & Related papers (2023-09-19T22:14:52Z)
- Uncertainty-aware State Space Transformer for Egocentric 3D Hand Trajectory Forecasting [79.34357055254239]
Hand trajectory forecasting is crucial for enabling a prompt understanding of human intentions when interacting with AR/VR systems.
Existing methods handle this problem in a 2D image space which is inadequate for 3D real-world applications.
We set up an egocentric 3D hand trajectory forecasting task that aims to predict hand trajectories in a 3D space from early observed RGB videos in a first-person view.
arXiv Detail & Related papers (2023-07-17T04:55:02Z)
- A Review on Viewpoints and Path-planning for UAV-based 3D Reconstruction [3.0479044961661708]
3D reconstruction using the data captured by UAVs is also attracting attention in research and industry.
This review paper investigates a wide range of model-free and model-based algorithms for viewpoint and path planning for 3D reconstruction of large-scale objects.
arXiv Detail & Related papers (2022-05-07T20:29:39Z)
- SoK: Vehicle Orientation Representations for Deep Rotation Estimation [2.052323405257355]
We study the accuracy performance of various existing orientation representations using the KITTI 3D object detection dataset.
We propose a new form of orientation representation: Tricosine.
arXiv Detail & Related papers (2021-12-08T17:12:54Z)
- SLPC: a VRNN-based approach for stochastic lidar prediction and completion in autonomous driving [63.87272273293804]
We propose a new LiDAR prediction framework based on generative models, namely Variational Recurrent Neural Networks (VRNNs).
Our algorithm is able to address the limitations of previous video prediction frameworks when dealing with sparse data by spatially inpainting the depth maps in the upcoming frames.
We present a sparse version of VRNNs and an effective self-supervised training method that does not require any labels.
arXiv Detail & Related papers (2021-02-19T11:56:44Z)
- The Unsupervised Method of Vessel Movement Trajectory Prediction [1.2617078020344619]
This article presents an unsupervised method of ship movement trajectory prediction.
It represents the data in a three-dimensional space which consists of time difference between points, the scaled error distance between the tested and its predicted forward and backward locations, and the space-time angle.
Unlike most statistical learning or deep learning methods, the proposed clustering-based trajectory reconstruction method does not require computationally expensive model training.
arXiv Detail & Related papers (2020-07-27T17:45:21Z)
- Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting [106.3504366501894]
Self-driving vehicles and robotic manipulation systems often forecast future object poses by first detecting and tracking objects.
This detect-then-forecast pipeline is expensive to scale, as pose forecasting algorithms typically require labeled sequences of object poses.
We propose to first forecast 3D sensor data and then detect/track objects on the predicted point cloud sequences to obtain future poses.
This makes it less expensive to scale pose forecasting, as the sensor data forecasting task requires no labels.
arXiv Detail & Related papers (2020-03-18T17:54:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.