PRGFlow: Benchmarking SWAP-Aware Unified Deep Visual Inertial Odometry
- URL: http://arxiv.org/abs/2006.06753v1
- Date: Thu, 11 Jun 2020 19:12:54 GMT
- Title: PRGFlow: Benchmarking SWAP-Aware Unified Deep Visual Inertial Odometry
- Authors: Nitin J. Sanket, Chahat Deep Singh, Cornelia Fermüller, Yiannis Aloimonos
- Abstract summary: We present a deep learning approach for visual translation estimation and loosely fuse it with an Inertial sensor for full 6DoF odometry estimation.
We evaluate our network on the MSCOCO dataset and evaluate the VI fusion on multiple real-flight trajectories.
- Score: 14.077054191270213
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Odometry on aerial robots has to be of low latency and high robustness whilst
also respecting the Size, Weight, Area and Power (SWAP) constraints as demanded
by the size of the robot. Visual sensors coupled with Inertial Measurement
Units (IMUs) have proven to be the best combination for obtaining robust and
low-latency odometry on resource-constrained aerial robots. Recently, deep
learning approaches for Visual Inertial fusion have gained momentum due to
their high accuracy and robustness. However, the remarkable advantages of
these techniques are their inherent scalability (adaptation to different-sized
aerial robots) and unification (the same method works on different-sized
aerial robots), achieved by utilizing compression methods and hardware
acceleration, both of which have been lacking in previous approaches.
To this end, we present a deep learning approach for visual translation
estimation and loosely fuse it with an Inertial sensor for full 6DoF odometry
estimation. We also present a detailed benchmark comparing different
architectures, loss functions and compression methods to enable scalability. We
evaluate our network on the MSCOCO dataset and evaluate the VI fusion on
multiple real-flight trajectories.
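For illustration only, the loose fusion described in the abstract can be sketched as follows: attitude is taken from the IMU's orientation estimate while the network supplies a per-frame translation, and the two are composed into a running 6DoF trajectory. This is a minimal reconstruction under assumed conventions (NumPy, quaternion attitude input, body-frame translations), not the paper's exact pipeline.

```python
import numpy as np

def quat_to_rot(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

def loosely_fused_odometry(imu_attitudes, net_translations):
    """Accumulate a 6DoF trajectory by composing IMU attitude (rotation)
    with network-predicted per-frame translation (loose-fusion sketch).

    imu_attitudes:    list of unit quaternions, one per frame (body-to-world).
    net_translations: list of 3-vectors, translation between consecutive
                      frames expressed in the body/camera frame.
    """
    position = np.zeros(3)
    trajectory = [position.copy()]
    for q, t_body in zip(imu_attitudes[1:], net_translations):
        R = quat_to_rot(q)                 # rotation taken from the IMU estimate
        position = position + R @ t_body   # rotate the predicted step into the world frame
        trajectory.append(position.copy())
    return np.array(trajectory)
```

Depending on the sensor suite, the network's translation would typically need metric scaling (e.g., from an altimeter) before being composed into the trajectory.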
Related papers
- FBRT-YOLO: Faster and Better for Real-Time Aerial Image Detection [21.38164867490915]
We propose a new family of real-time detectors for aerial image detection, named FBRT-YOLO, to address the imbalance between detection accuracy and efficiency.
FCM focuses on alleviating the problem of information imbalance caused by the loss of small target information in deep networks.
MKP leverages convolutions with kernels of different sizes to enhance the relationships between targets of various scales.
arXiv Detail & Related papers (2025-04-29T11:53:54Z)
- RoMeO: Robust Metric Visual Odometry [11.381243799745729]
Visual odometry (VO) aims to estimate camera poses from visual inputs -- a fundamental building block for many applications such as VR/AR and robotics.
Existing approaches lack robustness under such challenging scenarios and fail to generalize to unseen data (especially outdoors).
We propose Robust Metric Visual Odometry (RoMeO), a novel method that resolves these issues leveraging priors from pre-trained depth models.
arXiv Detail & Related papers (2024-12-16T08:08:35Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Random resistive memory-based deep extreme point learning machine for unified visual processing [67.51600474104171]
We propose a novel hardware-software co-design: a random resistive memory-based deep extreme point learning machine (DEPLM).
Our co-design system achieves huge energy efficiency improvements and training cost reduction when compared to conventional systems.
arXiv Detail & Related papers (2023-12-14T09:46:16Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated Learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
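As a rough illustration of the aggregation error noted in the entry above, the sketch below simulates one over-the-air FedAvg round: all devices transmit their updates simultaneously, the receiver observes their superposition corrupted by channel noise, and the resulting average deviates from the ideal FedAvg average. The Gaussian-noise channel and all names are illustrative assumptions, not the paper's system model.

```python
import numpy as np

def air_fedavg_round(local_updates, noise_std=0.01, rng=None):
    """One over-the-air aggregation round (illustrative sketch).

    local_updates: list of 1-D parameter vectors, one per edge device.
    Returns (noisy_average, ideal_average) so the aggregation error can
    be inspected directly.
    """
    rng = np.random.default_rng() if rng is None else rng
    stacked = np.stack(local_updates)              # shape: (num_clients, dim)
    superposed = stacked.sum(axis=0)               # analog superposition over the air
    noise = rng.normal(0.0, noise_std, size=superposed.shape)  # assumed channel noise
    noisy_average = (superposed + noise) / len(local_updates)
    ideal_average = stacked.mean(axis=0)           # error-free FedAvg average
    return noisy_average, ideal_average

# Example: the gap between the two outputs is the aggregation error.
updates = [np.random.randn(5) for _ in range(4)]
noisy, ideal = air_fedavg_round(updates)
print(np.linalg.norm(noisy - ideal))
```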
- A Distance-Geometric Method for Recovering Robot Joint Angles From an RGB Image [7.971699294672282]
We present a novel method for retrieving the joint angles of a robot manipulator using only a single RGB image of its current configuration.
Our approach, based on a distance-geometric representation of the configuration space, exploits the knowledge of a robot's kinematic model.
arXiv Detail & Related papers (2023-01-05T12:57:45Z)
- Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z)
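One plausible reading of the adaptive length re-scaling (ALR) strategy mentioned above is to rescale the predicted novel-class weight vectors so that their norms match the average norm of the pretrained base-class weights; the sketch below implements that reading and should be taken as an assumption rather than the paper's exact formulation.

```python
import numpy as np

def adaptive_length_rescale(novel_weights, base_weights):
    """Rescale novel classifier weight vectors so their lengths match the
    average length of the pretrained base weights (illustrative sketch).

    novel_weights: (num_novel_classes, dim) predicted novel-class weights.
    base_weights:  (num_base_classes, dim) pretrained base-class weights.
    """
    target_norm = np.linalg.norm(base_weights, axis=1).mean()
    novel_norms = np.linalg.norm(novel_weights, axis=1, keepdims=True)
    # Scale each novel weight vector to the average base-weight norm.
    return novel_weights * (target_norm / np.clip(novel_norms, 1e-8, None))
```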
- AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration [16.032316550612336]
We present AirDet, which is free of fine-tuning by learning class relation with support images.
AirDet achieves comparable or even better results than the exhaustively finetuned methods, reaching up to 40-60% improvements on the baseline.
We present evaluation results on real-world exploration tests from the DARPA Subterranean Challenge.
arXiv Detail & Related papers (2021-12-03T06:41:07Z)
- Learning suction graspability considering grasp quality and robot reachability for bin-picking [4.317666242093779]
We propose an intuitive geometric analytic-based grasp quality evaluation metric.
We further incorporate a reachability evaluation metric.
Experiment results show that our intuitive grasp quality evaluation metric is competitive with a physically-inspired metric.
arXiv Detail & Related papers (2021-11-04T00:55:42Z)
- GEM: Glare or Gloom, I Can Still See You -- End-to-End Multimodal Object Detector [11.161639542268015]
We propose sensor-aware multi-modal fusion strategies for 2D object detection in harsh-lighting conditions.
Our network learns to estimate the measurement reliability of each sensor modality in the form of scalar weights and masks.
We show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset.
arXiv Detail & Related papers (2021-02-24T14:56:37Z)
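The reliability-weighted fusion described in the GEM entry above can be sketched as a weighted sum of modality features, where each scalar weight reflects the estimated reliability of that sensor. The global-pooling-plus-linear-scoring used here is an assumed stand-in for the learned weighting network.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reliability_weighted_fusion(rgb_feat, thermal_feat, w_rgb, w_thermal):
    """Fuse two modality feature maps with scalar reliability weights
    (illustrative sketch; the scoring heads here are simple linear maps).

    rgb_feat, thermal_feat: (C, H, W) feature maps from each modality branch.
    w_rgb, w_thermal:       (C,) parameters of the per-modality scoring heads.
    """
    # Global-average-pool each modality, score it, and normalize the scores.
    scores = np.array([
        w_rgb @ rgb_feat.mean(axis=(1, 2)),
        w_thermal @ thermal_feat.mean(axis=(1, 2)),
    ])
    alpha = softmax(scores)  # scalar reliability weights summing to one
    return alpha[0] * rgb_feat + alpha[1] * thermal_feat
```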
- IMU Preintegrated Features for Efficient Deep Inertial Odometry [0.0]
Inertial measurement units (IMUs), as ubiquitous proprioceptive motion measurement devices, are available on various gadgets and robotic platforms.
Direct inference of geometrical transformations or odometry based on these data alone is a challenging task.
This paper proposes the IMU preintegrated features as a replacement for the raw IMU data in deep inertial odometry.
arXiv Detail & Related papers (2020-07-06T17:58:35Z)
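For context on the entry above: preintegrated IMU features are commonly the relative rotation, velocity, and position increments accumulated from raw gyroscope and accelerometer samples between two frames, as in standard IMU preintegration. The sketch below computes these increments (bias and noise terms omitted) and is a generic illustration, not necessarily the paper's exact feature definition.

```python
import numpy as np

def so3_exp(w):
    """Matrix exponential of a rotation vector (Rodrigues' formula)."""
    theta = np.linalg.norm(w)
    if theta < 1e-9:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]],
                  [k[2], 0, -k[0]],
                  [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Accumulate rotation, velocity and position increments from raw IMU
    samples between two frames (biases and noise omitted for brevity).

    gyro, accel: (N, 3) angular rate [rad/s] and acceleration [m/s^2] samples.
    dt:          sampling period [s].
    Returns (dR, dv, dp): preintegrated increments expressed in the first frame.
    """
    dR = np.eye(3)
    dv = np.zeros(3)
    dp = np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2  # position increment
        dv = dv + (dR @ a) * dt                     # velocity increment
        dR = dR @ so3_exp(w * dt)                   # rotation increment
    return dR, dv, dp
```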
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
- Deep Soft Procrustes for Markerless Volumetric Sensor Alignment [81.13055566952221]
In this work, we improve markerless data-driven correspondence estimation to achieve more robust multi-sensor spatial alignment.
We incorporate geometric constraints in an end-to-end manner into a typical segmentation based model and bridge the intermediate dense classification task with the targeted pose estimation one.
Our model is experimentally shown to achieve similar results with marker-based methods and outperform the markerless ones, while also being robust to the pose variations of the calibration structure.
arXiv Detail & Related papers (2020-03-23T10:51:32Z)
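The "soft Procrustes" formulation above builds on the classical orthogonal Procrustes (Kabsch) alignment; as background, the sketch below shows that underlying closed-form step, recovering the rigid transform between corresponding 3D point sets via SVD. It is the standard textbook solution, not the paper's differentiable soft variant.

```python
import numpy as np

def procrustes_align(src, dst):
    """Closed-form rigid alignment (Kabsch / orthogonal Procrustes).

    src, dst: (N, 3) corresponding 3D points.
    Returns (R, t) such that dst is approximately src @ R.T + t.
    """
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)   # SVD of the 3x3 cross-covariance
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t
```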