Efficient Deep Visual and Inertial Odometry with Adaptive Visual
Modality Selection
- URL: http://arxiv.org/abs/2205.06187v1
- Date: Thu, 12 May 2022 16:17:49 GMT
- Authors: Mingyu Yang, Yu Chen, Hun-Seok Kim
- Abstract summary: We propose an adaptive deep-learning-based VIO method that reduces computational redundancy by opportunistically disabling the visual modality.
A Gumbel-Softmax trick is adopted to train the policy network, making the decision process differentiable for end-to-end system training.
Experimental results show that our method achieves similar or even better performance than the full-modality baseline.
- Score: 12.754974372231647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, deep learning-based approaches for visual-inertial odometry
(VIO) have shown remarkable performance, outperforming traditional geometric
methods. Yet all existing methods use both the visual and inertial
measurements for every pose estimation, incurring potential computational
redundancy. While visual data processing is much more expensive than that for
the inertial measurement unit (IMU), it may not always contribute to improving
the pose estimation accuracy. In this paper, we propose an adaptive
deep-learning based VIO method that reduces computational redundancy by
opportunistically disabling the visual modality. Specifically, we train a
policy network that learns to deactivate the visual feature extractor on the
fly based on the current motion state and IMU readings. A Gumbel-Softmax trick
is adopted to train the policy network to make the decision process
differentiable for end-to-end system training. The learned strategy is
interpretable, and it shows scenario-dependent decision patterns for adaptive
complexity reduction. Experimental results show that our method achieves
similar or even better performance than the full-modality baseline with up to
78.8% computational complexity reduction on the KITTI dataset. Our code
will be shared at https://github.com/mingyuyng/Visual-Selective-VIO
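
The Gumbel-Softmax step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two-way decision layout (index 0 = skip vision, index 1 = use vision), the temperature `tau`, and the example logits are all assumptions made for illustration. The sketch shows only the forward sampling. In an autograd framework, a common straight-through variant would use the hard one-hot decision in the forward pass and let gradients flow through the soft probabilities.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample via the inverse-CDF transform."""
    u = rng.random()
    # Clamp away from 0 and 1 so both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau, rng):
    """Return (soft, hard): relaxed probabilities and their one-hot argmax."""
    # Perturb each logit with Gumbel noise, then scale by the temperature.
    noisy = [(l + sample_gumbel(rng)) / tau for l in logits]
    m = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    k = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return soft, hard


def visual_usage_rate(logits_per_frame, tau=1.0, seed=0):
    """Fraction of frames whose sampled decision keeps the visual encoder."""
    rng = random.Random(seed)
    used = 0
    for logits in logits_per_frame:
        _, hard = gumbel_softmax(logits, tau, rng)
        # Index 1 means "use vision" (assumed convention, not from the paper).
        used += int(hard[1] == 1.0)
    return used / len(logits_per_frame)


if __name__ == "__main__":
    # Hypothetical policy logits for five frames: [skip vision, use vision].
    frames = [[2.0, 0.0], [0.5, 0.5], [0.0, 3.0], [1.0, 0.0], [0.0, 0.0]]
    print("visual encoder usage rate:", visual_usage_rate(frames, tau=0.5))
```

A lower usage rate means the expensive visual feature extractor runs on fewer frames, which is where the compute saving in a scheme like this would come from. How that rate is traded off against pose accuracy during training is not specified in the abstract.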
Related papers
- Enhancing Digital Hologram Reconstruction Using Reverse-Attention Loss for Untrained Physics-Driven Deep Learning Models with Uncertain Distance [10.788482076164314]
We present a pioneering approach to addressing the Autofocusing challenge in untrained deep-learning methods.
Our method shows a significant improvement in reconstruction performance over rival methods.
For example, the difference is less than 1dB in PSNR and 0.002 in SSIM for the target sample.
arXiv Detail & Related papers (2024-01-11T01:30:46Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Online Network Source Optimization with Graph-Kernel MAB [62.6067511147939]
We propose Grab-UCB, a graph-kernel multi-armed bandit algorithm that learns online the optimal source placement in large-scale networks.
We describe the network processes with an adaptive graph dictionary model, which typically leads to sparse spectral representations.
We derive the performance guarantees that depend on network parameters, which further influence the learning curve of the sequential decision strategy.
arXiv Detail & Related papers (2023-07-07T15:03:42Z)
- Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems.
We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately.
Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z)
- Exploiting Invariance in Training Deep Neural Networks [4.169130102668252]
Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks.
The resulting algorithm requires less parameter tuning, trains well with an initial learning rate of 1.0, and easily generalizes to different tasks.
Tested on ImageNet, MS COCO, and Cityscapes datasets, our proposed technique requires fewer iterations to train, surpasses all baselines by a large margin, seamlessly works on both small and large batch size training, and applies to different computer vision tasks of image classification, object detection, and semantic segmentation.
arXiv Detail & Related papers (2021-03-30T19:18:31Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
- Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [36.414471128890284]
We tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
Most existing methods assume that a consistent scale of depth and pose can be learned across all input samples.
We propose a novel system that explicitly disentangles scale from the network estimation.
arXiv Detail & Related papers (2020-04-03T00:28:09Z)
- Denoising IMU Gyroscopes with Deep Learning for Open-Loop Attitude Estimation [0.0]
This paper proposes a learning method for denoising gyroscopes of Inertial Measurement Units (IMUs) using ground truth data.
The obtained algorithm outperforms the state-of-the-art on the (unseen) test sequences.
arXiv Detail & Related papers (2020-02-25T08:04:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.