Balancing the Budget: Feature Selection and Tracking for Multi-Camera
Visual-Inertial Odometry
- URL: http://arxiv.org/abs/2109.05975v1
- Date: Mon, 13 Sep 2021 13:53:09 GMT
- Title: Balancing the Budget: Feature Selection and Tracking for Multi-Camera
Visual-Inertial Odometry
- Authors: Lintong Zhang, David Wisth, Marco Camurri, Maurice Fallon
- Abstract summary: We present a multi-camera visual-inertial odometry system based on factor graph optimization.
We focus on motion tracking in challenging environments such as in narrow corridors and dark spaces with aggressive motions and abrupt lighting changes.
- Score: 3.441021278275805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a multi-camera visual-inertial odometry system based on factor
graph optimization which estimates motion by using all cameras simultaneously
while retaining a fixed overall feature budget. We focus on motion tracking in
challenging environments such as in narrow corridors and dark spaces with
aggressive motions and abrupt lighting changes. These scenarios cause
traditional monocular or stereo odometry to fail. While tracking motion across
extra cameras should theoretically prevent failures, it causes additional
complexity and computational burden. To overcome these challenges, we introduce
two novel methods to improve multi-camera feature tracking. First, instead of
tracking features separately in each camera, we track features continuously as
they move from one camera to another. This increases accuracy and achieves a
more compact factor graph representation. Second, we select a fixed budget of
tracked features which are spread across the cameras to ensure that the limited
computational budget is never exceeded. We have found that using a smaller set
of informative features can maintain the same tracking accuracy while reducing
back-end optimization time. Our proposed method was extensively tested using a
hardware-synchronized device containing an IMU and four cameras (a front stereo
pair and two lateral) in scenarios including an underground mine, large open
spaces, and building interiors with narrow stairs and corridors. Compared to
stereo-only state-of-the-art VIO methods, our approach reduces the drift rate
(RPE) by up to 80% in translation and 39% in rotation.
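The abstract's second contribution, keeping a fixed total feature budget spread across all cameras, can be illustrated with a minimal sketch. The even-split-then-redistribute allocation rule, the per-feature scores, and the camera names below are illustrative assumptions, not the paper's actual selection criteria.

```python
# Minimal sketch of fixed-budget feature selection across cameras.
# The allocation rule (even split, then redistribute leftover slots
# by score) is an assumption for illustration only.

def select_features(features_by_camera, budget):
    """Keep at most `budget` features in total, spread across cameras.

    features_by_camera: dict mapping camera name -> list of
        (feature_id, score) pairs; higher score = more informative.
    Returns a dict of selected feature ids per camera.
    """
    cameras = list(features_by_camera)
    # Start with an even split of the budget across cameras.
    base = budget // len(cameras)
    selected = {}
    leftover = budget
    for cam in cameras:
        ranked = sorted(features_by_camera[cam], key=lambda f: f[1], reverse=True)
        take = min(base, len(ranked))
        selected[cam] = [fid for fid, _ in ranked[:take]]
        leftover -= take
    # Redistribute unused slots to the best remaining features anywhere.
    pool = []
    for cam in cameras:
        ranked = sorted(features_by_camera[cam], key=lambda f: f[1], reverse=True)
        pool.extend((score, cam, fid) for fid, score in ranked[len(selected[cam]):])
    for score, cam, fid in sorted(pool, reverse=True)[:leftover]:
        selected[cam].append(fid)
    return selected
```

With a four-camera rig like the one described (front stereo pair plus two lateral cameras), a camera that sees few trackable features (e.g. facing a dark wall) yields its unused slots to better-instrumented views, so the back-end cost stays bounded regardless of the scene.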
Related papers
- Redundancy-Aware Camera Selection for Indoor Scene Neural Rendering [54.468355408388675]
We build a similarity matrix that incorporates both the spatial diversity of the cameras and the semantic variation of the images.
We apply a diversity-based sampling algorithm to optimize the camera selection.
We also develop a new dataset, IndoorTraj, which includes long and complex camera movements captured by humans in virtual indoor environments.
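The diversity-based sampling over a camera similarity matrix described above can be sketched with a greedy max-min rule: repeatedly pick the camera whose worst-case similarity to the already-chosen set is lowest. The greedy rule itself is an assumption; the paper's actual sampling algorithm and similarity definition may differ.

```python
# Hedged sketch of diversity-based camera selection from a similarity
# matrix. The greedy max-min criterion is an illustrative assumption.
import numpy as np

def select_diverse_cameras(similarity, k, start=0):
    """Greedily pick k camera indices so that each new pick has the
    lowest maximum similarity to the cameras already chosen.

    similarity: (n, n) symmetric matrix; entry [i, j] combines spatial
        and semantic similarity between cameras i and j.
    """
    n = similarity.shape[0]
    chosen = [start]
    while len(chosen) < k:
        remaining = [i for i in range(n) if i not in chosen]
        # For each candidate, its worst-case similarity to the chosen set.
        worst = {i: max(similarity[i, j] for j in chosen) for i in remaining}
        chosen.append(min(worst, key=worst.get))
    return chosen
```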
arXiv Detail & Related papers (2024-09-11T08:36:49Z)
- Improved Single Camera BEV Perception Using Multi-Camera Training [4.003066044908734]
In large-scale production, cost efficiency is an optimization goal, so that using fewer cameras becomes more relevant.
This raises the problem of developing a BEV perception model that provides a sufficient performance on a low-cost sensor setup.
The objective of our approach is to reduce the aforementioned performance drop as much as possible using a modern multi-camera surround view model reduced for single-camera inference.
arXiv Detail & Related papers (2024-09-04T13:06:40Z)
- VICAN: Very Efficient Calibration Algorithm for Large Camera Networks [49.17165360280794]
We introduce a novel methodology that extends Pose Graph Optimization techniques.
We consider the bipartite graph encompassing cameras, object poses evolving dynamically, and camera-object relative transformations at each time step.
Our framework retains compatibility with traditional PGO solvers, but its efficacy benefits from a custom-tailored optimization scheme.
arXiv Detail & Related papers (2024-03-25T17:47:03Z)
- SDGE: Stereo Guided Depth Estimation for 360$^\circ$ Camera Sets [65.64958606221069]
Multi-camera systems are often used in autonomous driving to achieve a 360$^\circ$ perception.
These 360$^\circ$ camera sets often have limited or low-quality overlap regions, making multi-view stereo methods infeasible for the entire image.
We propose the Stereo Guided Depth Estimation (SGDE) method, which enhances depth estimation of the full image by explicitly utilizing multi-view stereo results on the overlap.
arXiv Detail & Related papers (2024-02-19T02:41:37Z)
- Robust Frame-to-Frame Camera Rotation Estimation in Crowded Scenes [8.061773364318313]
We present an approach to estimating camera rotation in crowded, real-world scenes from handheld monocular video.
We provide a new dataset and benchmark, with high-accuracy, rigorously verified ground truth, on 17 video sequences.
This represents a strong new performance point for crowded scenes, an important setting for computer vision.
arXiv Detail & Related papers (2023-09-15T17:44:07Z)
- Scalable and Real-time Multi-Camera Vehicle Detection, Re-Identification, and Tracking [58.95210121654722]
We propose a real-time city-scale multi-camera vehicle tracking system that handles real-world, low-resolution CCTV instead of idealized and curated video streams.
Our method is ranked among the top five performers on the public leaderboard.
arXiv Detail & Related papers (2022-04-15T12:47:01Z)
- Cross-Camera Feature Prediction for Intra-Camera Supervised Person Re-identification across Distant Scenes [70.30052164401178]
Person re-identification (Re-ID) aims to match person images across non-overlapping camera views.
ICS-DS Re-ID uses cross-camera unpaired data with intra-camera identity labels for training.
A cross-camera feature prediction method mines cross-camera self-supervision information.
Joint learning of global-level and local-level features forms a global-local cross-camera feature prediction scheme.
arXiv Detail & Related papers (2021-07-29T11:27:50Z)
- CoMo: A novel co-moving 3D camera system [0.0]
CoMo is a co-moving camera system of two synchronized high speed cameras coupled with rotational stages.
We address the calibration of the external parameters measuring the position of the cameras and their three angles of yaw, pitch and roll in the system "home" configuration.
We evaluate the robustness and accuracy of the system by comparing reconstructed and measured 3D distances in what we call 3D tests, which show a relative error of the order of 1%.
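The yaw, pitch, and roll angles calibrated above define each camera's orientation relative to the "home" configuration. A minimal sketch of composing them into a rotation matrix follows; the Z-Y-X (yaw-pitch-roll) convention is an assumption, and CoMo's actual angle convention may differ.

```python
# Sketch of composing a camera orientation from yaw, pitch, and roll.
# The Z-Y-X rotation order is an illustrative assumption.
import numpy as np

def rotation_from_ypr(yaw, pitch, roll):
    """Return the 3x3 rotation matrix R = Rz(yaw) @ Ry(pitch) @ Rx(roll),
    with all angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    return Rz @ Ry @ Rx
```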
arXiv Detail & Related papers (2021-01-26T13:29:13Z)
- Unified Multi-Modal Landmark Tracking for Tightly Coupled Lidar-Visual-Inertial Odometry [5.131684964386192]
We present an efficient multi-sensor odometry system for mobile platforms that jointly optimizes visual, lidar, and inertial information.
A new method to extract 3D line and planar primitives from lidar point clouds is presented.
The system has been tested on a variety of platforms and scenarios, including underground exploration with a legged robot and outdoor scanning with a dynamically moving handheld device.
arXiv Detail & Related papers (2020-11-13T09:54:03Z)
- Infrastructure-based Multi-Camera Calibration using Radial Projections [117.22654577367246]
Pattern-based calibration techniques can be used to calibrate the intrinsics of the cameras individually.
Infrastructure-based calibration techniques are able to estimate the extrinsics using 3D maps pre-built via SLAM or Structure-from-Motion.
We propose to fully calibrate a multi-camera system from scratch using an infrastructure-based approach.
arXiv Detail & Related papers (2020-07-30T09:21:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.