Related papers: Aggressive Perception-Aware Navigation using Deep Optical Flow Dynamics and PixelMPC

URL: http://arxiv.org/abs/2001.02307v1
Date: Tue, 7 Jan 2020 22:33:12 GMT
Title: Aggressive Perception-Aware Navigation using Deep Optical Flow Dynamics and PixelMPC
Authors: Keuntaek Lee, Jason Gibson, Evangelos A. Theodorou
Abstract summary: We introduce deep optical flow (DOF) dynamics, which is a combination of optical flow and robot dynamics. Using the DOF dynamics, MPC explicitly incorporates the predicted movement of relevant pixels into the planned trajectory of a robot. Our implementation of DOF is memory-efficient, data-efficient, and computationally cheap so that it can be computed in real-time for use in an MPC framework.
Score: 21.81438321320149
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, vision-based control has gained traction by leveraging the power of machine learning. In this work, we couple a model predictive control (MPC) framework to a visual pipeline. We introduce deep optical flow (DOF) dynamics, which is a combination of optical flow and robot dynamics. Using the DOF dynamics, MPC explicitly incorporates the predicted movement of relevant pixels into the planned trajectory of a robot. Our implementation of DOF is memory-efficient, data-efficient, and computationally cheap so that it can be computed in real-time for use in an MPC framework. The suggested Pixel Model Predictive Control (PixelMPC) algorithm controls the robot to accomplish a high-speed racing task while maintaining visibility of the important features (gates). This improves the reliability of vision-based estimators for localization and can eventually lead to safe autonomous flight. The proposed algorithm is tested in a photorealistic simulation with a high-speed drone racing task.
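
As a rough illustration of the idea in the abstract (predicting where relevant pixels move under a planned control sequence and penalizing loss of visibility), a minimal sketch might look like the following. It is not the authors' implementation. The names `rollout_pixel`, `visibility_cost`, and `toy_flow_model` are hypothetical, and the hand-written linear flow model, image size, and margin are assumptions that only stand in for the learned DOF dynamics.

```python
import numpy as np


def toy_flow_model(pixel, control):
    """Hypothetical stand-in for a learned pixel-flow model.

    Assumption: each control component shifts the pixel by a fixed gain.
    A learned model would replace this function.
    """
    gain = 20.0  # pixels per unit control per step (made-up value)
    return np.array([-gain * control[0], -gain * control[1]])


def rollout_pixel(pixel, controls, flow_model):
    """Propagate one pixel location through a planned control sequence."""
    trajectory = [np.asarray(pixel, dtype=float)]
    for control in controls:
        # Next pixel location = current location + predicted flow.
        trajectory.append(trajectory[-1] + flow_model(trajectory[-1], control))
    return np.stack(trajectory)


def visibility_cost(pixel_trajectory, width, height, margin=10.0):
    """Quadratic penalty on predicted pixels outside the safe image region."""
    lower = np.array([margin, margin])
    upper = np.array([width - margin, height - margin])
    # Distance by which each coordinate violates the lower or upper bound.
    violation = (np.maximum(lower - pixel_trajectory, 0.0)
                 + np.maximum(pixel_trajectory - upper, 0.0))
    return float(np.sum(violation ** 2))


# Usage: compare a plan that keeps the feature centered with one that
# sweeps it out of the image.
start = (320.0, 240.0)
hover_plan = np.zeros((5, 2))
turn_plan = np.tile([1.0, 0.0], (20, 1))

hover_cost = visibility_cost(rollout_pixel(start, hover_plan, toy_flow_model), 640, 480)
turn_cost = visibility_cost(rollout_pixel(start, turn_plan, toy_flow_model), 640, 480)
print(hover_cost, turn_cost)
```

In a sampling-based or gradient-based MPC, a term like this would be added to the task cost so that candidate trajectories which push the tracked feature out of view score worse; how the paper weights or formulates that term is not stated in the abstract.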

Related papers

Light Future: Multimodal Action Frame Prediction via InstructPix2Pix [0.0]
This paper proposes a novel, efficient, and lightweight approach for robot action prediction. It offers significantly reduced computational cost and inference latency compared to conventional video prediction models. It pioneers the adaptation of the InstructPix2Pix model for forecasting future visual frames in robotic tasks.
arXiv Detail & Related papers (2025-07-20T03:57:18Z)
cVLA: Towards Efficient Camera-Space VLAs [26.781510474119845]
Vision-Language-Action (VLA) models offer a compelling framework for tackling complex robotic manipulation tasks. We propose a novel VLA approach that leverages the competitive performance of Vision Language Models on 2D images. Our model predicts trajectory waypoints, making it both more efficient to train and robot embodiment agnostic.
arXiv Detail & Related papers (2025-07-02T22:56:41Z)
Human-Robot Navigation using Event-based Cameras and Reinforcement Learning [1.7614751781649955]
This work introduces a robot navigation controller that combines event cameras and other sensors with reinforcement learning to enable real-time human-centered navigation and obstacle avoidance. Unlike conventional image-based controllers, which operate at fixed rates and suffer from motion blur and latency, this approach leverages the asynchronous nature of event cameras to process visual information over flexible time intervals.
arXiv Detail & Related papers (2025-06-12T15:03:08Z)
Hybrid Neural-MPM for Interactive Fluid Simulations in Real-Time [57.30651532625017]
We present a novel hybrid method that integrates numerical simulation, neural physics, and generative control. Our system demonstrates robust performance across diverse 2D/3D scenarios, material types, and obstacle interactions. We promise to release both models and data upon acceptance.
arXiv Detail & Related papers (2025-05-25T01:27:18Z)
Real-Time Navigation for Autonomous Aerial Vehicles Using Video [11.414350041043326]
We introduce a novel Markov Decision Process (MDP) framework to reduce the workload of Computer Vision (CV) algorithms. We apply our proposed framework to both feature-based and neural-network-based object-detection tasks. These holistic tests show significant benefits in energy consumption and speed with only a limited loss in accuracy.
arXiv Detail & Related papers (2025-04-01T01:14:42Z)
Optical Flow Matters: an Empirical Comparative Study on Fusing Monocular Extracted Modalities for Better Steering [37.46760714516923]
This research introduces a new end-to-end method that exploits multimodal information from a single monocular camera to improve the steering predictions for self-driving cars. By focusing on the fusion of RGB imagery with depth completion information or optical flow data, we propose a framework that integrates these modalities through both early and hybrid fusion techniques.
arXiv Detail & Related papers (2024-09-18T09:36:24Z)
Neuromorphic Optical Flow and Real-time Implementation with Event Cameras [47.11134388304464]
We build on the latest developments in event-based vision and spiking neural networks. We propose a new network architecture that improves the state-of-the-art self-supervised optical flow accuracy. We demonstrate high speed optical flow prediction with almost two orders of magnitude reduced complexity.
arXiv Detail & Related papers (2023-04-14T14:03:35Z)
Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing [52.50284630866713]
Existing systems often require hand-engineered components for state estimation, planning, and control. This paper tackles the vision-based autonomous-drone-racing problem by learning deep sensorimotor policies.
arXiv Detail & Related papers (2022-10-26T19:03:17Z)
StreamYOLO: Real-time Object Detection for Streaming Perception [84.2559631820007]
We endow the models with the capacity to predict the future, significantly improving the results for streaming perception. We consider driving scenes with multiple velocities and propose Velocity-awared streaming AP (VsAP) to jointly evaluate the accuracy. Our simple method achieves state-of-the-art performance on the Argoverse-HD dataset and improves the sAP and VsAP by 4.7% and 8.2% respectively.
arXiv Detail & Related papers (2022-07-21T12:03:02Z)
PUCK: Parallel Surface and Convolution-kernel Tracking for Event-Based Cameras [4.110120522045467]
Event-cameras can guarantee fast visual sensing in dynamic environments, but require a tracking algorithm that can keep up with the high data rate induced by the robot ego-motion. We introduce a novel tracking method that leverages the Exponential Reduced Ordinal Surface (EROS) data representation to decouple event-by-event processing and tracking. We propose the task of tracking the air hockey puck sliding on a surface, with the future aim of controlling the iCub robot to reach the target precisely and on time.
arXiv Detail & Related papers (2022-05-16T13:23:52Z)
Visual-Inertial Odometry with Online Calibration of Velocity-Control Based Kinematic Motion Models [3.42658286826597]
Visual-inertial odometry (VIO) is an important technology for autonomous robots with power and payload constraints. We propose a novel approach for VIO with stereo cameras which integrates and calibrates the velocity-control based kinematic motion model of wheeled mobile robots online.
arXiv Detail & Related papers (2022-04-14T06:21:12Z)
Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy in a single metric for online video perception. We build a simple and effective framework for streaming perception. Our method achieves competitive performance on the Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z)
Spatiotemporal Costmap Inference for MPC via Deep Inverse Reinforcement Learning [27.243603228431564]
We propose a new IRL algorithm that learns a goal-conditioned spatiotemporal reward function. The resulting costmap is used by Model Predictive Controllers (MPCs) to perform a task.
arXiv Detail & Related papers (2022-01-17T17:36:29Z)
Nonprehensile Riemannian Motion Predictive Control [57.295751294224765]
We introduce a novel Real-to-Sim reward analysis technique to reliably imagine and predict the outcome of taking possible actions for a real robotic platform. We produce a closed-loop controller to reactively push objects in a continuous action space. We observe that RMPC is robust in cluttered as well as occluded environments and outperforms the baselines.
arXiv Detail & Related papers (2021-11-15T18:50:04Z)
Risk-Averse MPC via Visual-Inertial Input and Recurrent Networks for Online Collision Avoidance [95.86944752753564]
We propose an online path planning architecture that extends the model predictive control (MPC) formulation to consider future location uncertainties. Our algorithm combines an object detection pipeline with a recurrent neural network (RNN) which infers the covariance of state estimates. The robustness of our methods is validated on complex quadruped robot dynamics and can be generally applied to most robotic platforms.
arXiv Detail & Related papers (2020-07-28T07:34:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.