Differentiable Model Predictive Control on the GPU
- URL: http://arxiv.org/abs/2510.06179v1
- Date: Tue, 07 Oct 2025 17:42:17 GMT
- Title: Differentiable Model Predictive Control on the GPU
- Authors: Emre Adabag, Marcus Greiff, John Subosits, Thomas Lew,
- Abstract summary: Differentiable model predictive control (MPC) offers a powerful framework for combining learning and control.<n>In this work, we tackle this bottleneck by introducing a GPU-accelerated differentiable optimization tool for MPC.<n>We demonstrate substantial speedups over CPU- and GPU-based baselines, significantly improving upon state-of-the-art training times.
- Score: 2.9273915933729047
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Differentiable model predictive control (MPC) offers a powerful framework for combining learning and control. However, its adoption has been limited by the inherently sequential nature of traditional optimization algorithms, which are challenging to parallelize on modern computing hardware like GPUs. In this work, we tackle this bottleneck by introducing a GPU-accelerated differentiable optimization tool for MPC. This solver leverages sequential quadratic programming and a custom preconditioned conjugate gradient (PCG) routine with tridiagonal preconditioning to exploit the problem's structure and enable efficient parallelization. We demonstrate substantial speedups over CPU- and GPU-based baselines, significantly improving upon state-of-the-art training times on benchmark reinforcement learning and imitation learning tasks. Finally, we showcase the method on the challenging task of reinforcement learning for driving at the limits of handling, where it enables robust drifting of a Toyota Supra through water puddles.
Related papers
- Batched First-Order Methods for Parallel LP Solving in MIP [5.672808839120628]
We extend the primal-dual hybrid algorithm to efficiently solve batches of related linear programming problems that arise in mixed-integer programming techniques such as strong branching and bound tightening.<n>We demonstrate the effectiveness of our approach on various case studies and identify the problem sizes where first-order methods outperform traditional simplex-based solvers depending on the computational environment one can use.
arXiv Detail & Related papers (2026-01-29T17:02:46Z) - PICT -- A Differentiable, GPU-Accelerated Multi-Block PISO Solver for Simulation-Coupled Learning Tasks in Fluid Dynamics [59.38498811984876]
We present our fluid simulator PICT, a differentiable pressure-implicit solver coded in PyTorch with Graphics-processing-unit (GPU) support.<n>We first verify the accuracy of both the forward simulation and our derived gradients in various established benchmarks.<n>We show that the gradients provided by our solver can be used to learn complicated turbulence models in 2D and 3D.
arXiv Detail & Related papers (2025-05-22T17:55:10Z) - Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing [0.0]
This paper addresses the issue of increasing the performance of reinforcement learning (RL) solutions for autonomous racing cars.
We propose a partial end-to-end algorithm that decouples the planning and control tasks.
By leveraging the robustness of a classical controller, our partial end-to-end driving algorithm exhibits better robustness towards model mismatches than standard end-to-end algorithms.
arXiv Detail & Related papers (2023-12-11T14:27:10Z) - Multiplicative update rules for accelerating deep learning training and
increasing robustness [69.90473612073767]
We propose an optimization framework that fits to a wide range of machine learning algorithms and enables one to apply alternative update rules.
We claim that the proposed framework accelerates training, while leading to more robust models in contrast to traditionally used additive update rule.
arXiv Detail & Related papers (2023-07-14T06:44:43Z) - GloptiNets: Scalable Non-Convex Optimization with Certificates [61.50835040805378]
We present a novel approach to non-cube optimization with certificates, which handles smooth functions on the hypercube or on the torus.
By exploiting the regularity of the target function intrinsic in the decay of its spectrum, we allow at the same time to obtain precise certificates and leverage the advanced and powerful neural networks.
arXiv Detail & Related papers (2023-06-26T09:42:59Z) - Accelerated Policy Learning with Parallel Differentiable Simulation [59.665651562534755]
We present a differentiable simulator and a new policy learning algorithm (SHAC)
Our algorithm alleviates problems with local minima through a smooth critic function.
We show substantial improvements in sample efficiency and wall-clock time over state-of-the-art RL and differentiable simulation-based algorithms.
arXiv Detail & Related papers (2022-04-14T17:46:26Z) - Efficient Differentiable Simulation of Articulated Bodies [89.64118042429287]
We present a method for efficient differentiable simulation of articulated bodies.
This enables integration of articulated body dynamics into deep learning frameworks.
We show that reinforcement learning with articulated systems can be accelerated using gradients provided by our method.
arXiv Detail & Related papers (2021-09-16T04:48:13Z) - Scalable Deep-Learning-Accelerated Topology Optimization for Additively
Manufactured Materials [4.221095652322005]
Topology optimization (TO) is a popular and powerful computational approach for designing novel structures, materials, and devices.
To address these issues, we propose a general scalable deep-learning (DL) based TO framework, referred to as SDL-TO.
Our framework accelerates TO by learning the iterative history data and simultaneously training on the mapping between the given design and its gradient.
arXiv Detail & Related papers (2020-11-28T17:38:31Z) - Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling [8.14784681248878]
In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem.
We apply it to an algorithm commonly executed in the high performance computing community, the Cholesky factorization.
Our algorithm uses graph neural networks in combination with an actor-critic algorithm (A2C) to build an adaptive representation of the problem on the fly.
arXiv Detail & Related papers (2020-11-09T10:57:21Z) - A Learned Performance Model for Tensor Processing Units [5.733911161090224]
We demonstrate a method of learning performance models from a corpus of graph programs for Processing Unit (TPU) instances.
We show that our learned model outperforms a heavily-optimized analytical performance model on two tasks.
It helps an autotuner discover faster programs in a setting where access to TPUs is limited or expensive.
arXiv Detail & Related papers (2020-08-03T17:24:52Z) - MPLP++: Fast, Parallel Dual Block-Coordinate Ascent for Dense Graphical
Models [96.1052289276254]
This work introduces a new MAP-solver, based on the popular Dual Block-Coordinate Ascent principle.
Surprisingly, by making a small change to the low-performing solver, we derive the new solver MPLP++ that significantly outperforms all existing solvers by a large margin.
arXiv Detail & Related papers (2020-04-16T16:20:53Z) - Scalable Second Order Optimization for Deep Learning [34.12384996822749]
We present a scalable implementation of a second-order preconditioned method (concretely, a variant of full-matrix Adagrad)
Our novel design effectively utilizes the prevalent heterogeneous hardware architecture for training deep models, consisting of a multicore CPU coupled with multiple accelerator units.
We demonstrate superior performance compared to state-of-the-art on very large learning tasks such as machine translation with Transformers, language modeling with BERT, click-through rate prediction on Criteo, and image classification on ImageNet with ResNet-50.
arXiv Detail & Related papers (2020-02-20T20:51:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.