Consistency Deep Equilibrium Models
- URL: http://arxiv.org/abs/2602.03024v1
- Date: Tue, 03 Feb 2026 02:42:48 GMT
- Title: Consistency Deep Equilibrium Models
- Authors: Junchao Lin, Zenan Ling, Jingwen Xu, Robert C. Qiu
- Abstract summary: Deep Equilibrium Models (DEQs) have emerged as a powerful paradigm in deep learning. DEQs incur significant inference latency due to the iterative nature of fixed-point solvers. We introduce the Consistency Deep Equilibrium Model (C-DEQ) to accelerate DEQ inference.
- Score: 8.278751626877431
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Equilibrium Models (DEQs) have emerged as a powerful paradigm in deep learning, offering the ability to model infinite-depth networks with constant memory usage. However, DEQs incur significant inference latency due to the iterative nature of fixed-point solvers. In this work, we introduce the Consistency Deep Equilibrium Model (C-DEQ), a novel framework that leverages consistency distillation to accelerate DEQ inference. We cast the DEQ iterative inference process as evolution along a fixed ODE trajectory toward the equilibrium. Along this trajectory, we train C-DEQs to consistently map intermediate states directly to the fixed point, enabling few-step inference while preserving the performance of the teacher DEQ. At the same time, C-DEQ facilitates multi-step evaluation to flexibly trade computation for performance gains. Extensive experiments on tasks across various domains demonstrate that C-DEQs achieve consistent 2-20$\times$ accuracy improvements over implicit DEQs under the same few-step inference budget.
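As a rough illustration of the idea described above (a toy sketch, not the authors' implementation: the layer `f`, its dimensions, and the student map `g_theta` are all assumptions), the teacher DEQ's iterative inference and the target that a consistency student would learn to predict can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy weight-tied DEQ layer f(z, x) = tanh(W z + U x + b); scaling W to
# spectral norm 0.5 keeps the map contractive, so plain iteration converges.
d = 8
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)
U = rng.standard_normal((d, d))
b = rng.standard_normal(d)

def f(z, x):
    return np.tanh(W @ z + U @ x + b)

def deq_forward(x, tol=1e-8, max_iter=200):
    """Teacher DEQ inference: iterate z <- f(z, x) to the fixed point."""
    z = np.zeros(d)
    for k in range(max_iter):
        z_next = f(z, x)
        if np.linalg.norm(z_next - z) < tol:
            return z_next, k + 1
        z = z_next
    return z, max_iter

x = rng.standard_normal(d)
z_star, n_iters = deq_forward(x)
residual = np.linalg.norm(f(z_star, x) - z_star)

# A C-DEQ student (a hypothetical network g_theta, not shown here) would be
# trained so that g_theta(z_t, x) ~= z_star for every intermediate state z_t
# on this trajectory, replacing the n_iters solver steps with a few
# evaluations of g_theta.
print(n_iters, residual)
```

Multi-step evaluation in this picture simply re-applies the student a few times, trading extra computation for accuracy, which matches the flexibility claimed in the abstract.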
Related papers
- Lipschitz Multiscale Deep Equilibrium Models: A Theoretically Guaranteed and Accelerated Approach [10.914558012458423]
Deep equilibrium models (DEQs) achieve infinitely deep network representations without stacking layers by exploring fixed points of layer transformations in neural networks. DEQs face the challenge of requiring vastly more computational time for training and inference than conventional methods. This study explored an approach to improve fixed-point convergence and consequently reduce computational time.
arXiv Detail & Related papers (2026-02-03T09:22:56Z) - GB-DQN: Gradient Boosted DQN Models for Non-stationary Reinforcement Learning [0.0]
We propose Gradient-Boosted Deep Q-Networks (GB-DQN), an adaptive ensemble method that addresses model drift through incremental residual learning. Instead of retraining a single Q-network, GB-DQN constructs an additive ensemble in which each new learner is trained to approximate the Bellman residual of the current ensemble after drift.
arXiv Detail & Related papers (2025-12-18T19:53:50Z) - Universal Approximation Theorem of Deep Q-Networks [2.1756081703276]
We analyze Deep Q-Networks (DQNs) via stochastic control and Forward-Backward Stochastic Differential Equations (FBSDEs). We show that DQNs can approximate the optimal Q-function on compact sets with arbitrary accuracy and high probability. This work bridges deep reinforcement learning and control, offering insights into DQNs in continuous-time settings.
arXiv Detail & Related papers (2025-05-04T22:57:33Z) - DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows [13.420336353905675]
Deep Equilibrium Models (DEQs) are a class of implicit neural networks that solve for a fixed point of a neural network in their forward pass. We present Distributional Deep Equilibrium Models (DDEQs), extending DEQs to discrete measure inputs, such as sets or point clouds. In experiments, we show that they can compete with state-of-the-art models in tasks such as point cloud classification and point cloud completion.
arXiv Detail & Related papers (2025-03-03T03:48:14Z) - Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution [110.99891169486366]
We propose a method that integrates efficient and precise uncertainty quantification into a deep learning-based surrogate model.
Our method endows deep learning-based surrogate models with robust and efficient uncertainty quantification capabilities for both forward and inverse problems.
Our method excels at propagating uncertainty over extended auto-regressive rollouts, making it suitable for scenarios involving long-term predictions.
arXiv Detail & Related papers (2024-02-13T11:22:59Z) - Towards Continual Learning Desiderata via HSIC-Bottleneck Orthogonalization and Equiangular Embedding [55.107555305760954]
We propose a conceptually simple yet effective method that attributes forgetting to layer-wise parameter overwriting and the resulting decision boundary distortion.
Our method achieves competitive accuracy, even with zero exemplar buffer and only 1.02x the size of the base model.
arXiv Detail & Related papers (2024-01-17T09:01:29Z) - Understanding, Predicting and Better Resolving Q-Value Divergence in Offline-RL [86.0987896274354]
We first identify a fundamental pattern, self-excitation, as the primary cause of Q-value estimation divergence in offline RL.
We then propose a novel Self-Excite Eigenvalue Measure (SEEM) metric to measure the evolving property of Q-network at training.
For the first time, our theory can reliably decide whether the training will diverge at an early stage.
arXiv Detail & Related papers (2023-10-06T17:57:44Z) - A Closer Look at the Adversarial Robustness of Deep Equilibrium Models [25.787638780625514]
We develop approaches to estimate the intermediate gradients of DEQs and integrate them into the attacking pipelines.
Our approaches facilitate fully white-box evaluations and lead to effective adversarial defense for DEQs.
arXiv Detail & Related papers (2023-06-02T10:40:30Z) - Global Convergence of Over-parameterized Deep Equilibrium Models [52.65330015267245]
A deep equilibrium model (DEQ) is implicitly defined through an equilibrium point of an infinite-depth weight-tied model with an input-injection.
Instead of infinite computations, it solves an equilibrium point directly with root-finding and computes gradients with implicit differentiation.
We propose a novel probabilistic framework to overcome the technical difficulty in the non-asymptotic analysis of infinite-depth weight-tied models.
arXiv Detail & Related papers (2022-05-27T08:00:13Z) - Deep Equilibrium Optical Flow Estimation [80.80992684796566]
Recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms.
These RNNs impose large computation and memory overheads and are not directly trained to model such stable estimation.
We propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer.
arXiv Detail & Related papers (2022-04-18T17:53:44Z) - Temporal-Difference Value Estimation via Uncertainty-Guided Soft Updates [110.92598350897192]
Q-Learning has proven effective at learning a policy to perform control tasks.
Estimation noise becomes a bias after the max operator in the policy improvement step.
We present Unbiased Soft Q-Learning (UQL), which extends the work of EQL from two-action, finite-state spaces to multi-action, infinite-state Markov Decision Processes.
arXiv Detail & Related papers (2021-10-28T00:07:19Z)
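Several DEQ entries above (DDEQs, the global-convergence analysis, the DEQ flow estimators) rest on the same forward/backward machinery: root-finding for the equilibrium, then implicit differentiation instead of backpropagating through solver iterations. A minimal numpy sketch of that machinery, assuming a toy contractive layer f(z, x) = tanh(Wz + x) rather than any paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4
# Contractive weight-tied layer (spectral norm of W scaled to 0.5).
W = rng.standard_normal((d, d))
W *= 0.5 / np.linalg.norm(W, 2)
x = rng.standard_normal(d)

# Root-finding for the equilibrium z* = tanh(W z* + x) by plain iteration.
z = np.zeros(d)
for _ in range(500):
    z = np.tanh(W @ z + x)

# Implicit differentiation: with J = df/dz at z*, the input Jacobian is
# dz*/dx = (I - J)^{-1} df/dx, with no backprop through the 500 iterations.
s = 1.0 - np.tanh(W @ z + x) ** 2        # elementwise tanh'
J = s[:, None] * W                        # df/dz at z*
dfdx = np.diag(s)                         # df/dx at z*
dz_dx = np.linalg.solve(np.eye(d) - J, dfdx)

# Finite-difference check in a random direction v.
v = rng.standard_normal(d)
eps = 1e-6
zp = np.zeros(d)
for _ in range(500):
    zp = np.tanh(W @ zp + x + eps * v)
fd = (zp - z) / eps
print(np.allclose(dz_dx @ v, fd, atol=1e-4))
```

The linear solve against (I - J) is exactly what makes DEQ memory usage constant in depth: only the equilibrium and one Jacobian are needed, regardless of how many solver steps the forward pass took.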
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.