Related papers: Faster Approximate Dynamic Programming by Freezing Slow States

Related papers

Adaptive Visual Autoregressive Acceleration via Dual-Linkage Entropy Analysis [50.48301331112126]
We propose NOVA, a training-free token reduction acceleration framework for Visual AutoRegressive modeling. NOVA adaptively determines the acceleration activation scale during inference by online identifying the inflection point of scale entropy growth. Experiments and analyses validate NOVA as a simple yet effective training-free acceleration framework.
arXiv Detail & Related papers (2026-02-01T17:29:42Z)
Bi-Level Online Provisioning and Scheduling with Switching Costs and Cross-Level Constraints [1.639795325203038]
We study a bi-level online provisioning and scheduling problem motivated by network resource allocation. We model this two-time-scale interaction using an upper-level online convex optimization problem and a lower-level constrained Markov decision process.
arXiv Detail & Related papers (2026-01-26T20:16:13Z)
Informed Routing in LLMs: Smarter Token-Level Computation for Faster Inference [7.690958366125321]
This paper introduces informed routing, a new paradigm that proactively addresses these issues. We propose the Lightweight Feature Forecaster (LFF), a small predictive module that estimates a unit's output before routing decisions are made. Experiments on both language modeling and reasoning tasks show that informed routing achieves state-of-the-art efficiency-performance trade-offs.
arXiv Detail & Related papers (2025-10-10T09:59:36Z)
Slow dynamics from a nested hierarchy of frozen states [0.0]
We identify the mechanism of slow heterogeneous relaxation in quantum kinetically constrained models. We reveal a hierarchy of states that remain frozen on time scales determined by powers of the coupling.
arXiv Detail & Related papers (2025-10-03T16:30:01Z)
Disentangling Slow and Fast Temporal Dynamics in Degradation Inference with Hierarchical Differential Models [21.067477404456174]
Residual-based methods are widely employed, but the residuals remain entangled with operational history. We propose a novel Hierarchical Controlled Differential Equation (H-CDE) framework that incorporates a slow (degradation) and a fast (operation) CDE component in a unified architecture.
arXiv Detail & Related papers (2025-08-30T23:58:46Z)
R-Stitch: Dynamic Trajectory Stitching for Efficient Reasoning [80.104336426172]
Chain-of-thought (CoT) enhances the problem-solving ability of large language models. CoT incurs substantial inference cost due to long autoregressive trajectories. We introduce R-Stitch, a training-free hybrid decoding framework.
arXiv Detail & Related papers (2025-07-23T08:14:36Z)
MPQ-DMv2: Flexible Residual Mixed Precision Quantization for Low-Bit Diffusion Models with Temporal Distillation [74.34220141721231]
We present MPQ-DMv2, an improved Mixed Precision Quantization framework for extremely low-bit Diffusion Models.
arXiv Detail & Related papers (2025-07-06T08:16:50Z)
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs [52.663816303997194]
A key factor influencing answer quality is the length of the thinking stage. This paper explores and exploits the mechanisms by which LLMs understand and regulate the length of their reasoning. Our results demonstrate that this "overclocking" method mitigates overthinking, improves answer accuracy, and reduces inference latency.
arXiv Detail & Related papers (2025-06-08T17:54:33Z)
OP-LoRA: The Blessing of Dimensionality [93.08208871549557]
Low-rank adapters enable fine-tuning of large models with only a small number of parameters. They often pose optimization challenges, with poor convergence. We introduce an over-parameterized approach that accelerates training without increasing inference costs. We achieve improvements in vision-language tasks and especially notable increases in image generation.
arXiv Detail & Related papers (2024-12-13T18:55:19Z)
Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective [66.80315289020487]
Warmup-Stable-Decay (WSD) schedule uses a constant learning rate to produce a main branch of iterates that can continue indefinitely without a pre-specified compute budget. We show that pretraining loss exhibits a river valley landscape, which resembles a deep valley with a river at its bottom. Inspired by the theory, we introduce WSD-S, a variant of WSD that reuses previous checkpoints' decay phases and keeps only one main branch.
arXiv Detail & Related papers (2024-10-07T16:49:39Z)
Accelerating Dissipative State Preparation with Adaptive Open Quantum Dynamics [0.0]
A variety of dissipative state preparation schemes suffer from a basic time-entanglement tradeoff. We show how a minimal kind of adaptive dynamics can be used to completely circumvent this tradeoff.
arXiv Detail & Related papers (2024-09-09T19:11:07Z)
Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising. We introduce a novel quantization framework that includes three strategies. This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models [4.156757591117864]
We propose a novel algorithm that achieves improved regret bounds while minimizing assumptions about the problem. Our method extends beyond linear valuation models commonly used in dynamic pricing by considering general function spaces.
arXiv Detail & Related papers (2024-06-24T23:43:56Z)
A physics-informed neural network method for the approximation of slow invariant manifolds for the general class of stiff systems of ODEs [0.0]
We present a physics-informed neural network (PINN) approach for the discovery of slow invariant manifolds (SIMs). In contrast to other machine learning (ML) approaches that construct reduced order black box surrogate models, our approach simultaneously decomposes the vector field into fast and slow components. We show that the proposed PINN scheme provides SIM approximations of equivalent or even higher accuracy than those provided by QSSA, PEA and CSP.
arXiv Detail & Related papers (2024-03-18T09:10:39Z)
Fast, Scalable, Warm-Start Semidefinite Programming with Spectral Bundling and Sketching [53.91395791840179]
We present Unified Spectral Bundling with Sketching (USBS), a provably correct, fast and scalable algorithm for solving massive SDPs. USBS provides a 500x speed-up over the state-of-the-art scalable SDP solver on an instance with over 2 billion decision variables.
arXiv Detail & Related papers (2023-12-19T02:27:22Z)
StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences [31.210626775505407]
Occlusions between consecutive frames have long posed a significant challenge in optical flow estimation. We present a Streamlined In-batch Multi-frame (SIM) pipeline tailored to video input, attaining a similar level of time efficiency to two-frame networks. StreamFlow excels in terms of performance on the challenging KITTI and Sintel datasets, with particular improvement in occluded areas.
arXiv Detail & Related papers (2023-11-28T07:53:51Z)
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models [52.454274602380124]
Diffusion models heavily depend on the time-step $t$ to achieve satisfactory multi-round denoising. We propose a Temporal Feature Maintenance Quantization (TFMQ) framework building upon a Temporal Information Block. Powered by the pioneering block design, we devise temporal information aware reconstruction (TIAR) and finite set calibration (FSC) to align the full-precision temporal features.
arXiv Detail & Related papers (2023-11-27T12:59:52Z)
Non-stationary Reinforcement Learning under General Function Approximation [60.430936031067006]
We first propose a new complexity metric called dynamic Bellman Eluder (DBE) dimension for non-stationary MDPs. Based on the proposed complexity metric, we propose a novel confidence-set based model-free algorithm called SW-OPEA. We show that SW-OPEA is provably efficient as long as the variation budget is not significantly large.
arXiv Detail & Related papers (2023-06-01T16:19:37Z)
Initial-state-dependent quantum speed limit for dissipative state preparation: Framework and optimization [6.211723927647019]
We focus on a Markovian dissipative state preparation scheme where the prepared state is one of the energy eigenstates. We derive an initial-state-dependent quantum speed limit (QSL) that offers a more refined measure of the actual evolution time. We demonstrate the effectiveness of our strategy in a dissipative Rydberg atom system for preparing the Bell state.
arXiv Detail & Related papers (2023-03-23T00:19:32Z)
Intermittently Observable Markov Decision Processes [17.75610745277615]
We consider a scenario where the controller perceives the state information of the process via an unreliable communication channel. The transmissions of state information over the whole time horizon are modeled as a Bernoulli lossy process. We develop two finite-state approximations to the tree MDP to find near-optimal policies efficiently.
arXiv Detail & Related papers (2023-02-23T03:38:03Z)
Toward Efficient Gradient-Based Value Estimation [4.365720395124051]
Gradient-based methods for value estimation in reinforcement learning are typically much slower than Temporal Difference (TD) learning methods. We study the root causes of this slowness and show that Mean Square Bellman Error (MSBE) is an ill-conditioned loss function in the sense that its Hessian has large condition-number. We propose a low complexity batch-free proximal method that approximately follows the Gauss-Newton direction and is robust to parameterization. Our main algorithm, called RANS, is efficient in the sense that it is significantly faster than the residual gradient methods while having almost the same
arXiv Detail & Related papers (2023-01-31T16:45:49Z)
SMDP-Based Dynamic Batching for Efficient Inference on GPU-Based Platforms [14.42787221783853]
This paper aims to provide a dynamic batching policy that strikes a balance between efficiency and latency. The proposed solution has notable flexibility in balancing power consumption and latency.
arXiv Detail & Related papers (2023-01-30T13:19:16Z)
Shortcuts to adiabatic population inversion via time-rescaling: stability and thermodynamic cost [0.0]
We study the problem of speeding up the population inversion of a two-level quantum system. The fidelity of the dynamics versus systematic errors in the control parameters is shown to be comparable with other STA schemes.
arXiv Detail & Related papers (2022-04-29T20:27:02Z)
Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks. In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
An Adaptive State Aggregation Algorithm for Markov Decision Processes [10.494611365482028]
We propose an intuitive algorithm for solving MDPs that reduces the cost of value iteration updates by dynamically grouping together states with similar cost-to-go values. Our algorithm converges almost surely to within $2\varepsilon / (1 - \gamma)$ of the true optimal value in the $\ell_\infty$ norm, where $\gamma$ is the discount factor and aggregated states differ by at most $\varepsilon$.
arXiv Detail & Related papers (2021-07-23T07:19:43Z)
A Fully Problem-Dependent Regret Lower Bound for Finite-Horizon MDPs [117.82903457289584]
We derive a novel problem-dependent lower-bound for regret in finite-horizon Markov Decision Processes (MDPs). We show that our lower-bound is considerably smaller than in the general case and it does not scale with the minimum action gap at all. We show that this last result is attainable (up to $\mathrm{poly}(H)$ terms, where $H$ is the horizon) by providing a regret upper-bound based on policy gaps for an optimistic algorithm.
arXiv Detail & Related papers (2021-06-24T13:46:09Z)
Acting in Delayed Environments with Non-Stationary Markov Policies [57.52103323209643]
We introduce a framework for learning and planning in MDPs where the decision-maker commits actions that are executed with a delay of $m$ steps. We prove that with execution delay, deterministic Markov policies in the original state-space are sufficient for attaining maximal reward, but need to be non-stationary. We devise a non-stationary Q-learning style model-based algorithm that solves delayed execution tasks without resorting to state-augmentation.
arXiv Detail & Related papers (2021-01-28T13:35:37Z)
Time-Varying Parameters as Ridge Regressions [0.0]
Time-varying parameters (TVPs) models are frequently used in economics to capture structural change. I highlight a rather underutilized fact -- that these are actually ridge regressions. I use it to study the evolution of monetary policy in Canada using large time-varying local projections.
arXiv Detail & Related papers (2020-09-01T13:07:04Z)
Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error. Rather than fixing the minibatch and the step-size at the outset, we propose to allow parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z)
Dynamic of Stochastic Gradient Descent with State-Dependent Noise [84.64013284862733]
Stochastic gradient descent (SGD) and its variants are mainstream methods to train deep neural networks. We show that the covariance of the noise of SGD in the local region of the local minima is a quadratic function of the state. We propose a novel power-law dynamic with state-dependent diffusion to approximate the dynamic of SGD.
arXiv Detail & Related papers (2020-06-24T13:34:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.