Adaptive Visual Autoregressive Acceleration via Dual-Linkage Entropy Analysis
- URL: http://arxiv.org/abs/2602.01345v1
- Date: Sun, 01 Feb 2026 17:29:42 GMT
- Title: Adaptive Visual Autoregressive Acceleration via Dual-Linkage Entropy Analysis
- Authors: Yu Zhang, Jingyi Liu, Feng Liu, Duoqian Miao, Qi Zhang, Kexue Fu, Changwei Wang, Longbing Cao
- Abstract summary: We propose NOVA, a training-free token reduction acceleration framework for Visual AutoRegressive modeling. NOVA adaptively determines the acceleration activation scale during inference by identifying the inflection point of scale entropy growth online. Experiments and analyses validate NOVA as a simple yet effective training-free acceleration framework.
- Score: 50.48301331112126
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual AutoRegressive modeling (VAR) suffers from substantial computational cost due to the massive token count involved. Failing to account for the continuous evolution of modeling dynamics, existing VAR token reduction methods face three key limitations: heuristic stage partition, non-adaptive schedules, and limited acceleration scope, thereby leaving significant acceleration potential untapped. Since entropy variation intrinsically reflects the transition of predictive uncertainty, it offers a principled measure to capture the evolution of modeling dynamics. Therefore, we propose NOVA, a training-free token reduction acceleration framework for VAR models via entropy analysis. NOVA adaptively determines the acceleration activation scale during inference by identifying the inflection point of scale entropy growth online. Through scale-linkage and layer-linkage ratio adjustment, NOVA dynamically computes distinct token reduction ratios for each scale and layer, pruning low-entropy tokens while reusing the cache derived from the residuals at the prior scale to accelerate inference and maintain generation quality. Extensive experiments and analyses validate NOVA as a simple yet effective training-free acceleration framework.
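A minimal sketch may help make the schedule described in the abstract concrete. The function names, the second-difference inflection test, and the linkage formulas below are illustrative assumptions rather than the authors' released implementation:

```python
import numpy as np

def scale_entropy(logits):
    """Mean token-level predictive entropy at one VAR scale (logits: [num_tokens, vocab])."""
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z)
    p /= p.sum(axis=-1, keepdims=True)
    return float(-(p * np.log(p + 1e-12)).sum(axis=-1).mean())

def find_activation_scale(entropies):
    """Return the first scale index where entropy growth starts to slow,
    taken here as a sign change of the second difference (a simple proxy for
    the 'inflection point of scale entropy growth' mentioned in the abstract)."""
    if len(entropies) < 3:
        return None
    second_diff = np.diff(np.asarray(entropies), n=2)
    for k, d2 in enumerate(second_diff):
        if d2 < 0:
            return k + 1
    return None

def token_reduction_ratio(scale_idx, activation_idx, num_scales,
                          layer_idx, num_layers, base_ratio=0.5):
    """Hypothetical scale-linkage / layer-linkage adjustment: no pruning before
    the activation scale, then ratios that grow with scale depth and layer depth."""
    if activation_idx is None or scale_idx < activation_idx:
        return 0.0  # acceleration not yet activated
    scale_term = (scale_idx - activation_idx + 1) / max(num_scales - activation_idx, 1)
    layer_term = (layer_idx + 1) / num_layers
    return min(base_ratio * scale_term * layer_term, 0.9)
```

Under these assumptions, at each new scale one would append the measured entropy, re-check for the inflection point, and, once a ratio above zero is returned, drop that fraction of lowest-entropy tokens and reuse the cached features derived from the prior scale's residuals in their place.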
Related papers
- StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models [69.07782637329315]
Visual Autoregressive (VAR) modeling departs from the next-token prediction paradigm of traditional Autoregressive (AR) models through next-scale prediction. Existing acceleration methods reduce runtime for large-scale steps, but rely on manual step selection and overlook the varying importance of different stages in the generation process. We present StageVAR, a systematic study and stage-aware acceleration framework for VAR models.
arXiv Detail & Related papers (2025-12-18T12:51:19Z) - Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration [67.12978375116599]
We show that the steps of gradient descent (GD) reduce to those of generalized perceptron algorithms. This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks.
arXiv Detail & Related papers (2025-12-12T14:16:35Z) - Plug-and-Play Homeostatic Spark: Zero-Cost Acceleration for SNN Training Across Paradigms [40.57310813106791]
Spiking neural networks offer event-driven computation, sparse activation, and hardware efficiency, yet training often converges slowly and lacks stability. We present Adaptive Homeostatic Spiking Activity Regulation (AHSAR), an extremely simple plug-in method that applies across training paradigms. AHSAR stabilizes optimization and accelerates convergence without changing the model architecture, loss, or gradients.
arXiv Detail & Related papers (2025-12-04T17:26:46Z) - Revisiting Direct Encoding: Learnable Temporal Dynamics for Static Image Spiking Neural Networks [0.0]
Handling static images that lack inherent temporal dynamics is a fundamental challenge for spiking neural networks (SNNs). In directly trained SNNs, static inputs are typically repeated across time steps, causing the temporal dimension to collapse into a rate-like representation and preventing meaningful temporal modeling. This work revisits the reported performance gap between direct and rate-based encodings and shows that it primarily stems from convolutional learnability and surrogate gradient formulations rather than the encoding schemes themselves.
arXiv Detail & Related papers (2025-12-01T13:55:00Z) - VVS: Accelerating Speculative Decoding for Visual Autoregressive Generation via Partial Verification Skipping [52.58270801983525]
Speculative decoding (SD) has been proven effective for accelerating visual AR models. We propose a novel framework, VVS, to accelerate visual AR generation via partial verification skipping.
arXiv Detail & Related papers (2025-11-17T16:50:58Z) - The Curious Case of In-Training Compression of State Space Models [49.819321766705514]
State Space Models (SSMs) tackle long sequence modeling tasks efficiently, offering both parallelizable training and fast inference. A key design challenge is striking the right balance between maximizing expressivity and limiting the computational burden. Our approach, CompreSSM, applies to Linear Time-Invariant SSMs such as Linear Recurrent Units, but is also extendable to selective models.
arXiv Detail & Related papers (2025-10-03T09:02:33Z) - Time-Scale Coupling Between States and Parameters in Recurrent Neural Networks [3.924071936547547]
Gated recurrent neural networks (RNNs) implicitly induce adaptive learning-rate behavior. The effect arises from the coupling between state-space time scales, parametrized by the gates, and parameter-space dynamics. Empirical simulations corroborate these claims.
arXiv Detail & Related papers (2025-08-16T18:19:34Z) - AiGAS-dEVL: An Adaptive Incremental Neural Gas Model for Drifting Data Streams under Extreme Verification Latency [6.7236795813629]
In streaming setups, data flows are affected by factors that yield non-stationarities in their patterns (concept drift).
We propose a novel approach, AiGAS-dEVL, which relies on growing neural gas to characterize the distributions of all concepts detected within the stream over time.
Our approach shows that analyzing the behavior of these points online over time facilitates describing how concepts evolve in the feature space.
arXiv Detail & Related papers (2024-07-07T14:04:57Z) - The Limiting Dynamics of SGD: Modified Loss, Phase Space Oscillations, and Anomalous Diffusion [29.489737359897312]
We study the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
We show that the key ingredient driving these dynamics is not the original training loss, but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents, which cause oscillations in phase space.
arXiv Detail & Related papers (2021-07-19T20:18:57Z) - Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem).
AdaRem adjusts the parameter-wise learning rate according to whether the direction in which a parameter changed in the past is aligned with the direction of the current gradient; a toy sketch of this idea follows below.
Our method outperforms previous adaptive learning-rate algorithms in terms of training speed and test error.
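The rule summarized above lends itself to a toy illustration. The sketch below is not the paper's AdaRem update; the names and the moving average of past updates are hypothetical, and it only shows the general idea of scaling each parameter's step by its sign alignment with past changes:

```python
import numpy as np

def alignment_scaled_step(param, grad, avg_update, lr=0.01, beta=0.9, gamma=0.5):
    """One illustrative update: compare the sign of the proposed step (-grad) with the
    sign of an exponential moving average of past updates, then boost or damp the
    per-parameter learning rate accordingly. All array arguments share one shape."""
    alignment = np.sign(-grad) * np.sign(avg_update)      # +1 where the step agrees with past changes
    per_param_lr = lr * (1.0 + gamma * alignment)         # larger step when aligned, smaller when opposed
    update = -per_param_lr * grad
    new_avg = beta * avg_update + (1.0 - beta) * update   # track the direction of past parameter changes
    return param + update, new_avg
```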
arXiv Detail & Related papers (2020-10-21T14:49:00Z) - Hessian-Free High-Resolution Nesterov Acceleration for Sampling [55.498092486970364]
Nesterov's Accelerated Gradient (NAG) for optimization has better performance than its continuous-time limit (noiseless kinetic Langevin) when a finite step-size is employed.
This work explores the sampling counterpart of this phenomenon and proposes a diffusion process whose discretizations can yield accelerated gradient-based MCMC methods.
arXiv Detail & Related papers (2020-06-16T15:07:37Z)