TAP: A Token-Adaptive Predictor Framework for Training-Free Diffusion Acceleration
- URL: http://arxiv.org/abs/2603.03792v1
- Date: Wed, 04 Mar 2026 07:10:11 GMT
- Title: TAP: A Token-Adaptive Predictor Framework for Training-Free Diffusion Acceleration
- Authors: Haowei Zhu, Tingxuan Huang, Xing Wang, Tianyu Zhao, Jiexi Wang, Weifeng Chen, Xurui Peng, Fangmin Chen, Junhai Yong, Bin Wang
- Abstract summary: Token-Adaptive Predictor (TAP) is a training-free, probe-driven framework that adaptively selects a predictor for each token at every sampling step. TAP incurs negligible overhead while enabling large speedups with little or no perceptual quality loss.
- Score: 19.18455910385295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models achieve strong generative performance but remain slow at inference due to the need for repeated full-model denoising passes. We present Token-Adaptive Predictor (TAP), a training-free, probe-driven framework that adaptively selects a predictor for each token at every sampling step. TAP uses a single full evaluation of the model's first layer as a low-cost probe to compute proxy losses for a compact family of candidate predictors (instantiated primarily with Taylor expansions of varying order and horizon), then assigns each token the predictor with the smallest proxy error. This per-token "probe-then-select" strategy exploits heterogeneous temporal dynamics, requires no additional training, and is compatible with various predictor designs. TAP incurs negligible overhead while enabling large speedups with little or no perceptual quality loss. Extensive experiments across multiple diffusion architectures and generation tasks show that TAP substantially improves the accuracy-efficiency frontier compared to fixed global predictors and caching-only baselines.
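The probe-then-select loop admits a compact illustration. Below is a minimal sketch, not the authors' implementation: it assumes the candidate family consists of finite-difference Taylor extrapolators over a cached window of first-layer outputs, treats `horizon` as the length of that window, and uses illustrative names and shapes throughout (`taylor_predict`, `probe_then_select`, and the toy linear `first_layer` are all hypothetical).

```python
import math
import numpy as np

def taylor_predict(history, order, dt=1.0):
    """Extrapolate the next per-token features from cached past values
    using a finite-difference Taylor expansion of the given order.
    history: list of (num_tokens, dim) arrays, ordered oldest -> newest."""
    pred = np.asarray(history[-1], dtype=float).copy()
    diffs = [np.asarray(h, dtype=float) for h in history]
    for k in range(1, order + 1):
        # k-th forward differences approximate the k-th time derivative
        diffs = [diffs[i + 1] - diffs[i] for i in range(len(diffs) - 1)]
        if not diffs:
            break  # history window too short for this order
        pred += diffs[-1] * (dt ** k) / math.factorial(k)
    return pred

def probe_then_select(first_layer, hidden, history, candidates, dt=1.0):
    """Per-token probe-then-select in the spirit of TAP (a sketch).
    first_layer: callable evaluating only the model's first layer (the probe)
    hidden:      current token states, shape (num_tokens, dim)
    history:     cached first-layer outputs from past steps, oldest -> newest
    candidates:  (order, horizon) pairs defining the Taylor predictor family
    Returns the index of the smallest-proxy-error predictor for each token."""
    probe = first_layer(hidden)  # the single cheap full evaluation of layer 1
    proxy_errors = []
    for order, horizon in candidates:
        window = history[-(horizon + 1):]  # "horizon" read here as window span
        pred = taylor_predict(window, order, dt)
        proxy_errors.append(np.linalg.norm(pred - probe, axis=-1))
    proxy_errors = np.stack(proxy_errors)   # (num_candidates, num_tokens)
    return np.argmin(proxy_errors, axis=0)  # best candidate index per token

# Toy demo: a random linear map stands in for the model's first layer.
rng = np.random.default_rng(0)
num_tokens, dim = 5, 8
W = rng.standard_normal((dim, dim)) / np.sqrt(dim)

def first_layer(h):
    return h @ W

states = [rng.standard_normal((num_tokens, dim)) for _ in range(5)]
history = [first_layer(s) for s in states[:-1]]  # cached past probe outputs
choice = probe_then_select(first_layer, states[-1], history,
                           candidates=[(0, 0), (1, 1), (2, 2)])
print(choice)  # one selected predictor index per token
```

In an actual sampler, tokens whose selected predictor achieves a small enough proxy error would presumably reuse the extrapolated features in place of the remaining layers, which is where the speedup would come from.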
Related papers
- Benchmarking Few-shot Transferability of Pre-trained Models with Improved Evaluation Protocols [123.73663884421272]
Few-shot transfer has been revolutionized by stronger pre-trained models and improved adaptation algorithms. We establish FEWTRANS, a comprehensive benchmark containing 10 diverse datasets. By releasing FEWTRANS, we aim to provide a rigorous "ruler" to streamline reproducible advances in few-shot transfer learning research.
arXiv Detail & Related papers (2026-02-28T05:41:57Z)
- Amortized Predictability-aware Training Framework for Time Series Forecasting and Classification [10.816479922364097]
We propose a general Amortized Predictability-aware Training Framework (APTF) for both time series forecasting (TSF) and time series classification (TSC). APTF introduces two key designs that enable the model to focus on high-predictability samples while still learning appropriately from low-predictability ones.
arXiv Detail & Related papers (2026-02-18T06:59:05Z)
- Temporal Pair Consistency for Variance-Reduced Flow Matching [13.328987133593154]
Temporal Pair Consistency (TPC) is a lightweight variance-reduction principle that couples velocity predictions at paired timesteps along the same probability path. Instantiated within flow matching, TPC improves sample quality and efficiency across CIFAR-10 and ImageNet at multiple resolutions.
arXiv Detail & Related papers (2026-02-04T00:05:21Z)
- Controllable Probabilistic Forecasting with Stochastic Decomposition Layers [1.3995263206621]
We introduce Stochastic Decomposition Layers (SDL) for converting deterministic machine learning weather models into ensemble systems. SDL applies learned perturbations at three decoder scales through latent-driven modulation, per-pixel noise, and channel scaling. When applied to WXFormer via transfer learning, SDL requires less than 2% of the computational cost needed to train the baseline model.
arXiv Detail & Related papers (2025-12-21T17:10:00Z)
- SPREAD: Sampling-based Pareto front Refinement via Efficient Adaptive Diffusion [0.8594140167290097]
SPREAD is a generative framework based on Denoising Diffusion Probabilistic Models (DDPMs). It learns a conditional diffusion process over points sampled from the decision space. It refines candidates via a sampling scheme that uses an adaptive multiple-gradient-descent-inspired update for fast convergence.
arXiv Detail & Related papers (2025-09-25T12:09:37Z)
- Robust Representation Consistency Model via Contrastive Denoising [83.47584074390842]
Randomized smoothing provides theoretical guarantees for certifying robustness against adversarial perturbations. Diffusion models have been successfully employed for randomized smoothing to purify noise-perturbed samples. We reformulate the generative modeling task along the diffusion trajectories in pixel space as a discriminative task in the latent space.
arXiv Detail & Related papers (2025-01-22T18:52:06Z)
- Adaptive Sampling to Reduce Epistemic Uncertainty Using Prediction Interval-Generation Neural Networks [0.0]
This paper presents an adaptive sampling approach designed to reduce epistemic uncertainty in predictive models. Our primary contribution is the development of a metric that estimates potential epistemic uncertainty. A batch sampling strategy based on Gaussian processes (GPs) is also proposed. We test our approach on three unidimensional synthetic problems and a multi-dimensional dataset based on an agricultural field for selecting experimental fertilizer rates.
arXiv Detail & Related papers (2024-12-13T21:21:47Z)
- Informed Correctors for Discrete Diffusion Models [27.295990499157814]
We propose a predictor-corrector sampling scheme for discrete diffusion models. We show that our informed corrector consistently produces superior samples with fewer errors or improved FID scores. Our results underscore the potential of informed correctors for fast and high-fidelity generation using discrete diffusion.
arXiv Detail & Related papers (2024-07-30T23:29:29Z)
- An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning [55.467047686093025]
A common approach to alleviate such forgetting is to rehearse samples from prior tasks during fine-tuning. We propose a sampling scheme, mix-cd, that prioritizes rehearsal of "collateral damage" samples. Our approach is computationally efficient, easy to implement, and outperforms several leading continual learning methods in compute-constrained settings.
arXiv Detail & Related papers (2024-02-12T22:32:12Z)
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- BERT Loses Patience: Fast and Robust Inference with Early Exit [91.26199404912019]
We propose Patience-based Early Exit as a plug-and-play technique to improve the efficiency and robustness of a pretrained language model.
Our approach improves inference efficiency as it allows the model to make a prediction with fewer layers; a sketch of the patience mechanism follows this entry.
arXiv Detail & Related papers (2020-06-07T13:38:32Z)
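The patience mechanism behind this entry is concrete enough to sketch. The following is a hedged reconstruction, not the paper's code: it assumes each layer feeds an internal classifier and that inference halts once the predicted class is unchanged for `patience` consecutive layers; all callables and shapes here are illustrative stand-ins.

```python
import numpy as np

def patience_early_exit(layers, classifiers, x, patience=3):
    """Run layers in sequence, reading a class prediction from an internal
    classifier after each one, and stop as soon as the prediction has
    stayed unchanged for `patience` consecutive layers."""
    last_pred, streak, hidden = None, 0, x
    for depth, (layer, clf) in enumerate(zip(layers, classifiers), start=1):
        hidden = layer(hidden)
        pred = int(np.argmax(clf(hidden)))
        streak = streak + 1 if pred == last_pred else 1
        last_pred = pred
        if streak >= patience:
            return pred, depth  # early exit: remaining layers are skipped
    return last_pred, len(layers)  # fell through: the full stack ran

# Toy demo with random linear layers and per-layer classifiers.
rng = np.random.default_rng(1)
dim, num_classes, depth = 16, 4, 12
layers = [lambda h, W=rng.standard_normal((dim, dim)) * 0.1: np.tanh(h @ W)
          for _ in range(depth)]
classifiers = [lambda h, V=rng.standard_normal((dim, num_classes)): h @ V
               for _ in range(depth)]
pred, exit_depth = patience_early_exit(layers, classifiers,
                                       rng.standard_normal(dim))
print(pred, exit_depth)  # predicted class and the depth actually executed
```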