AdaFuse: Adaptive Ensemble Decoding with Test-Time Scaling for LLMs
- URL: http://arxiv.org/abs/2601.06022v1
- Date: Fri, 09 Jan 2026 18:58:22 GMT
- Title: AdaFuse: Adaptive Ensemble Decoding with Test-Time Scaling for LLMs
- Authors: Chengming Cui, Tianxin Wei, Ziyi Chen, Ruizhong Qiu, Zhichen Zeng, Zhining Liu, Xuying Ning, Duo Zhou, Jingrui He
- Abstract summary: Inference-time ensembling provides a practical way to combine large language model capabilities without retraining. We propose AdaFuse, an adaptive ensemble decoding framework that dynamically selects semantically appropriate fusion units during generation. AdaFuse consistently outperforms strong ensemble baselines, achieving an average relative improvement of 6.88%.
- Score: 46.52320938421707
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) exhibit complementary strengths arising from differences in pretraining data, model architectures, and decoding behaviors. Inference-time ensembling provides a practical way to combine these capabilities without retraining. However, existing ensemble approaches suffer from fundamental limitations. Most rely on fixed fusion granularity, which lacks the flexibility required for mid-generation adaptation and fails to adapt to different generation characteristics across tasks. To address these challenges, we propose AdaFuse, an adaptive ensemble decoding framework that dynamically selects semantically appropriate fusion units during generation. Rather than committing to a fixed granularity, AdaFuse adjusts fusion behavior on the fly based on the decoding context, with words serving as basic building blocks for alignment. Specifically, we introduce an uncertainty-based criterion to decide whether to apply ensembling at each decoding step. Under confident decoding states, the model continues generation directly. In less certain states, AdaFuse invokes a diversity-aware scaling strategy to explore alternative candidate continuations and inform ensemble decisions. This design establishes a synergistic interaction between adaptive ensembling and test-time scaling, where ensemble decisions guide targeted exploration, and the resulting diversity in turn strengthens ensemble quality. Experiments on open-domain question answering, arithmetic reasoning, and machine translation demonstrate that AdaFuse consistently outperforms strong ensemble baselines, achieving an average relative improvement of 6.88%. The code is available at https://github.com/CCM0111/AdaFuse.
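The abstract outlines a two-regime decoding loop: when the current decoding state is confident, the model generates directly; when it is uncertain, AdaFuse explores alternative continuations and fuses ensemble predictions. A minimal PyTorch sketch of that uncertainty gate follows. It assumes HuggingFace-style causal LMs that share one tokenizer and uses plain next-token probability averaging; the threshold value, function names, and averaging rule are illustrative stand-ins, not the paper's word-level alignment or diversity-aware scaling.

```python
import torch
import torch.nn.functional as F

ENTROPY_THRESHOLD = 2.0  # nats; illustrative gate value, not from the paper

@torch.no_grad()
def decode_step(models, input_ids):
    """One uncertainty-gated decoding step (batch size 1 assumed).

    models: list of HuggingFace-style causal LMs sharing one tokenizer.
    input_ids: LongTensor of shape [1, seq_len].
    """
    # Next-token distribution of the lead model.
    logits = models[0](input_ids).logits[:, -1, :]
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)

    if entropy.item() < ENTROPY_THRESHOLD:
        # Confident state: the lead model continues generation directly.
        return probs.argmax(dim=-1, keepdim=True)

    # Uncertain state: pool next-token distributions across the ensemble.
    # AdaFuse additionally explores diverse candidate continuations here
    # (test-time scaling); simple averaging stands in for that step.
    stacked = torch.stack(
        [F.softmax(m(input_ids).logits[:, -1, :], dim=-1) for m in models]
    )
    return stacked.mean(dim=0).argmax(dim=-1, keepdim=True)
```

A full loop would append the returned token to input_ids via torch.cat([input_ids, token], dim=-1) and repeat until an end-of-sequence token; the paper's word-level fusion units would additionally require aligning tokens into words across heterogeneous vocabularies.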
Related papers
- UniRoute: Unified Routing Mixture-of-Experts for Modality-Adaptive Remote Sensing Change Detection [6.323154336421137]
UniRoute is a unified framework for modality-adaptive learning. We introduce an Adaptive Receptive Field Routing MoE module to disentangle local spatial details from global semantic context. We also propose a Consistency-Aware Self-Distillation strategy that stabilizes unified training under data-scarce heterogeneous settings.
arXiv Detail & Related papers (2026-01-21T09:21:25Z)
- ACD-CLIP: Decoupling Representation and Dynamic Fusion for Zero-Shot Anomaly Detection [21.26826497960086]
Pre-trained Vision-Language Models (VLMs) struggle with Zero-Shot Anomaly Detection (ZSAD). We propose a parameter-efficient Convolutional Low-Rank Adaptation (Conv-LoRA) adapter to inject local inductive biases for fine-grained representation. We also introduce a Dynamic Fusion Gateway (DFG) that leverages visual context to adaptively modulate text prompts.
arXiv Detail & Related papers (2025-08-11T10:03:45Z)
- Unified modality separation: A vision-language framework for unsupervised domain adaptation [60.8391821117794]
Unsupervised domain adaptation (UDA) enables models trained on a labeled source domain to handle new unlabeled domains. We propose a unified modality separation framework that accommodates both modality-specific and modality-invariant components. Our methods achieve up to a 9% performance gain with 9 times the computational efficiency.
arXiv Detail & Related papers (2025-08-07T02:51:10Z)
- Uncertainty-driven Embedding Convolution [16.523816971857787]
We propose Uncertainty-driven Embedding Convolution (UEC), which transforms deterministic embeddings into probabilistic ones in a post-hoc manner. It then computes adaptive ensemble weights based on embedding uncertainty (a minimal weighting sketch appears after this list).
arXiv Detail & Related papers (2025-07-28T11:15:25Z)
- DiCoFlex: Model-agnostic diverse counterfactuals with flexible control [0.0]
We propose DiCoFlex, a model-agnostic, conditional generative framework that produces multiple diverse counterfactuals in a single forward pass. We show that DiCoFlex outperforms existing methods in terms of validity, diversity, proximity, and constraint adherence.
arXiv Detail & Related papers (2025-05-29T17:37:47Z)
- MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging [29.58798660724693]
Continual model merging integrates independently fine-tuned models sequentially without access to the original training data. We propose MINGLE, a novel framework for Test-Time Continual Model Merging. MINGLE achieves robust generalization, significantly reduces forgetting, and consistently surpasses previous state-of-the-art methods by 7-9% on average.
arXiv Detail & Related papers (2025-05-17T07:24:22Z)
- Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution [61.80716438091887]
GenDiE (Generate, Discriminate, Evolve) is a novel self-evolving framework that enhances context faithfulness through fine-grained sentence-level optimization. By treating each sentence in a response as an independent optimization unit, GenDiE effectively addresses the limitations of previous approaches. Experiments on ASQA (in-domain LFQA) and ConFiQA datasets demonstrate that GenDiE surpasses various baselines in both faithfulness and correctness.
arXiv Detail & Related papers (2025-03-03T16:08:33Z)
- On the Power of Adaptive Weighted Aggregation in Heterogeneous Federated Learning and Beyond [37.894835756324454]
Federated averaging (FedAvg) is the most fundamental algorithm in federated learning (FL). Recent empirical results show that FedAvg can perform well in many real-world heterogeneous tasks. We present a simple and effective FedAvg variant termed FedAWARE.
arXiv Detail & Related papers (2023-10-04T10:15:57Z)
- Adaptive Spot-Guided Transformer for Consistent Local Feature Matching [64.30749838423922]
We propose Adaptive Spot-Guided Transformer (ASTR) for local feature matching.
ASTR models the local consistency and scale variations in a unified coarse-to-fine architecture.
arXiv Detail & Related papers (2023-03-29T12:28:01Z)
- Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z)
- Dynamic Federated Learning [57.14673504239551]
Federated learning has emerged as an umbrella term for centralized coordination strategies in multi-agent environments.
We consider a federated learning model where at every iteration, a random subset of available agents perform local updates based on their data.
Under a non-stationary random walk model on the true minimizer for the aggregate optimization problem, we establish that the performance of the architecture is determined by three factors, namely, the data variability at each agent, the model variability across all agents, and a tracking term that is inversely proportional to the learning rate of the algorithm.
arXiv Detail & Related papers (2020-02-20T15:00:54Z)
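The Uncertainty-driven Embedding Convolution entry above describes computing adaptive ensemble weights from embedding uncertainty. One standard way to realize that idea is inverse-variance weighting, sketched below under the assumption that each model exposes a per-dimension embedding mean and variance; this is a generic scheme for illustration, not UEC's exact formulation.

```python
import torch

def inverse_variance_ensemble(means: torch.Tensor,
                              variances: torch.Tensor) -> torch.Tensor:
    """Fuse per-model embeddings, down-weighting uncertain models.

    means, variances: tensors of shape [num_models, dim]. Each embedding
    is treated as a Gaussian; weights proportional to 1/variance (an
    assumed, generic rule -- not necessarily UEC's) are normalized per
    dimension before averaging.
    """
    weights = 1.0 / variances.clamp_min(1e-9)            # [num_models, dim]
    weights = weights / weights.sum(dim=0, keepdim=True)  # normalize per dim
    return (weights * means).sum(dim=0)                   # fused [dim] vector
```

Under this rule, a model whose embedding variance is 0.1 in some dimension receives roughly ten times the weight of a model whose variance is 1.0 there, so confident models dominate the fused representation.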