Related papers: Breaking the Pre-Sampling Barrier: Activation-Informed Difficulty-Aware Self-Consistency

URL: http://arxiv.org/abs/2602.09438v1
Date: Tue, 10 Feb 2026 06:05:11 GMT
Title: Breaking the Pre-Sampling Barrier: Activation-Informed Difficulty-Aware Self-Consistency
Authors: Taewoong Yoon, Geunyeong Jeong, Geon Park, Sihyeong Yeom, Harksoo Kim
Abstract summary: Self-Consistency (SC) is an effective decoding strategy that improves the reasoning performance of Large Language Models (LLMs). It suffers from substantial inference costs because it requires a large number of samples. We propose Activation-Informed Difficulty-Aware Self-Consistency (ACTSC) to address these limitations.
Score: 10.079669716138763
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Self-Consistency (SC) is an effective decoding strategy that improves the reasoning performance of Large Language Models (LLMs) by generating multiple chain-of-thought reasoning paths and selecting the final answer via majority voting. However, it suffers from substantial inference costs because it requires a large number of samples. To mitigate this issue, Difficulty-Adaptive Self-Consistency (DSC) was proposed to reduce unnecessary token usage for easy problems by adjusting the number of samples according to problem difficulty. However, DSC requires additional model calls and pre-sampling to estimate difficulty, and this process is repeated when applying to each dataset, leading to significant computational overhead. In this work, we propose Activation-Informed Difficulty-Aware Self-Consistency (ACTSC) to address these limitations. ACTSC leverages internal difficulty signals reflected in the feed-forward network neuron activations to construct a lightweight difficulty estimation probe, without any additional token generation or model calls. The probe dynamically adjusts the number of samples for SC and can be applied to new datasets without requiring pre-sampling for difficulty estimation. To validate its effectiveness, we conduct experiments on five benchmarks. Experimental results show that ACTSC effectively reduces inference costs while maintaining accuracy relative to existing methods.
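As a rough illustration of the idea in the abstract (a difficulty estimate sets how many samples Self-Consistency draws before majority voting), here is a minimal Python sketch. It is not the authors' implementation: `difficulty_probe` is a hypothetical stand-in for the activation-based probe, `sampler` stands in for one chain-of-thought generation that returns a final answer, and the linear mapping from difficulty to sample count and the default bounds are assumptions chosen only for the example.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def sample_budget(difficulty: float, min_samples: int = 1, max_samples: int = 40) -> int:
    """Map a difficulty score to a sample count.

    Assumption: difficulty lies in [0, 1] and the budget grows linearly.
    The source does not specify the actual allocation rule.
    """
    d = min(max(difficulty, 0.0), 1.0)  # clamp out-of-range scores
    return min_samples + round(d * (max_samples - min_samples))


def difficulty_aware_sc(
    question: str,
    sampler: Callable[[str], str],             # hypothetical: one sampled final answer
    difficulty_probe: Callable[[str], float],  # hypothetical: stand-in for the probe
    min_samples: int = 1,
    max_samples: int = 40,
) -> Tuple[str, int]:
    """Draw a difficulty-dependent number of samples, then majority-vote."""
    n = sample_budget(difficulty_probe(question), min_samples, max_samples)
    answers = [sampler(question) for _ in range(n)]
    return majority_vote(answers), n
```

In this sketch an easy question (difficulty near 0) costs a single generation, while a hard one (difficulty near 1) uses the full budget; the difficulty score is obtained before any sampling, which is the property the abstract emphasizes.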

Related papers

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference [0.4999814847776097]
This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling. A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable.
arXiv Detail & Related papers (2026-01-30T18:20:32Z)
TS-DP: Reinforcement Speculative Decoding For Temporal Adaptive Diffusion Policy Acceleration [64.32072516882947]
Diffusion Policy excels in embodied control but suffers from high inference latency and computational cost. We propose Temporal-aware Reinforcement-based Speculative Diffusion Policy (TS-DP). TS-DP achieves up to 4.17 times faster inference with over 94% accepted drafts, reaching an inference frequency of 25 Hz.
arXiv Detail & Related papers (2025-12-13T07:53:14Z)
Optimal Self-Consistency for Efficient Reasoning with Large Language Models [3.74203477986748]
Self-consistency (SC) is a widely used test-time inference technique for improving performance in chain-of-thought reasoning. We provide the first comprehensive analysis of SC's scaling behavior and its variants, drawing on mode estimation and voting theory. We introduce Blend-ASC, a novel variant of self-consistency that dynamically allocates samples to questions during inference.
arXiv Detail & Related papers (2025-11-15T17:45:42Z)
The LLM Already Knows: Estimating LLM-Perceived Question Difficulty via Hidden Representations [33.65540900920885]
Estimating the difficulty of input questions as perceived by large language models (LLMs) is essential for accurate performance evaluation and adaptive inference. We propose a novel approach for difficulty estimation that leverages only the hidden representations produced by the target LLM.
arXiv Detail & Related papers (2025-09-16T09:38:41Z)
Staying in the Sweet Spot: Responsive Reasoning Evolution via Capability-Adaptive Hint Scaffolding [59.60915947702282]
Reinforcement learning with verifiable rewards (RLVR) has achieved remarkable success in enhancing the reasoning capabilities of large language models (LLMs). Existing RLVR methods often suffer from exploration inefficiency due to mismatches between the training data's difficulty and the model's capability. We propose SEELE, a novel supervision-aided RLVR framework that dynamically adjusts problem difficulty to stay within the high-efficiency region.
arXiv Detail & Related papers (2025-09-08T17:36:21Z)
Self-Route: Automatic Mode Switching via Capability Estimation for Efficient Reasoning [36.470695895695044]
Self-Route is a dynamic reasoning framework that automatically selects between general and reasoning modes. We show that Self-Route achieves comparable accuracy to reasoning models while reducing token consumption by 30-55%.
arXiv Detail & Related papers (2025-05-27T03:18:31Z)
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating a constraint on every token can be prohibitively expensive. LCD can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z)
One-step Noisy Label Mitigation [86.57572253460125]
Mitigating the detrimental effects of noisy labels on the training process has become increasingly critical. We propose One-step Anti-Noise (OSA), a model-agnostic noisy label mitigation paradigm. We empirically demonstrate the superiority of OSA, highlighting its enhanced training robustness, improved task transferability, ease of deployment, and reduced computational costs.
arXiv Detail & Related papers (2024-10-02T18:42:56Z)
Reasoning Aware Self-Consistency: Leveraging Reasoning Paths for Efficient LLM Sampling [9.44858963874474]
Self-Consistency mitigates hallucinations in Large Language Models (LLMs) by sampling multiple reasoning paths. We introduce Reasoning-Aware Self-Consistency (RASC), a novel framework that enhances sampling efficiency and reasoning faithfulness.
arXiv Detail & Related papers (2024-08-30T05:14:59Z)
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning [19.408941114068444]
Self-consistency (SC) is a widely used decoding strategy for chain-of-thought reasoning. Its variants, Adaptive self-consistency (ASC) and Early-stopping self-consistency (ESC), dynamically adjust the number of samples based on the posterior distribution of a set of pre-samples. We propose Difficulty-Adaptive Self-Consistency (DSC), which leverages the difficulty information of batch queries to adaptively allocate inference resources.
arXiv Detail & Related papers (2024-08-24T04:03:35Z)
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs [60.58434523646137]
A popular approach for improving the correctness of output from large language models (LLMs) is Self-Consistency. We introduce Adaptive-Consistency, a cost-efficient, model-agnostic technique that dynamically adjusts the number of samples per question. Our experiments show that Adaptive-Consistency reduces sample budget by up to 7.9 times with an average accuracy drop of less than 0.1%.
arXiv Detail & Related papers (2023-05-19T17:49:25Z)
Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features. We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features. Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound. Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.