Thermodynamic Focusing for Inference-Time Search: Practical Methods for Target-Conditioned Sampling and Prompted Inference
- URL: http://arxiv.org/abs/2512.19717v1
- Date: Tue, 16 Dec 2025 09:39:12 GMT
- Title: Thermodynamic Focusing for Inference-Time Search: Practical Methods for Target-Conditioned Sampling and Prompted Inference
- Authors: Zhan Zhang
- Abstract summary: We present a framework that treats search as a target-conditioned reweighting process. ICFA reuses an available proposal sampler and a task-specific similarity function to form a focused sampling distribution. We show how structured prompts instantiate an approximate, language-level form of ICFA and describe a hybrid architecture combining prompted inference with algorithmic reweighting.
- Score: 8.489464814859442
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Finding rare but useful solutions in very large candidate spaces is a recurring practical challenge across language generation, planning, and reinforcement learning. We present a practical framework, \emph{Inverted Causality Focusing Algorithm} (ICFA), that treats search as a target-conditioned reweighting process. ICFA reuses an available proposal sampler and a task-specific similarity function to form a focused sampling distribution, while adaptively controlling focusing strength to avoid degeneracy. We provide a clear recipe, a stability diagnostic based on effective sample size, a compact theoretical sketch explaining when ICFA can reduce sample needs, and two reproducible experiments: constrained language generation and sparse-reward navigation. We further show how structured prompts instantiate an approximate, language-level form of ICFA and describe a hybrid architecture combining prompted inference with algorithmic reweighting.
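The abstract's recipe (proposal sampler plus similarity-based reweighting, with an effective-sample-size diagnostic that backs off the focusing strength to avoid degeneracy) can be sketched as a short self-normalized importance-sampling loop. This is a minimal illustration under assumed forms, not the paper's exact algorithm: the Boltzmann-style weight exp(β·s), the ESS floor, and all function names here are hypothetical.

```python
import numpy as np

def focused_sample(proposal, similarity, n=1000, beta=2.0,
                   ess_floor=0.25, rng=None):
    """Target-conditioned reweighting sketch (assumed form of ICFA).

    proposal:   callable(rng, n) -> array of n candidates
    similarity: callable(candidates) -> array of scores in [0, 1]
    beta:       focusing strength; halved while the ESS is degenerate
    ess_floor:  minimum acceptable ESS as a fraction of n
    """
    rng = rng or np.random.default_rng(0)
    xs = proposal(rng, n)
    s = similarity(xs)
    while True:
        logw = beta * s
        w = np.exp(logw - logw.max())   # subtract max for numerical stability
        w /= w.sum()                    # self-normalized importance weights
        ess = 1.0 / np.sum(w ** 2)      # effective sample size diagnostic
        if ess >= ess_floor * n or beta < 1e-3:
            break
        beta *= 0.5                     # back off focusing to avoid degeneracy
    idx = rng.choice(n, size=n, p=w, replace=True)  # resample focused set
    return xs[idx], ess, beta
```

For example, with a standard-normal proposal and a similarity peaked near a target value of 1, the resampled population shifts toward the target while the ESS check guards against collapsing onto a handful of candidates.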
Related papers
- Accelerate Speculative Decoding with Sparse Computation in Verification [49.74839681322316]
Speculative decoding accelerates autoregressive language model inference by verifying multiple draft tokens in parallel. Existing sparsification methods are designed primarily for standard token-by-token autoregressive decoding. We propose a sparse verification framework that jointly sparsifies attention, FFN, and MoE components during the verification stage to reduce the dominant computation cost.
arXiv Detail & Related papers (2025-12-26T07:53:41Z)
- Latent Chain-of-Thought for Visual Reasoning [53.541579327424046]
Chain-of-thought (CoT) reasoning is critical for improving the interpretability and reliability of Large Vision-Language Models (LVLMs). We reformulate reasoning in LVLMs as posterior inference and propose a scalable training algorithm based on amortized variational inference. We empirically demonstrate that the proposed method enhances the state-of-the-art LVLMs on seven reasoning benchmarks.
arXiv Detail & Related papers (2025-10-27T23:10:06Z)
- Topic Identification in LLM Input-Output Pairs through the Lens of Information Bottleneck [0.0]
We develop a principled topic identification method grounded in the Deterministic Information Bottleneck (DIB) for geometric clustering. Our key contribution is to transform the DIB method into a practical algorithm for high-dimensional data by substituting its intractable KL divergence term with a computationally efficient upper bound.
arXiv Detail & Related papers (2025-08-26T20:00:51Z)
- AdaptiveK Sparse Autoencoders: Dynamic Sparsity Allocation for Interpretable LLM Representations [28.447024168930984]
We propose AdaptiveK SAE (Adaptive Top K Sparse Autoencoders), a novel framework that dynamically adjusts sparsity levels based on the semantic complexity of each input. We show that this complexity-driven adaptation significantly outperforms fixed-sparsity approaches on reconstruction fidelity, explained variance, cosine similarity and interpretability metrics.
arXiv Detail & Related papers (2025-08-24T12:00:41Z)
- Adaptive Inference-Time Scaling via Cyclic Diffusion Search [61.42700671176343]
We introduce the challenge of adaptive inference-time scaling: dynamically adjusting computational effort during inference. We propose Adaptive Bi-directional Cyclic Diffusion (ABCD), a flexible, search-based inference framework. ABCD refines outputs through bi-directional diffusion cycles while adaptively controlling exploration depth and termination.
arXiv Detail & Related papers (2025-05-20T07:31:38Z)
- Speculative Decoding for Multi-Sample Inference [21.64693536216534]
We propose a novel speculative decoding method tailored for multi-sample reasoning scenarios. Our method exploits the intrinsic consensus of parallel generation paths to synthesize high-quality draft tokens.
arXiv Detail & Related papers (2025-03-07T11:15:36Z)
- In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
- Self-regulating Prompts: Foundational Model Adaptation without Forgetting [112.66832145320434]
We introduce a self-regularization framework for prompting called PromptSRC.
PromptSRC guides the prompts to optimize for both task-specific and task-agnostic general representations.
arXiv Detail & Related papers (2023-07-13T17:59:35Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.