Related papers: Scalable Thompson Sampling using Sparse Gaussian Process Models

Related papers

Nested Slice Sampling: Vectorized Nested Sampling for GPU-Accelerated Inference [0.4999814847776097]
This paper introduces Nested Slice Sampling (NSS), a GPU-friendly, vectorized formulation of Nested Sampling.<n>A tuning analysis yields a simple near-optimal rule for setting the slice width, improving high-dimensional behavior and making per-step compute more predictable.
arXiv Detail & Related papers (2026-01-30T18:20:32Z)
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes [23.128590225272575]
Thompson sampling (TS) is a powerful strategy for sequential decision-making.<n>Despite its success, the theoretical foundations of TS remain limited, particularly in settings with complex temporal structure such as reinforcement learning (RL)<n>This work advances the understanding of TS in RL and highlights how structural assumptions and model uncertainty shape its performance in finite-horizon Markov Decision Processes.
arXiv Detail & Related papers (2025-10-23T16:44:31Z)
Thompson Sampling via Fine-Tuning of LLMs [68.1722422968855]
We propose an alternative based on Thompson sampling that eliminates the need for scalable large acquisition functions.<n>Our approach Thompson Sampling via Finening (ToSFiT) leverages the prior knowledge embedded in prompt-conditioned language models, and adapts incrementally toward the posterior.<n>Our analysis reveals the critical role of careful adaptation to the posterior probability of maximality-a principle that underpins our ToSFiT algorithm.
arXiv Detail & Related papers (2025-10-15T09:13:59Z)
Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective [17.53150194998013]
We introduce TS-SA, which incorporates approximation (SA) within the Thompson Sampling (TS) framework.<n>In each round, TS-SA constructs a posterior approximation only using the most recent reward(s) and applies an SA step to average noisy proposals over time.<n>This can be interpreted as approximating a stationary posterior target throughout the entire algorithm.
arXiv Detail & Related papers (2025-10-06T17:01:29Z)
QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR)<n>We use a calibration dataset to measure both spatial and temporal complexity for each layer.<n>We refine the FP and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z)
Fast Gaussian Processes under Monotonicity Constraints [4.184089306037526]
We present a novel virtual point-based framework for building constrained GP models under monotonicity constraints.<n>The framework is further applied to construct surrogate models for systems of differential equations.
arXiv Detail & Related papers (2025-07-09T09:09:00Z)
Accelerated Test-Time Scaling with Model-Free Speculative Sampling [58.69141724095398]
We introduce STAND (STochastic Adaptive N-gram Drafting), a novel model-free speculative decoding approach.<n>We show that STAND reduces inference latency by 60-65% compared to standard autoregressive decoding.<n>As a model-free approach, STAND can be applied to any existing language model without additional training.
arXiv Detail & Related papers (2025-06-05T07:31:18Z)
Preconditioned Inexact Stochastic ADMM for Deep Model [35.37705488695026]
This paper develops an algorithm, PISA, which enables scalable parallel computing and supports various second-moment schemes. Grounded in rigorous theoretical guarantees, the algorithm converges under the sole assumption of Lipschitz of the gradient. Comprehensive experimental evaluations for or fine-tuning diverse FMs, including vision models, large language models, reinforcement learning models, generative adversarial networks, and recurrent neural networks, demonstrate its superior numerical performance compared to various state-of-the-art Directions.
arXiv Detail & Related papers (2025-02-15T12:28:51Z)
RoSTE: An Efficient Quantization-Aware Supervised Fine-Tuning Approach for Large Language Models [53.571195477043496]
We propose an algorithm named Rotated Straight-Through-Estimator (RoSTE) RoSTE combines quantization-aware supervised fine-tuning (QA-SFT) with an adaptive rotation strategy to reduce activation outliers. Our findings reveal that the prediction error is directly proportional to the quantization error of the converged weights, which can be effectively managed through an optimized rotation configuration.
arXiv Detail & Related papers (2025-02-13T06:44:33Z)
Sparse Variational Student-t Processes [8.46450148172407]
Student-t Processes are used to model heavy-tailed distributions and datasets with outliers. We propose a sparse representation framework to allow Student-t Processes to be more flexible for real-world datasets. We evaluate two proposed approaches on various synthetic and real-world datasets from UCI and Kaggle.
arXiv Detail & Related papers (2023-12-09T12:55:20Z)
Exact and general decoupled solutions of the LMC Multitask Gaussian Process model [28.32223907511862]
The Linear Model of Co-regionalization (LMC) is a very general model of multitask gaussian process for regression or classification. Recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of said processes. We here extend these results, showing from the most general assumptions that the only condition necessary to an efficient exact computation of the LMC is a mild hypothesis on the noise model.
arXiv Detail & Related papers (2023-10-18T15:16:24Z)
Langevin Thompson Sampling with Logarithmic Communication: Bandits and Reinforcement Learning [34.4255062106615]
Thompson sampling (TS) is widely used in sequential decision making due to its ease of use and appealing empirical performance. We propose batched $textitLangevin Thompson Sampling$ algorithms that leverage MCMC methods to sample from approximate posteriors with only logarithmic communication costs in terms of batches. Our algorithms are computationally efficient and maintain the same order-optimal regret guarantees of $mathcalO(log T)$ for MABs, and $mathcalO(sqrtT)$ for RL.
arXiv Detail & Related papers (2023-06-15T01:16:29Z)
An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem. A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network. The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy. To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN. Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy.
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels [82.53569355337586]
This work offers an efficient solution to temporal point processes inference using general parametric kernels with finite support. The method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG) Results show that the proposed approach leads to an improved estimation of pattern latency than the state-of-the-art.
arXiv Detail & Related papers (2022-10-10T12:35:02Z)
Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms. The proposed method, equipped with an average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
Gaussian Process Inference Using Mini-batch Stochastic Gradient Descent: Convergence Guarantees and Empirical Benefits [21.353189917487512]
gradient descent (SGD) and its variants have established themselves as the go-to algorithms for machine learning problems. We take a step forward by proving minibatch SGD converges to a critical point of the full log-likelihood loss function. Our theoretical guarantees hold provided that the kernel functions exhibit exponential or eigendecay.
arXiv Detail & Related papers (2021-11-19T22:28:47Z)
Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD. We prove that with constraint to guarantee low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
arXiv Detail & Related papers (2021-10-26T15:02:27Z)
Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations. We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions. Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z)
A conditional one-output likelihood formulation for multitask Gaussian processes [0.0]
Multitask Gaussian processes (MTGP) are the Gaussian process framework's solution for multioutput regression problems. Here we introduce a novel approach that simplifies the multitask learning. We show that it is computationally competitive with state of the art options.
arXiv Detail & Related papers (2020-06-05T14:59:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.