Actor-critic algorithms for fiber sampling problems
- URL: http://arxiv.org/abs/2405.13950v1
- Date: Wed, 22 May 2024 19:33:58 GMT
- Title: Actor-critic algorithms for fiber sampling problems
- Authors: Ivan Gvozdanović, Sonja Petrović,
- Abstract summary: We propose an actor-critic algorithm for a family of complex problems arising in statistics and discrete optimization.
We translate the problem into a Markov decision process and devise an actor-critic reinforcement learning (RL) algorithm to learn a set of good moves that can be used for sampling.
We prove that the actor-critic algorithm converges to an approximately optimal sampling policy.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose an actor-critic algorithm for a family of complex problems arising in algebraic statistics and discrete optimization. The core task is to produce a sample from a finite subset of the non-negative integer lattice defined by a high-dimensional polytope. We translate the problem into a Markov decision process and devise an actor-critic reinforcement learning (RL) algorithm to learn a set of good moves that can be used for sampling. We prove that the actor-critic algorithm converges to an approximately optimal sampling policy. To tackle complexity issues that typically arise in these sampling problems, and to allow the RL to function at scale, our solution strategy takes three steps: decomposing the starting point of the sample, using RL on each induced subproblem, and reconstructing to obtain a sample in the original polytope. In this setup, the proof of convergence applies to each subproblem in the decomposition. We test the method in two regimes. In statistical applications, a high-dimensional polytope arises as the support set for the reference distribution in a model/data fit test for a broad family of statistical models for categorical data. We demonstrate how RL can be used for model fit testing problems for data sets for which traditional MCMC samplers converge too slowly due to problem size and sparsity structure. To test the robustness of the algorithm and explore its generalization properties, we apply it to synthetically generated data of various sizes and sparsity levels.
Related papers
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) framework, necessarily require $Omega(dkstar/2)$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an varepsilon-optimal randomized hypothesis with a sample complexity on the order of (d+k)/varepsilon2.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - CS4ML: A general framework for active learning with arbitrary data based
on Christoffel functions [0.7366405857677226]
We introduce a general framework for active learning in regression problems.
Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces.
This paper focuses on applications in scientific computing, where active learning is often desirable.
arXiv Detail & Related papers (2023-06-01T17:44:19Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We find a generalization-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain a that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z) - Stochastic Constrained DRO with a Complexity Independent of Sample Size [38.56406595022129]
We propose and analyze algorithms that apply to both non- and convex losses for solving Kullback divergence constrained DRO problem.
We establish a nearly optimal complexity for finding an $$ilon stationary solution for non- losses and an optimal batch complexity for finding an optimal solution for broad applications.
arXiv Detail & Related papers (2022-10-11T19:11:19Z) - Sampling Approximately Low-Rank Ising Models: MCMC meets Variational
Methods [35.24886589614034]
We consider quadratic definite Ising models on the hypercube with a general interaction $J$.
Our general result implies the first time sampling algorithms for low-rank Ising models.
arXiv Detail & Related papers (2022-02-17T21:43:50Z) - Global Convergence of the ODE Limit for Online Actor-Critic Algorithms
in Reinforcement Learning [7.65995376636176]
Actor-critic algorithms are widely used in reinforcement learning, but are challenging to mathematically analyse due to the online arrival of non-i.i.d. data samples.
We prove that, under a time rescaling, the online actor-critic algorithm converges to an ordinary differential equation (ODE) as the number of updates becomes large.
Our convergence analysis holds under specific choices for the learning rates and exploration rates in the actor-critic algorithm, which could provide guidance for the implementation of actor-critic algorithms in practice.
arXiv Detail & Related papers (2021-08-19T12:37:58Z) - Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z) - Exploiting Local Convergence of Quasi-Newton Methods Globally: Adaptive
Sample Size Approach [33.21301562890201]
We use an adaptive sample size scheme that exploits the superlinear convergence of quasi-Newton methods globally and throughout the learning process.
We show that if the initial sample size is sufficiently large and we use quasi-Newton methods to solve each subproblem, the subproblems can be solved superlinearly fast.
arXiv Detail & Related papers (2021-06-10T01:08:51Z) - Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume on turnstile streams that use space sublinear in $n$.
Our adaptive sampling procedure has a number of applications to various data summarization problems that either improve state-of-the-art or have only been previously studied in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.