CAS4DL: Christoffel Adaptive Sampling for function approximation via
Deep Learning
- URL: http://arxiv.org/abs/2208.12190v1
- Date: Thu, 25 Aug 2022 16:21:17 GMT
- Title: CAS4DL: Christoffel Adaptive Sampling for function approximation via
Deep Learning
- Authors: Ben Adcock, Juan M. Cardenas and Nick Dexter
- Abstract summary: We propose an adaptive sampling strategy, CAS4DL, to increase the sample efficiency of Deep Learning (DL).
Our results show that CAS4DL often yields substantial savings in the number of samples required to achieve a given accuracy.
- Score: 2.513785998932353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of approximating smooth, multivariate functions from sample
points arises in many applications in scientific computing, e.g., in
computational Uncertainty Quantification (UQ) for science and engineering. In
these applications, the target function may represent a desired quantity of
interest of a parameterized Partial Differential Equation (PDE). Due to the
large cost of solving such problems, where each sample is computed by solving a
PDE, sample efficiency is a key concern in these applications. Recently, there
has been increasing focus on the use of Deep Neural Networks (DNN) and Deep
Learning (DL) for learning such functions from data. In this work, we propose
an adaptive sampling strategy, CAS4DL (Christoffel Adaptive Sampling for Deep
Learning) to increase the sample efficiency of DL for multivariate function
approximation. Our novel approach is based on interpreting the second-to-last
layer of a DNN as a dictionary of functions defined by the nodes on that layer.
With this viewpoint, we then define an adaptive sampling strategy motivated by
adaptive sampling schemes recently proposed for linear approximation methods,
wherein samples are drawn randomly with respect to the Christoffel function of
the subspace spanned by this dictionary. We present numerical experiments
comparing CAS4DL with standard Monte Carlo (MC) sampling. Our results
demonstrate that CAS4DL often yields substantial savings in the number of
samples required to achieve a given accuracy, particularly in the case of
smooth activation functions, and it exhibits better stability in comparison to
MC. These results are therefore a promising step towards fully adapting DL to
scientific computing applications.
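To make the core idea concrete, here is a minimal sketch of Christoffel-based sampling over a finite candidate grid. It is an illustration under stated assumptions, not the authors' implementation: `dictionary` stands for the penultimate-layer feature map of the current DNN, the candidate grid discretizes the base measure, and a QR factorization supplies the orthonormal basis.

```python
import numpy as np

def christoffel_sample(dictionary, candidates, n_samples, rng=None):
    """Draw new sample points with probability proportional to the
    (reciprocal) Christoffel function of the span of the dictionary.

    dictionary : callable mapping an (N, d) array of points to an (N, m)
        array of evaluations, e.g. penultimate-layer DNN features.
    candidates : (N, d) array of points drawn i.i.d. from the base
        measure, used as a fine discretization of the domain.
    """
    rng = np.random.default_rng() if rng is None else rng
    A = dictionary(candidates)            # (N, m) dictionary matrix
    # Orthonormalize the dictionary over the grid: the columns of Q span
    # the same subspace and are orthonormal in the discrete L2 sense.
    Q, _ = np.linalg.qr(A)
    # Reciprocal Christoffel function at each grid point:
    # K(x_j) = sum_i |psi_i(x_j)|^2 for an orthonormal basis {psi_i}.
    K = np.sum(Q**2, axis=1)
    probs = K / K.sum()                   # K sums to m, so probs = K / m
    idx = rng.choice(len(candidates), size=n_samples, p=probs)
    # Weights proportional to m / K(x) re-balance a weighted fit.
    weights = K.sum() / (len(candidates) * K[idx])
    return candidates[idx], weights
```

In the adaptive loop, one would retrain the DNN on the enlarged sample set, re-derive the dictionary from the updated network, and repeat.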
Related papers
- Model-aware reinforcement learning for high-performance Bayesian
experimental design in quantum metrology [0.5461938536945721]
Quantum sensors offer control flexibility during estimation, allowing the experimenter to manipulate various control parameters.
We introduce a versatile procedure capable of optimizing a wide range of problems in quantum metrology, estimation, and hypothesis testing.
We combine model-aware reinforcement learning (RL) with Bayesian estimation based on particle filtering.
arXiv Detail & Related papers (2023-12-28T12:04:15Z)
- Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z)
- Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and meta-learning (MAML).
This paper proposes algorithms for conditional stochastic optimization in the distributed federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
- CS4ML: A general framework for active learning with arbitrary data based on Christoffel functions [0.7366405857677226]
We introduce a general framework for active learning in regression problems.
Our framework considers random sampling according to a finite number of sampling measures and arbitrary nonlinear approximation spaces.
This paper focuses on applications in scientific computing, where active learning is often desirable.
arXiv Detail & Related papers (2023-06-01T17:44:19Z)
- Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions are placed on the parameters through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z)
- Pairwise Learning via Stagewise Training in Proximal Setting [0.0]
We combine adaptive sample size and importance sampling techniques for pairwise learning, with convergence guarantees for nonsmooth convex pairwise loss functions.
We demonstrate that sampling opposite instances at each iteration reduces the variance of the gradient, hence accelerating convergence.
arXiv Detail & Related papers (2022-08-08T11:51:01Z)
- Sensing Cox Processes via Posterior Sampling and Positive Bases [56.82162768921196]
We study adaptive sensing of Cox point processes, a widely used model from spatial statistics.
We model the intensity function as a sample from a truncated Gaussian process, represented in a specially constructed positive basis.
Our adaptive sensing algorithms use Langevin dynamics and are based on posterior sampling (Cox-Thompson) and top-two posterior sampling (Top2) principles.
arXiv Detail & Related papers (2021-10-21T14:47:06Z)
- Local policy search with Bayesian optimization [73.0364959221845]
Reinforcement learning aims to find an optimal policy by interaction with an environment.
Policy gradients for local search are often obtained from random perturbations.
We develop an algorithm utilizing a probabilistic model of the objective function and its gradient.
arXiv Detail & Related papers (2021-06-22T16:07:02Z)
- Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement learning based zeroth-order algorithm (ZO-RL) that learns the sampling policy for generating the perturbations in ZO optimization, instead of using random sampling.
Our results show that ZO-RL can effectively reduce the variance of the ZO gradient estimate by learning a sampling policy, and converges faster than existing ZO algorithms in different scenarios.
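For context, the following is a minimal sketch of the standard two-point ZO gradient estimator driven by random Gaussian perturbations, i.e. the baseline sampling scheme that ZO-RL's learned policy replaces; the function name and defaults are illustrative assumptions, not code from the paper.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, n_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimator using random Gaussian
    perturbation directions; ZO-RL would instead learn the distribution
    from which these directions are drawn."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.zeros_like(x, dtype=float)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)               # random direction
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / n_dirs                                  # averaged estimate
```

Learning where `u` comes from, rather than drawing it blindly, is the source of the variance reduction described above.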
arXiv Detail & Related papers (2021-04-09T14:50:59Z)
- Bayesian Sparse learning with preconditioned stochastic gradient MCMC and its applications [5.660384137948734]
We show that the proposed algorithm converges to the correct distribution with a controllable bias under mild conditions.
arXiv Detail & Related papers (2020-06-29T20:57:20Z)
- Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence [30.393999722555154]
We propose a variant of the classical Polyak step-size (Polyak, 1987) commonly used in the subgradient method.
The proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate of stochastic gradient descent (see the sketch after this list).
arXiv Detail & Related papers (2020-02-24T20:57:23Z)
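To make the last entry above concrete, here is a minimal sketch of SGD with the stochastic Polyak step-size. The interpolation assumption (per-sample optimal loss of zero), the constant `c`, and the cap `gamma_max` are illustrative choices rather than the paper's exact prescription.

```python
import numpy as np

def sgd_with_sps(loss_fn, grad_fn, x0, data, n_epochs=10,
                 c=0.5, gamma_max=1.0, rng=None):
    """SGD in which the learning rate is set per step by the stochastic
    Polyak step-size gamma = f_i(x) / (c * ||grad f_i(x)||^2), assuming
    each per-sample optimal loss f_i^* is 0 (interpolation)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_epochs):
        for i in rng.permutation(len(data)):
            f = loss_fn(x, data[i])        # per-sample loss f_i(x)
            g = grad_fn(x, data[i])        # per-sample gradient
            # SPS rule, capped at gamma_max for stability.
            gamma = min(f / (c * np.dot(g, g) + 1e-12), gamma_max)
            x -= gamma * g
    return x
```

Because the step-size adapts to the current loss and gradient norm, no learning-rate schedule needs to be tuned, which is what makes SPS attractive in practice.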
This list is automatically generated from the titles and abstracts of the papers on this site.