Coresets for Near-Convex Functions
- URL: http://arxiv.org/abs/2006.05482v2
- Date: Thu, 18 Jun 2020 22:07:15 GMT
- Title: Coresets for Near-Convex Functions
- Authors: Murad Tukan and Alaa Maalouf and Dan Feldman
- Abstract summary: A coreset is usually a small weighted subset of the $n$ input points in $\mathbb{R}^d$ that provably approximates their loss function for a given set of queries.
We suggest a generic framework for computing sensitivities (and thus coresets) for a wide family of loss functions.
Examples include SVM, logistic regression, M-estimators, and $\ell_z$-regression.
- Score: 25.922075279588757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A coreset is usually a small weighted subset of the $n$ input points in
$\mathbb{R}^d$ that provably approximates their loss function for a given set
of queries (models, classifiers, etc.). Coresets are becoming increasingly common in
machine learning, since existing heuristics or inefficient algorithms may be
improved by running them, possibly many times, on a small coreset that can be
maintained for streaming or distributed data. Coresets can be obtained by
sensitivity (importance) sampling, where the coreset size is proportional to the total
sum of sensitivities. Unfortunately, the sensitivity of each point is
problem dependent and may be harder to compute than solving the original optimization
problem at hand.
We suggest a generic framework for computing sensitivities (and thus
coresets) for a wide family of loss functions that we call near-convex
functions. This is done by suggesting the $f$-SVD factorization, which generalizes the
SVD factorization of matrices to functions. Example applications include
coresets that are either new or significantly improve previous results, such
as SVM, logistic regression, M-estimators, and $\ell_z$-regression.
Experimental results and open source are also provided.
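The sensitivity-sampling pipeline described above can be made concrete on the classical $\ell_2$-regression case, where each row's sensitivity is upper-bounded by its leverage score, computed from the ordinary SVD that the $f$-SVD factorization generalizes. The Python sketch below is only a minimal illustration of that standard special case, not the paper's $f$-SVD algorithm; the sample size and toy data are placeholders.

```python
import numpy as np

def l2_sensitivities(A):
    """Upper bound on the sensitivity of each row of A for least-squares
    regression: the leverage scores s_i = ||U_i||^2, where A = U S V^T
    is the thin SVD (the ordinary case that the f-SVD generalizes)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U ** 2, axis=1)

def sensitivity_coreset(A, b, size, rng):
    """Sample rows with probability proportional to their sensitivity and
    reweight each sampled row by 1 / (size * p_i), so the weighted coreset
    loss is an unbiased estimator of the full loss for any query x."""
    s = l2_sensitivities(A)
    p = s / s.sum()
    idx = rng.choice(len(A), size=size, replace=True, p=p)
    return A[idx], b[idx], 1.0 / (size * p[idx])

# Toy check (placeholder data): compare full loss and weighted coreset loss.
rng = np.random.default_rng(0)
A = rng.standard_normal((10_000, 5))
b = A @ rng.standard_normal(5) + 0.1 * rng.standard_normal(10_000)
x = rng.standard_normal(5)                     # an arbitrary query (model)
C, c, w = sensitivity_coreset(A, b, size=200, rng=rng)
print(np.sum((A @ x - b) ** 2), np.sum(w * (C @ x - c) ** 2))
```

For the near-convex losses targeted by the paper (e.g., SVM or logistic regression), analogous sensitivity bounds are derived via the $f$-SVD factorization rather than the leverage scores used here.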
Related papers
- No Dimensional Sampling Coresets for Classification [8.234735564035567]
We refine and generalize what is known about coresets for classification problems via the sensitivity sampling framework.
Our analysis provides the first no-dimensional coresets, i.e., coresets whose size does not depend on the dimension.
A key tool we develop is a Rademacher-complexity version of the main sensitivity sampling approach, which may be of independent interest.
arXiv Detail & Related papers (2024-02-07T21:53:01Z) - AutoCoreset: An Automatic Practical Coreset Construction Framework [65.37876706107764]
A coreset is a tiny weighted subset of an input set that closely approximates the loss function.
We propose an automatic framework for constructing coresets, which requires only the input data and the desired cost function from the user.
We show that while this set is limited, the coreset is quite general.
arXiv Detail & Related papers (2023-05-19T19:59:52Z) - Optimal approximation using complex-valued neural networks [0.0]
Complex-valued neural networks (CVNNs) have recently shown promising empirical success.
We analyze the expressivity of CVNNs by studying their approximation properties.
arXiv Detail & Related papers (2023-03-29T15:56:43Z) - Provable Data Subset Selection For Efficient Neural Network Training [73.34254513162898]
We introduce the first algorithm to construct coresets for RBFNNs, i.e., small weighted subsets that approximate the loss of the input data on any radial basis function network.
We then perform empirical evaluations on function approximation and dataset subset selection on popular network architectures and data sets.
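For orientation only, the sketch below spells out the quantity such a coreset is meant to preserve: the weighted squared loss of a radial basis function network evaluated on a (sub)set of points. The network parameters, the uniform-sampling stand-in, and all names are illustrative assumptions, not the construction from the paper.

```python
import numpy as np

def rbfnn_predict(X, centers, alphas, gamma):
    """RBF network output: f(x) = sum_j alphas[j] * exp(-gamma * ||x - c_j||^2)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2) @ alphas

def weighted_loss(X, y, w, centers, alphas, gamma):
    """Weighted squared loss; w = 1 gives the full-data loss, while coreset
    weights give the subset's estimate of that loss."""
    r = rbfnn_predict(X, centers, alphas, gamma) - y
    return np.sum(w * r ** 2)

# Uniform sampling as a naive stand-in for a coreset (illustration only).
rng = np.random.default_rng(0)
X, y = rng.standard_normal((5_000, 3)), rng.standard_normal(5_000)
centers, alphas, gamma = rng.standard_normal((10, 3)), rng.standard_normal(10), 0.5
idx = rng.choice(len(X), size=250, replace=False)
w = np.full(250, len(X) / 250)                 # reweight to match the full sum
print(weighted_loss(X, y, np.ones(len(X)), centers, alphas, gamma),
      weighted_loss(X[idx], y[idx], w, centers, alphas, gamma))
```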
arXiv Detail & Related papers (2023-03-09T10:08:34Z) - Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision
Processes [80.89852729380425]
We propose the first computationally efficient algorithm that achieves the nearly minimax optimal regret $\tilde{O}(d\sqrt{H^3K})$.
Our work provides a complete answer to optimal RL with linear MDPs, and the developed algorithm and theoretical tools may be of independent interest.
arXiv Detail & Related papers (2022-12-12T18:58:59Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
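As background, a "random feature approximation" replaces an exact kernel matrix with an inner product of random features. The snippet below shows the generic idea on a plain RBF kernel via random Fourier features (Rahimi & Recht, 2007); the paper applies the same principle to the NNGP kernel, which is not reproduced here.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Features z(x) with z(x) @ z(y) ~= exp(-gamma * ||x - y||^2), the RBF
    kernel, so an n x n kernel matrix can be replaced by an n x n_features
    feature matrix."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10))
Z = random_fourier_features(X, n_features=2048, gamma=0.5, rng=rng)
K_exact = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
print(np.abs(Z @ Z.T - K_exact).max())   # error shrinks as n_features grows
```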
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled
Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on strong convexity.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Deletion and Insertion Tests in Regression Models [1.2891210250935148]
A basic task in explainable AI (XAI) is to identify the most important features behind a prediction made by a black box function $f$.
The insertion and deletion tests of Petsiuk et al. can be used to judge the quality of algorithms that rank pixels from most to least important for a classification.
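To make the test concrete, here is a minimal sketch of a deletion curve in the spirit of Petsiuk et al.: features are removed in the order a ranking claims is most important, replaced by a baseline value, and the model's output is recorded after each step; a faster drop (smaller area under the curve) indicates a better ranking. The model `f`, input `x`, ranking, and baseline are hypothetical placeholders, and the insertion test is the mirror image, starting from the baseline and adding features back.

```python
import numpy as np

def deletion_curve(f, x, ranking, baseline=0.0, steps=20):
    """Record f's prediction as the top-ranked features are progressively
    replaced by `baseline`.  The area under this curve scores the ranking:
    the smaller the area, the more the ranked features drove the prediction."""
    n = len(ranking)
    preds = [f(np.asarray(x, dtype=float))]
    for k in range(1, steps + 1):
        xk = np.array(x, dtype=float, copy=True)
        xk[ranking[: int(round(k * n / steps))]] = baseline
        preds.append(f(xk))
    return np.array(preds)

# Hypothetical example: a linear model, ranking features by |coefficient * value|.
rng = np.random.default_rng(0)
coef, x = rng.standard_normal(30), rng.standard_normal(30)
f = lambda v: float(coef @ v)
ranking = np.argsort(-np.abs(coef * x))
curve = deletion_curve(f, x, ranking)
print(curve.mean())          # simple normalized area under the deletion curve
```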
arXiv Detail & Related papers (2022-05-25T00:55:47Z) - Learning to extrapolate using continued fractions: Predicting the
critical temperature of superconductor materials [5.905364646955811]
In the field of Artificial Intelligence (AI) and Machine Learning (ML), the approximation of unknown target functions $y=f(\mathbf{x})$ is a common objective.
We refer to $S$ as the training set and aim to identify a low-complexity mathematical model that can effectively approximate this target function for new instances $\mathbf{x}$.
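For readers unfamiliar with the model class: a truncated continued fraction is a nested rational expression, and this rational structure is what lets it extrapolate differently from a polynomial of comparable complexity. The evaluator below is a univariate sketch with made-up coefficients; the paper fits multivariate analogues of this form.

```python
import numpy as np

def continued_fraction(x, coeffs):
    """Evaluate the truncated continued fraction
        f(x) = c0 + x / (c1 + x / (c2 + ... + x / ck))
    from the innermost term outward."""
    x = np.asarray(x, dtype=float)
    result = np.full_like(x, coeffs[-1])
    for c in reversed(coeffs[:-1]):
        result = c + x / result
    return result

# Example: coefficients [1, 2, 3] give f(x) = 1 + x / (2 + x / 3).
print(continued_fraction([0.0, 1.0, 2.0], [1.0, 2.0, 3.0]))
```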
arXiv Detail & Related papers (2020-11-27T04:57:40Z) - On Function Approximation in Reinforcement Learning: Optimism in the
Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.