A Theory of the Risk for Optimization with Relaxation and its
Application to Support Vector Machines
- URL: http://arxiv.org/abs/2004.05839v4
- Date: Mon, 8 Jan 2024 11:00:50 GMT
- Title: A Theory of the Risk for Optimization with Relaxation and its
Application to Support Vector Machines
- Authors: Marco C. Campi and Simone Garatti
- Abstract summary: We consider optimization with relaxation, a broad paradigm for making data-driven designs.
This approach was previously studied by the authors of the present work in Garatti and Campi (2019).
- Score: 3.045851438458641
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we consider optimization with relaxation, a broad paradigm for
making data-driven designs. This approach was previously studied by the
authors of the present work in Garatti and Campi (2019), a study that revealed a
deep-seated connection between two concepts: risk (probability of not
satisfying a new, out-of-sample, constraint) and complexity (according to a
definition introduced in Garatti and Campi (2019)). This connection was
shown to have profound practical implications because it implies that the
risk can be estimated from the complexity, a quantity that can be measured from
the data without any knowledge of the data-generation mechanism. In the present
work we establish new results. First, we expand the scope of Garatti and Campi
(2019) so as to embrace a more general setup that covers various algorithms in
machine learning. Then, we study classical support vector methods - including
SVM (Support Vector Machine), SVR (Support Vector Regression) and SVDD (Support
Vector Data Description) - and derive new results for the ability of these
methods to generalize. All results are valid for any finite size of the data
set. When the sample size tends to infinity, we establish the unprecedented
result that the risk approaches the ratio between the complexity and the
cardinality of the data sample, regardless of the value of the complexity.
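To make the headline result concrete, here is a minimal sketch (not the authors' code) of how such a risk estimate could be computed for an SVM: the number of support vectors serves as a stand-in for the paper's notion of complexity, and the ratio complexity/N is compared against a hold-out error rate. The dataset, kernel, and hyperparameters are illustrative assumptions.
```python
# Hedged sketch: estimate the risk of an SVM as complexity / sample size,
# using the number of support vectors as a proxy for the paper's complexity.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=4000, n_features=10, random_state=0)
X_train, y_train = X[:2000], y[:2000]   # N = 2000 design points
X_test, y_test = X[2000:], y[2000:]     # fresh, out-of-sample points

clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)  # illustrative settings

N = len(X_train)
complexity = clf.support_.size    # number of support vectors
risk_estimate = complexity / N    # the ratio the theory converges to

hold_out_error = 1.0 - clf.score(X_test, y_test)  # empirical out-of-sample risk
print(f"complexity/N = {risk_estimate:.3f}, hold-out error = {hold_out_error:.3f}")
```
As the abstract notes, the theory makes this estimate valid for any finite N, without any knowledge of the data-generation mechanism; the hold-out comparison above is only a sanity check.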
Related papers
- Collaborative Learning with Different Labeling Functions [7.228285747845779]
We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions.
We show that, when the data distributions satisfy a weaker realizability assumption, sample-efficient learning is still feasible.
arXiv Detail & Related papers (2024-02-16T04:32:22Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - RGM: A Robust Generalizable Matching Model [49.60975442871967]
We propose a deep model for sparse and dense matching, termed RGM (Robust Generalist Matching)
To narrow the gap between synthetic training samples and real-world scenarios, we build a new, large-scale dataset with sparse correspondence ground truth.
We are able to mix up various dense and sparse matching datasets, significantly improving the training diversity.
arXiv Detail & Related papers (2023-10-18T07:30:08Z) - CORE: Common Random Reconstruction for Distributed Optimization with
Provable Low Communication Complexity [110.50364486645852]
Communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines.
We propose Common Random Reconstruction (CORE), which can be used to compress the information transmitted between machines.
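As a rough illustration of the common-random-reconstruction idea (a sketch under assumptions, not the paper's actual compressor), machines sharing a random seed can regenerate the same Gaussian directions, so only a few projection coefficients need to travel over the network:
```python
# Hypothetical sketch: compress a d-dimensional vector to m coefficients
# using random directions that sender and receiver regenerate from a
# shared seed (m << d); the reconstruction is unbiased in expectation.
import numpy as np

def compress(vec, m, seed):
    A = np.random.default_rng(seed).standard_normal((m, vec.size)) / np.sqrt(m)
    return A @ vec  # transmit m numbers instead of d

def reconstruct(coeffs, d, seed):
    A = np.random.default_rng(seed).standard_normal((coeffs.size, d)) / np.sqrt(coeffs.size)
    return A.T @ coeffs  # E[A.T @ A] is the identity, so E[g_hat] = g

g = np.random.default_rng(1).standard_normal(1000)
g_hat = reconstruct(compress(g, m=100, seed=42), d=1000, seed=42)
```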
arXiv Detail & Related papers (2023-09-23T08:45:27Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs a local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique, which has a lower worst-case time complexity and reduces the memory complexity from $O(n^2)$ to $O(kn)$ with $k \ll n$.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Optimal Heterogeneous Collaborative Linear Regression and Contextual
Bandits [34.121889149071684]
We study collaborative linear regression and contextual bandits, where each instance's associated parameters are equal to a global parameter plus a sparse instance-specific term.
We propose a novel two-stage estimator called MOLAR that leverages this structure by first constructing an entry-wise median of the instances' linear regression estimates, and then shrinking the instance-specific estimates towards the median.
We then apply MOLAR to develop methods for sparsely heterogeneous collaborative contextual bandits, which lead to improved regret guarantees compared to independent bandit methods.
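A minimal numpy sketch of that two-stage idea (our simplified reading, with a plain convex-combination shrinkage standing in for the paper's data-driven rule):
```python
# Hedged sketch of a median-then-shrink estimator in the spirit of MOLAR.
import numpy as np

def median_then_shrink(estimates, weight):
    """estimates: (k, d) array of per-instance linear-regression estimates.
    weight in [0, 1]: pull toward the entry-wise median (the paper selects
    the amount of shrinkage from the data; here it is a free parameter)."""
    med = np.median(estimates, axis=0)               # stage 1: entry-wise median
    return (1 - weight) * estimates + weight * med   # stage 2: shrinkage
```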
arXiv Detail & Related papers (2023-06-09T22:48:13Z) - Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features [0.11184789007828977]
We focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor vector.
We design a novel Monte Carlo sampling algorithm that estimates them at a reduced complexity that depends linearly on the size of the background dataset.
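For flavor, here is a generic permutation-sampling estimator of Shapley-style attributions for the marginal game (a textbook Monte Carlo scheme, not the paper's product-space sampler; `model_predict`, the draw count, and the background sampling are illustrative assumptions):
```python
# Hedged sketch: Monte Carlo estimate of marginal-game attributions for one
# input x, using a background dataset to fill in "absent" features.
import numpy as np

def mc_attributions(model_predict, x, background, n_draws=200, seed=0):
    rng = np.random.default_rng(seed)
    phi = np.zeros(x.size)
    for _ in range(n_draws):
        z = background[rng.integers(len(background))].copy()  # baseline point
        prev = model_predict(z[None, :])[0]
        for j in rng.permutation(x.size):   # reveal features in random order
            z[j] = x[j]
            cur = model_predict(z[None, :])[0]
            phi[j] += cur - prev            # marginal contribution of feature j
            prev = cur
    return phi / n_draws
```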
arXiv Detail & Related papers (2023-03-17T19:17:06Z) - GEC: A Unified Framework for Interactive Decision Making in MDP, POMDP,
and Beyond [101.5329678997916]
We study sample efficient reinforcement learning (RL) under the general framework of interactive decision making.
We propose a novel complexity measure, generalized eluder coefficient (GEC), which characterizes the fundamental tradeoff between exploration and exploitation.
We show that RL problems with low GEC form a remarkably rich class, which subsumes low Bellman eluder dimension problems, bilinear class, low witness rank problems, PO-bilinear class, and generalized regular PSR.
arXiv Detail & Related papers (2022-11-03T16:42:40Z) - Deep Active Ensemble Sampling For Image Classification [8.31483061185317]
Active learning frameworks aim to reduce the cost of data annotation by actively requesting the labeling for the most informative data points.
Proposed approaches include uncertainty-based techniques, geometric methods, and implicit combinations of uncertainty-based and geometric approaches.
We present an innovative integration of recent progress in both uncertainty-based and geometric frameworks to enable an efficient exploration/exploitation trade-off in sample selection strategy.
Our framework provides two advantages: (1) accurate posterior estimation, and (2) a tunable trade-off between computational overhead and higher accuracy.
arXiv Detail & Related papers (2022-10-11T20:20:20Z) - SreaMRAK a Streaming Multi-Resolution Adaptive Kernel Algorithm [60.61943386819384]
Existing implementations of kernel ridge regression (KRR) require that all the data is stored in the main memory.
We propose StreaMRAK - a streaming version of KRR.
We present a showcase study on two synthetic problems and the prediction of the trajectory of a double pendulum.
arXiv Detail & Related papers (2021-08-23T21:03:09Z) - Semi-supervised Vector-valued Learning: Improved Bounds and Algorithms [20.53130700587322]
We derive novel semi-supervised excess risk bounds for general vector-valued learning from both the kernel and the linear perspectives.
Motivated by our theoretical analysis, we propose a general semi-supervised algorithm for efficiently learning vector-valued functions.
arXiv Detail & Related papers (2019-09-11T07:30:53Z)