Support vector machines and Radon's theorem
- URL: http://arxiv.org/abs/2011.00617v4
- Date: Fri, 16 Sep 2022 14:39:12 GMT
- Title: Support vector machines and Radon's theorem
- Authors: Henry Adams, Elin Farnell, Brittany Story
- Abstract summary: A support vector machine (SVM) is an algorithm that finds a hyperplane which optimally separates labeled data points in $\mathbb{R}^n$ into positive and negative classes.
We connect the possible configurations of support vectors to Radon's theorem, which provides guarantees for when a set of points can be divided into two classes.
- Score: 0.5027571997864707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A support vector machine (SVM) is an algorithm that finds a hyperplane which
optimally separates labeled data points in $\mathbb{R}^n$ into positive and
negative classes. The data points on the margin of this separating hyperplane
are called support vectors. We connect the possible configurations of support
vectors to Radon's theorem, which provides guarantees for when a set of points
can be divided into two classes (positive and negative) whose convex hulls
intersect. If the convex hulls of the positive and negative support vectors are
projected onto a separating hyperplane, then the projections intersect if and
only if the hyperplane is optimal. Further, with a particular type of general
position, we show that (a) the projected convex hulls of the support vectors
intersect in exactly one point, (b) the support vectors are stable under
perturbation, (c) there are at most $n+1$ support vectors, and (d) every number
of support vectors from 2 up to $n+1$ is possible. Finally, we perform computer
simulations studying the expected number of support vectors, and their
configurations, for randomly generated data. We observe that as the distance
between classes of points increases for this type of randomly generated data,
configurations with fewer support vectors become more likely.
Related papers
- Tverberg's theorem and multi-class support vector machines [0.0]
We show how to design new models of multi-class support vector machines (SVMs).
These protocols require fewer conditions to classify sets of points, and can be computed using existing binary SVM algorithms in higher-dimensional spaces.
We give a new simple proof of a geometric characterization of support vectors for largest margin SVMs by Veelaert.
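For context only: the entry above reduces multi-class problems to binary SVMs. The snippet below is a plain one-vs-rest reduction with scikit-learn, shown as a familiar baseline; it is not the Tverberg-based construction of the paper.

```python
# Baseline only: a standard one-vs-rest reduction that reuses binary linear SVMs.
# This is NOT the Tverberg-based multi-class construction described in the paper.
from sklearn.datasets import make_blobs
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = make_blobs(n_samples=300, centers=4, n_features=2, random_state=0)
clf = OneVsRestClassifier(LinearSVC()).fit(X, y)   # one binary SVM per class
print("training accuracy:", clf.score(X, y))
```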
arXiv Detail & Related papers (2024-04-25T16:37:58Z)
- Detection of Correlated Random Vectors [7.320365821066746]
We study the problem of deciding whether two random vectors $\mathsf{X}\in\mathbb{R}^n$ and $\mathsf{Y}\in\mathbb{R}^n$ are correlated or not.
We analyze the thresholds at which optimal testing is information-theoretically impossible and possible.
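As a toy illustration of the detection problem (not the paper's optimal test), one can compare the inner product of the two vectors against its permutation null:

```python
# Toy detection sketch (not the paper's optimal test): permutation test on the
# inner-product statistic |<X, Y>| to decide "correlated" vs "independent".
import numpy as np

rng = np.random.default_rng(0)
n, rho = 500, 0.3
X = rng.normal(size=n)
Y = rho * X + np.sqrt(1 - rho**2) * rng.normal(size=n)   # correlated alternative

stat = abs(X @ Y)
null = np.array([abs(X @ rng.permutation(Y)) for _ in range(2000)])
p_value = (1 + np.sum(null >= stat)) / (1 + len(null))
print("p-value:", p_value)   # small value -> evidence of correlation
```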
arXiv Detail & Related papers (2024-01-24T12:58:08Z)
- Data-Driven Linear Complexity Low-Rank Approximation of General Kernel Matrices: A Geometric Approach [0.9453554184019107]
A kernel matrix may be defined as $K_{ij} = \kappa(x_i, y_j)$, where $\kappa(x, y)$ is a kernel function.
We seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large.
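The paper develops a geometric, linear-complexity method; as a generic point of comparison, a cross (CUR/Nyström-style) skeleton built from a few landmark points already gives a low-rank factorization of such a rectangular kernel matrix. A hedged sketch with a Gaussian kernel:

```python
# Generic CUR / Nystrom-style skeleton approximation of a rectangular kernel matrix
# K_ij = kappa(x_i, y_j); illustration only, not the paper's geometric method.
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
Y = rng.normal(size=(800, 3))

r = 60                                        # number of landmark points per side
ix = rng.choice(len(X), r, replace=False)
iy = rng.choice(len(Y), r, replace=False)

C = gaussian_kernel(X, Y[iy])                 # n x r column block
W = gaussian_kernel(X[ix], Y[iy])             # r x r core block
R = gaussian_kernel(X[ix], Y)                 # r x m row block
K_approx = C @ np.linalg.pinv(W) @ R          # rank <= r approximation

K = gaussian_kernel(X, Y)
print("relative error:", np.linalg.norm(K - K_approx) / np.linalg.norm(K))
```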
arXiv Detail & Related papers (2022-12-24T07:15:00Z)
- Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices [7.752212921476838]
We use the vector-valued distance to compute distances and extract geometric information from the manifold of symmetric positive definite matrices.
We develop gyrovector calculus, constructing analogs of vector space operations in this curved space.
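A minimal sketch of one natural reading of the vector-valued distance (an assumption on my part: the sorted log-eigenvalues of $P^{-1/2} Q P^{-1/2}$, whose Euclidean norm recovers the usual affine-invariant distance between SPD matrices):

```python
# Sketch under an assumption: take the vector-valued distance between SPD matrices
# P, Q to be the sorted log-eigenvalues of P^{-1/2} Q P^{-1/2}; its Euclidean norm
# is then the classical affine-invariant distance.
import numpy as np

def inv_sqrt_spd(P):
    w, U = np.linalg.eigh(P)          # P = U diag(w) U^T with w > 0
    return (U / np.sqrt(w)) @ U.T     # P^{-1/2}

def vector_valued_distance(P, Q):
    S = inv_sqrt_spd(P)
    return np.sort(np.log(np.linalg.eigvalsh(S @ Q @ S)))

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)); P = A @ A.T + 4 * np.eye(4)
B = rng.normal(size=(4, 4)); Q = B @ B.T + 4 * np.eye(4)

v = vector_valued_distance(P, Q)
print("vector-valued distance:", v)
print("affine-invariant distance:", np.linalg.norm(v))
```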
arXiv Detail & Related papers (2021-10-26T08:17:51Z)
- Spectral properties of sample covariance matrices arising from random matrices with independent non identically distributed columns [50.053491972003656]
It was previously shown that the functionals $\text{tr}(AR(z))$, for $R(z) = (\frac{1}{n}XX^T - zI_p)^{-1}$ and $A \in \mathcal{M}_p$ deterministic, have a standard deviation of order $O(\|A\|_* / \sqrt{n})$.
Here, we show that $\|\mathbb{E}[R(z)] - \tilde{R}(z)\|_F$
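The scale of those fluctuations is easy to probe numerically; a quick hedged check (not the paper's analysis) that simulates $\text{tr}(A R(z))$ over independent draws of $X$:

```python
# Hedged numerical check (not the paper's analysis): estimate the standard deviation
# of tr(A R(z)) for R(z) = (X X^T / n - z I_p)^{-1} over independent draws of X.
import numpy as np

p, n, z = 50, 200, -1.0            # z < 0 keeps the resolvent well defined
A = np.eye(p)                      # simple deterministic test matrix
rng = np.random.default_rng(1)

vals = []
for _ in range(200):
    X = rng.normal(size=(p, n))
    R = np.linalg.inv(X @ X.T / n - z * np.eye(p))
    vals.append(np.trace(A @ R))

print("std of tr(A R(z)):", np.std(vals))   # small, shrinking roughly like 1/sqrt(n)
```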
arXiv Detail & Related papers (2021-09-06T14:21:43Z)
- Pseudo-Euclidean Attract-Repel Embeddings for Undirected Graphs [73.0261182389643]
Dot product embeddings take a graph and construct vectors for nodes such that dot products between two vectors give the strength of the edge.
We remove the transitivity assumption by embedding nodes into a pseudo-Euclidean space.
Pseudo-Euclidean embeddings can compress networks efficiently, allow for multiple notions of nearest neighbors each with their own interpretation, and can be 'slotted' into existing models.
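A minimal sketch of the attract-repel idea (generic, under my reading of the summary): each node gets an "attract" and a "repel" component, and edge strength is scored by an indefinite inner product, which is what removes the transitivity forced by ordinary dot products.

```python
# Sketch of a pseudo-Euclidean (attract-repel) embedding score: edge strength is
# <a_u, a_v> - <r_u, r_v>, an indefinite inner product rather than a plain dot product.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d_attract, d_repel = 6, 4, 2
A = rng.normal(size=(n_nodes, d_attract))   # "attract" coordinates per node
R = rng.normal(size=(n_nodes, d_repel))     # "repel" coordinates per node

def edge_score(u, v):
    return A[u] @ A[v] - R[u] @ R[v]

# Unlike plain dot-product embeddings, a node can even score low against itself,
# so strong (u, v) and (v, w) edges need not force a strong (u, w) edge.
print(edge_score(0, 1), edge_score(0, 0))
```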
arXiv Detail & Related papers (2021-06-17T17:23:56Z)
- Non-PSD Matrix Sketching with Applications to Regression and Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-roots" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z)
- A Precise Performance Analysis of Support Vector Regression [105.94855998235232]
We study the hard and soft support vector regression techniques applied to a set of $n$ linear measurements.
Our results are then used to optimally tune the parameters involved in the design of hard and soft support vector regression algorithms.
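For notation's sake, here is a minimal soft (epsilon-insensitive) SVR fit on noisy linear measurements with scikit-learn; the paper's contribution is the precise asymptotic analysis and the resulting parameter tuning, not the fit itself.

```python
# Minimal soft (epsilon-insensitive) support vector regression on n noisy linear
# measurements; illustration only -- the paper derives the asymptotic performance
# of such estimators and how to tune C and epsilon optimally.
import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.default_rng(0)
n, p = 400, 20
w_true = rng.normal(size=p)
X = rng.normal(size=(n, p))
y = X @ w_true + 0.1 * rng.normal(size=n)   # linear measurements plus noise

reg = LinearSVR(C=1.0, epsilon=0.1, max_iter=10000).fit(X, y)
print("estimation error:", np.linalg.norm(reg.coef_ - w_true))
```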
arXiv Detail & Related papers (2021-05-21T14:26:28Z)
- Learning Sparse Graph Laplacian with K Eigenvector Prior via Iterative GLASSO and Projection [58.5350491065936]
We consider a structural assumption on the graph Laplacian matrix $L$.
The first $K$ eigenvectors of $L$ are pre-selected, e.g., based on domain-specific criteria.
We design an efficient hybrid graphical lasso/projection algorithm to compute the most suitable graph Laplacian matrix $L^* \in H_u^+$ given $\bar{C}$.
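For orientation only, the graphical-lasso half of such a hybrid scheme can be sketched with scikit-learn's `graphical_lasso` on an empirical covariance standing in for $\bar{C}$; the projection onto the set of Laplacians with the $K$ prescribed eigenvectors is the paper's contribution and is not reproduced here.

```python
# Sketch of the graphical-lasso half of a GLASSO/projection scheme: estimate a sparse
# inverse covariance from an empirical covariance (a stand-in for \bar{C}). The
# projection onto Laplacians with K prescribed eigenvectors is NOT reproduced here.
import numpy as np
from sklearn.covariance import graphical_lasso

rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 10))
C_bar = np.cov(samples, rowvar=False)               # empirical covariance

_, precision = graphical_lasso(C_bar, alpha=0.2)    # sparse precision estimate
print("nonzeros in precision:", np.count_nonzero(np.abs(precision) > 1e-8))
```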
arXiv Detail & Related papers (2020-10-25T18:12:50Z)
- On the proliferation of support vectors in high dimensions [24.63581896788434]
The support vector machine (SVM) is a well-established classification method whose name refers to the particular training examples, called support vectors.
Recent research has shown that in sufficiently high-dimensional linear classification problems, the SVM can generalize well despite a proliferation of support vectors.
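A quick empirical version of that observation (generic, not the paper's setup): fix $n$, grow the ambient dimension, and count how many training points the max-margin linear SVM uses as support vectors.

```python
# Quick empirical illustration (generic, not the paper's setup): with n fixed and a
# weak one-coordinate class signal, count the support vectors of a near-max-margin
# linear SVM as the ambient dimension p grows.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 100
y = np.hstack([np.ones(n // 2), -np.ones(n // 2)])

for p in (2, 10, 100, 1000):
    X = rng.normal(size=(n, p))
    X[:, 0] += 1.5 * y                                 # weak class signal
    clf = SVC(kernel="linear", C=1e6).fit(X, y)        # large C ~ max margin
    print(f"p={p}: {len(clf.support_)} support vectors out of {n}")
```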
arXiv Detail & Related papers (2020-09-22T16:45:06Z)
- Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing an $n\times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y) = -\log\langle\varphi(x), \varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$, with $r \ll n$.
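A minimal sketch of the resulting linear-time iterations, under the assumption that the kernel is kept in factored form $K = \Phi_x \Phi_y^\top$ with nonnegative random features standing in for the paper's positive feature maps $\varphi$; each Sinkhorn update then costs $O(nr)$ instead of $O(n^2)$.

```python
# Sketch: Sinkhorn iterations with a factored kernel K = Phi_x @ Phi_y.T, so every
# matrix-vector product costs O(n r) instead of O(n^2). Random nonnegative features
# stand in here for the paper's positive feature maps varphi.
import numpy as np

rng = np.random.default_rng(0)
n, r = 2000, 32
Phi_x = rng.random((n, r))                 # nonnegative features -> positive kernel
Phi_y = rng.random((n, r))
a = np.full(n, 1.0 / n)                    # source marginal
b = np.full(n, 1.0 / n)                    # target marginal

u, v = np.ones(n), np.ones(n)
for _ in range(200):
    u = a / (Phi_x @ (Phi_y.T @ v))        # K v   in O(n r)
    v = b / (Phi_y @ (Phi_x.T @ u))        # K^T u in O(n r)

# Row sums of the plan diag(u) K diag(v) should match the source marginal a.
row_sums = u * (Phi_x @ (Phi_y.T @ v))
print("max marginal error:", np.abs(row_sums - a).max())
```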
arXiv Detail & Related papers (2020-06-12T10:21:40Z)