Classification by Separating Hypersurfaces: An Entropic Approach
- URL: http://arxiv.org/abs/2507.02732v1
- Date: Thu, 03 Jul 2025 15:43:54 GMT
- Title: Classification by Separating Hypersurfaces: An Entropic Approach
- Authors: Argimiro Arratia, Mahmoud El Daou, Henryk Gzyl
- Abstract summary: We consider the classification problem of individuals characterized by a set of attributes represented as a vector in ${\mathbb R}^N$. The goal is to find a hyperplane in ${\mathbb R}^N$ that separates two sets of points corresponding to two distinct classes. We propose a novel approach by searching for a vector of parameters in a bounded $N$-dimensional hypercube.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider the following classification problem: Given a population of individuals characterized by a set of attributes represented as a vector in ${\mathbb R}^N$, the goal is to find a hyperplane in ${\mathbb R}^N$ that separates two sets of points corresponding to two distinct classes. This problem, with a history dating back to the perceptron model, remains central to machine learning. In this paper we propose a novel approach by searching for a vector of parameters in a bounded $N$-dimensional hypercube centered at the origin and a positive vector in ${\mathbb R}^M$, obtained through the minimization of an entropy-based function defined over the space of unknown variables. The method extends to polynomial surfaces, allowing the separation of data points by more complex decision boundaries. This provides a robust alternative to traditional linear or quadratic optimization techniques, such as support vector machines and gradient descent. Numerical experiments demonstrate the efficiency and versatility of the method in handling diverse classification tasks, including linear and non-linear separability.
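The abstract does not reproduce the entropy-based functional itself, so the following is only a minimal sketch of the setup it describes: a weight vector confined to a bounded hypercube, found here by minimizing a stand-in cross-entropy (logistic) objective with projected gradient steps. The surrogate loss and all names are illustrative assumptions, not the authors' method.

```python
import numpy as np

def fit_separating_hyperplane(X, y, box=1.0, lr=0.1, n_iter=2000):
    """Find (w, b) with w constrained to the hypercube [-box, box]^N.

    The paper's entropy-based objective is not given in the abstract;
    as a stand-in, this sketch minimizes a logistic (cross-entropy)
    loss and projects w back onto the hypercube after every step.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        margins = np.clip(y * (X @ w + b), -30, 30)  # y in {-1, +1}
        p = 1.0 / (1.0 + np.exp(margins))            # sigmoid(-margin)
        w -= lr * (-(X.T @ (y * p)) / n)             # gradient step in w
        b -= lr * (-np.mean(y * p))                  # gradient step in b
        w = np.clip(w, -box, box)                    # hypercube projection
    return w, b

# Toy example: two Gaussian blobs, linearly separable
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])
w, b = fit_separating_hyperplane(X, y)
print("train accuracy:", np.mean(np.sign(X @ w + b) == y))
```

The polynomial extension mentioned in the abstract amounts to running the same routine after mapping each input to polynomial features (e.g. appending pairwise products), which turns the learned hyperplane into a polynomial decision boundary.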
Related papers
- Inequalities for Optimization of Classification Algorithms: A Perspective Motivated by Diagnostic Testing [0.0]
We show how two main tasks in diagnostics can be recast in terms of a variation on the confusion (or error) matrix $\boldsymbol{\mathrm{P}}$. We show that the largest Gershgorin radius $\boldsymbol{\rho}_m$ of the matrix $\mathbb{I}-\boldsymbol{\mathrm{P}}$ yields uniform error bounds for both classification and prevalence estimation.
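For context, the Gershgorin radius of row $m$ is standard linear algebra: the sum of the absolute values of the off-diagonal entries of that row. A minimal sketch of computing the largest radius of $\mathbb{I}-\boldsymbol{\mathrm{P}}$ for a row-normalized confusion matrix follows; how the paper turns this quantity into its uniform bounds is not reproduced here.

```python
import numpy as np

def largest_gershgorin_radius(P):
    """Largest Gershgorin radius of I - P.

    P: row-normalized confusion matrix (each row sums to 1).
    The radius of row m is the sum of the absolute values of the
    off-diagonal entries of I - P in that row.
    """
    A = np.eye(P.shape[0]) - P
    off_diag = np.abs(A) - np.diag(np.abs(np.diag(A)))  # zero the diagonal
    return np.max(off_diag.sum(axis=1))

# Example: a 3-class confusion matrix
P = np.array([[0.90, 0.07, 0.03],
              [0.05, 0.85, 0.10],
              [0.02, 0.08, 0.90]])
print(largest_gershgorin_radius(P))  # 0.15 for this matrix
```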
arXiv Detail & Related papers (2025-08-01T20:51:32Z)
- The Generative Leap: Sharp Sample Complexity for Efficiently Learning Gaussian Multi-Index Models [71.5283441529015]
In this work we consider generic Gaussian Multi-index models, in which the labels only depend on the (Gaussian) $d$-dimensional inputs through their projection onto a low-dimensional $r = O_d(1)$ subspace. We introduce the generative leap exponent $k^\star$, a natural extension of the generative exponent from [Damian et al. '24] to the multi-index setting.
arXiv Detail & Related papers (2025-06-05T18:34:56Z)
- A Convex formulation for linear discriminant analysis [1.3124513975412255]
We present a supervised dimensionality reduction technique called Convex Linear Discriminant Analysis (ConvexLDA). We show that ConvexLDA outperforms several popular linear discriminant analysis (LDA)-based methods on a range of high-dimensional biological and image data sets.
arXiv Detail & Related papers (2025-03-17T18:17:49Z)
- Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in Linear Models [76.52307406752556]
We derive a novel deterministic equivalence for the two-point function of a random resolvent. We give a unified derivation of the performance of a wide variety of high-dimensional linear models trained with stochastic gradient descent.
arXiv Detail & Related papers (2025-02-07T16:45:40Z)
- Computational-Statistical Gaps in Gaussian Single-Index Models [77.1473134227844]
Single-Index Models are high-dimensional regression problems with planted structure.
We show that computationally efficient algorithms, both within the Statistical Query (SQ) and the Low-Degree Polynomial (LDP) frameworks, necessarily require $\Omega(d^{k^\star/2})$ samples.
arXiv Detail & Related papers (2024-03-08T18:50:19Z)
- Registration of algebraic varieties using Riemannian optimization [0.0]
We consider the problem of finding a transformation between two point clouds that represent the same object but are expressed in different coordinate systems.
Our approach is not based on a point-to-point correspondence that matches every point in the source point cloud to a point in the target point cloud.
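As a point of contrast, here is a crude correspondence-free alignment that matches the principal axes of the two clouds. It is a generic stand-in for illustration, not the paper's variety-based Riemannian method, and it inherits the usual sign ambiguity of eigenvectors.

```python
import numpy as np

def align_by_moments(source, target):
    """Correspondence-free alignment: center both clouds, then rotate
    the source so its principal axes match the target's.
    (An illustrative stand-in, not the paper's algorithm.)
    """
    s = source - source.mean(axis=0)
    t = target - target.mean(axis=0)
    _, Vs = np.linalg.eigh(np.cov(s.T))   # principal axes of the source
    _, Vt = np.linalg.eigh(np.cov(t.T))   # principal axes of the target
    R = Vt @ Vs.T                         # maps source axes onto target axes
    if np.linalg.det(R) < 0:              # keep it a proper rotation
        Vs[:, 0] *= -1
        R = Vt @ Vs.T
    return s @ R.T + target.mean(axis=0)
```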
arXiv Detail & Related papers (2024-01-16T18:47:38Z)
- MOCK: an Algorithm for Learning Nonparametric Differential Equations via Multivariate Occupation Kernel Functions [0.6030884970981525]
Learning a nonparametric system of ordinary differential equations from trajectories in a $d$-dimensional state space requires learning $d$ functions of $d$ variables. Explicit formulations often scale quadratically in $d$ unless additional knowledge about system properties, such as sparsity and symmetries, is available. We propose a linear approach, the multivariate occupation kernel method (MOCK), using the implicit formulation provided by vector-valued reproducing kernel Hilbert spaces.
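A minimal sketch of the occupation-kernel idea the summary alludes to, under simplifying assumptions (a scalar Gaussian kernel shared across output dimensions, Riemann-sum integrals, ridge regularization); the paper's exact vector-valued RKHS formulation may differ. Each trajectory contributes one linear constraint: the integral of the vector field along it equals the endpoint displacement.

```python
import numpy as np

def gauss_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix between point sets A (m, d) and B (n, d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fit_occupation_kernel(trajs, dt, lam=1e-6, sigma=1.0):
    """Hedged sketch of occupation-kernel regression for dot(x) = f(x).

    trajs: list of arrays, each (T_i, d), sampled at time step dt.
    f is expanded in occupation kernels, one per trajectory; all
    integrals are approximated by Riemann sums.
    """
    M = len(trajs)
    G = np.zeros((M, M))
    for i, gi in enumerate(trajs):
        for j, gj in enumerate(trajs):
            G[i, j] = gauss_kernel(gi, gj, sigma).sum() * dt * dt
    B = np.stack([g[-1] - g[0] for g in trajs])   # (M, d) displacements
    W = np.linalg.solve(G + lam * np.eye(M), B)   # (M, d) coefficients

    def f_hat(x):
        # occupation-kernel features: integral of k(x, gamma_j(t)) dt
        phi = np.array([gauss_kernel(x[None], g, sigma).sum() * dt
                        for g in trajs])
        return phi @ W
    return f_hat
```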
arXiv Detail & Related papers (2023-06-16T21:49:36Z)
- Sparse Linear Centroid-Encoder: A Convex Method for Feature Selection [1.057079240576682]
We present the Sparse Linear Centroid-Encoder (SLCE), a novel feature selection technique.
The algorithm uses a linear network to reconstruct each point as its class centroid while performing feature selection at the same time.
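A hedged sketch consistent with that description: a linear map trained to send each sample to its class centroid, with an $\ell_1$ proximal step inducing sparsity, and selected features read off from the surviving rows. The actual SLCE objective and optimizer may differ.

```python
import numpy as np

def sparse_linear_centroid_encoder(X, y, lam=0.1, lr=0.01, n_iter=500):
    """Hedged sketch: train W so that X @ W approximates each sample's
    class centroid, with l1 soft-thresholding (proximal gradient) to
    zero out rows of W. Surviving rows are the 'selected' features.
    """
    classes = np.unique(y)
    centroids = {c: X[y == c].mean(axis=0) for c in classes}
    T = np.stack([centroids[c] for c in y])   # per-sample target centroid
    n, d = X.shape
    W = np.zeros((d, d))
    for _ in range(n_iter):
        grad = X.T @ (X @ W - T) / n                              # squared-error gradient
        W -= lr * grad
        W = np.sign(W) * np.maximum(np.abs(W) - lr * lam, 0.0)   # prox step
    selected = np.where(np.abs(W).sum(axis=1) > 1e-8)[0]
    return W, selected
```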
arXiv Detail & Related papers (2023-06-07T23:07:55Z)
- Deep Learning Approximation of Diffeomorphisms via Linear-Control Systems [91.3755431537592]
We consider a control system of the form $\dot x = \sum_{i=1}^{l} F_i(x)\,u_i$, with linear dependence in the controls.
We use the corresponding flow to approximate the action of a diffeomorphism on a compact ensemble of points.
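The stated system can be simulated directly; the sketch below integrates $\dot x = \sum_{i=1}^{l} F_i(x)\,u_i$ with forward Euler under piecewise-constant controls and pushes an ensemble of points through the resulting flow map. The vector fields and control values are illustrative placeholders.

```python
import numpy as np

def flow(points, fields, controls, dt=0.01, steps_per_interval=100):
    """Push an ensemble of points through the flow of
    dot(x) = sum_i F_i(x) * u_i with piecewise-constant controls.

    points:   (n, d) ensemble
    fields:   list of callables F_i: (n, d) -> (n, d)
    controls: (n_intervals, l) control values, constant per interval
    """
    x = points.copy()
    for u in controls:                       # one constant control per interval
        for _ in range(steps_per_interval):
            v = sum(ui * F(x) for ui, F in zip(u, fields))
            x = x + dt * v                   # forward Euler step
    return x

# Illustrative placeholders: two vector fields in the plane
F1 = lambda x: np.stack([-x[:, 1], x[:, 0]], axis=1)   # rotation field
F2 = lambda x: x                                       # dilation field
pts = np.random.default_rng(1).normal(size=(200, 2))
out = flow(pts, [F1, F2], controls=np.array([[1.0, 0.2], [0.5, -0.1]]))
```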
arXiv Detail & Related papers (2021-10-24T08:57:46Z)
- Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees [106.91654068632882]
We consider bipartite graphs and formalize their representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
We show that the proposed objective is strongly convex in a neighborhood around the ground truth, so that a gradient descent-based method achieves linear convergence rate.
Our estimator is robust to any model misspecification within the exponential family, which is validated in extensive experiments.
arXiv Detail & Related papers (2020-03-02T16:40:36Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions for which the chain rule of calculus holds.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.