An efficient, provably exact, practical algorithm for the 0-1 loss linear classification problem
- URL: http://arxiv.org/abs/2306.12344v2
- Date: Wed, 2 Aug 2023 21:22:26 GMT
- Title: An efficient, provably exact, practical algorithm for the 0-1 loss linear classification problem
- Authors: Xi He, Waheed Ul Rahman, Max A. Little
- Abstract summary: We show that incremental cell enumeration (ICE) can solve the 0-1 loss classification problem exactly in polynomial time.
This is the first, rigorously-proven, practical algorithm for this long-standing problem.
- Score: 4.418462313508022
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Algorithms for solving the linear classification problem have a long history,
dating back at least to 1936 with linear discriminant analysis. For linearly
separable data, many algorithms can obtain the exact solution to the
corresponding 0-1 loss classification problem efficiently, but for data which
is not linearly separable, it has been shown that this problem, in full
generality, is NP-hard. Alternative approaches all involve approximations of
some kind, including the use of surrogates for the 0-1 loss (for example, the
hinge or logistic loss) or approximate combinatorial search, none of which can
be guaranteed to solve the problem exactly. Finding efficient algorithms to
obtain an exact, i.e. globally optimal, solution for the 0-1 loss linear
classification problem with fixed dimension remains an open problem. In the
research we report here, we detail the rigorous construction of a new
algorithm, incremental cell enumeration (ICE), that can solve the 0-1 loss
classification problem exactly in polynomial time. We prove correctness using
concepts from the theory of hyperplane arrangements and oriented matroids. We
demonstrate the effectiveness of this algorithm on synthetic and real-world
datasets, showing optimal accuracy both in- and out-of-sample, in practical
computational time. We also empirically demonstrate how the use of approximate
upper bounds improves the algorithm's polynomial run-time whilst retaining
exactness. To our knowledge, this is the first rigorously-proven, practical,
polynomial-time algorithm for this long-standing problem.
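For intuition about why fixed dimension makes exact search tractable: the 0-1 loss is piecewise constant over the cells of the hyperplane arrangement the data induce in parameter space, so an exact minimizer can be found by evaluating one representative classifier per cell, and the number of cells is polynomial in n for fixed dimension. The sketch below is not the authors' ICE algorithm, only a minimal O(n^3) brute-force version of this cell-representative idea for 2D data; all function names and the eps nudge are illustrative assumptions.

```python
import itertools
import numpy as np

def zero_one_loss(w, b, X, y):
    """Count sign disagreements between sign(w.x + b) and labels y in {-1, +1}."""
    pred = np.where(X @ w + b >= 0, 1, -1)
    return int(np.sum(pred != y))

def exact_01_line_2d(X, y, eps=1e-9):
    """Brute-force exact 0-1 loss linear classifier in 2D.

    Some optimal boundary can always be perturbed to pass through two
    data points, so trying every such line (both orientations, nudged
    off the points) plus the constant classifiers is exhaustive."""
    n = len(X)
    # Baseline: constant classifiers (w = 0, predict all +1 or all -1).
    best_w, best_b = np.zeros(2), 1.0
    best_loss = zero_one_loss(best_w, best_b, X, y)
    loss_neg = zero_one_loss(best_w, -1.0, X, y)
    if loss_neg < best_loss:
        best_b, best_loss = -1.0, loss_neg
    for i, j in itertools.combinations(range(n), 2):
        d = X[j] - X[i]
        if np.allclose(d, 0):
            continue                          # coincident points define no line
        w = np.array([-d[1], d[0]])           # normal of the line through i and j
        w /= np.linalg.norm(w)
        b0 = -w @ X[i]
        for s in (1.0, -1.0):                 # both orientations of the halfplane
            for db in (-eps, eps):            # nudge off the data points
                loss = zero_one_loss(s * w, s * b0 + db, X, y)
                if loss < best_loss:
                    best_w, best_b, best_loss = s * w, s * b0 + db, loss
    return best_w, best_b, best_loss
```

Unlike ICE, this toy version enumerates candidate boundaries rather than arrangement cells incrementally, but it returns the same globally optimal 0-1 loss on 2D inputs.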
Related papers
- Discretize Relaxed Solution of Spectral Clustering via a Non-Heuristic Algorithm [77.53604156112144]
We develop a first-order term to bridge the original problem and the discretization algorithm.
Since the non-heuristic method is aware of the original graph cut problem, the final discrete solution is more reliable.
arXiv Detail & Related papers (2023-10-19T13:57:38Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
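For context on what is being accelerated: a cutting-plane method iteratively tightens a polyhedral model of the problem with supporting hyperplanes. The sketch below is the textbook Kelley's method for a convex objective over a box, not the paper's RL-accelerated variant; the toy objective and tolerances are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def kelley(f, grad, x0, bounds, tol=1e-6, max_iter=100):
    """Kelley's cutting-plane method for minimizing a convex f over a box.

    Each iteration adds the supporting hyperplane ("cut")
    f(xk) + grad(xk) . (x - xk) <= t and re-solves the LP master
    problem min t subject to all cuts collected so far."""
    x = np.asarray(x0, dtype=float)
    d = x.size
    A_ub, b_ub = [], []
    upper = np.inf
    for _ in range(max_iter):
        fx, g = f(x), grad(x)
        upper = min(upper, fx)                 # best objective value seen
        A_ub.append(np.append(g, -1.0))        # cut: g.x - t <= g.xk - f(xk)
        b_ub.append(g @ x - fx)
        res = linprog(np.append(np.zeros(d), 1.0),
                      A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                      bounds=list(bounds) + [(None, None)])
        x, lower = res.x[:d], res.x[d]         # lower bound from the LP model
        if upper - lower < tol:
            break
    return x

# Toy usage: minimize ||x||^2 over the box [-5, 5]^2.
x_opt = kelley(lambda x: x @ x, lambda x: 2 * x,
               x0=[4.0, -3.0], bounds=[(-5, 5), (-5, 5)])
```

The RL surrogate in the paper targets the cut-generation step; in this sketch that corresponds to choosing which cut to append each round.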
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- A Data-Driven Line Search Rule for Support Recovery in High-dimensional Data Analysis [5.180648702293017]
We propose a novel and efficient data-driven line search rule to adaptively determine the appropriate step size.
Extensive comparisons with state-of-the-art algorithms on linear and logistic regression problems demonstrate the stability, effectiveness, and superiority of the proposed algorithms.
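The paper's specific data-driven rule is not reproduced in this summary; for context, the sketch below is the classical Armijo backtracking line search that adaptive rules of this kind refine, with the fixed shrink schedule (t0, beta) being the part a learned rule would replace.

```python
def armijo_step(f, x, fx, gx, direction, t0=1.0, beta=0.5, c=1e-4):
    """Classical Armijo backtracking line search: shrink the trial step
    until the sufficient-decrease condition holds. Data-driven rules
    adapt the fixed schedule (t0, beta) to the observed iterates."""
    t = t0
    while f(x + t * direction) > fx + c * t * (gx @ direction):
        t *= beta
        if t < 1e-16:              # safeguard against an infinite loop
            break
    return t
```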
arXiv Detail & Related papers (2021-11-21T12:18:18Z)
- Graph Matching via Optimal Transport [11.93151370164898]
Solving the graph matching problem is increasingly important due to its applications in operations research, computer vision, neuroscience, and more.
Current state-of-the-art algorithms are inefficient in matching very large graphs, though they produce good accuracy.
We present GOAT, a modification to the state-of-the-art graph matching approximation algorithm "FAQ" (Vogelstein, 2015), replacing its linear sum assignment step with the "Lightspeed Optimal Transport" method of Cuturi (2013).
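GOAT's substitution is concrete: inside FAQ's matching loop, the exact linear sum assignment is replaced by Cuturi's entropically regularized Sinkhorn iterations. Below is a minimal sketch of that Sinkhorn step on a cost matrix with uniform marginals; the regularization strength reg and iteration count are illustrative assumptions.

```python
import numpy as np

def sinkhorn_plan(C, reg=0.1, n_iter=200):
    """Cuturi-style Sinkhorn iterations: return a doubly stochastic
    transport plan approximating the linear assignment on cost matrix C.
    Alternating scalings enforce the uniform row/column marginals."""
    n, m = C.shape
    K = np.exp(-C / reg)                    # Gibbs kernel
    r = np.full(n, 1.0 / n)                 # uniform row marginal
    c = np.full(m, 1.0 / m)                 # uniform column marginal
    v = np.ones(m)
    for _ in range(n_iter):
        u = r / (K @ v)                     # scale rows to match r
        v = c / (K.T @ u)                   # scale columns to match c
    return u[:, None] * K * v[None, :]      # plan P = diag(u) K diag(v)
```

Each Sinkhorn iteration costs only matrix-vector products, which is the source of the speedup over exact assignment on very large graphs.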
arXiv Detail & Related papers (2021-11-09T19:18:18Z)
- PROMPT: Parallel Iterative Algorithm for $\ell_{p}$ norm linear regression via Majorization Minimization with an application to semi-supervised graph learning [0.0]
We consider the problem of $\ell_p$ norm linear regression, which has several applications in sparse recovery, data clustering, and semi-supervised learning.
We propose an iterative algorithm: Parallel IteRative AlgOrithM for $\ell_p$ norm regression via MajorizaTion Minimization (PROMPT).
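PROMPT itself is a parallel MM scheme; the sketch below shows only the classic serial IRLS instance of the same majorization-minimization idea for $\ell_p$ regression, where each quadratic majorizer reduces to a weighted least-squares solve. The smoothing constant eps is an assumption to keep the weights finite.

```python
import numpy as np

def irls_lp(A, b, p=1.5, n_iter=50, eps=1e-8):
    """Majorization-minimization for min ||Ax - b||_p^p with 1 <= p < 2:
    each step majorizes |r|^p by a quadratic, giving a weighted
    least-squares subproblem (classic IRLS)."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]   # least-squares warm start
    for _ in range(n_iter):
        r = A @ x - b
        w = np.power(np.abs(r) + eps, p - 2)   # MM weights from residuals
        Aw = A * w[:, None]                    # rows of A scaled by weights
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)  # normal equations A'WAx = A'Wb
    return x
```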
arXiv Detail & Related papers (2021-10-23T10:19:11Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm that employs a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
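HATS's central move is to let a learned model supply the heuristic for an A*-style search. The sketch below is a generic best-first A* loop with a pluggable heuristic callable standing in for the network; it is not the memory-bounded variant the paper builds on, and the interface is assumed.

```python
import heapq
import itertools

def a_star(start, is_goal, successors, heuristic):
    """Best-first (A*) search with a pluggable heuristic.

    `heuristic(node)` can be any callable -- HATS's idea is to plug in a
    trained deep network's estimate here to prune the search tree.
    `successors(node)` yields (next_node, edge_cost) pairs."""
    tie = itertools.count()                   # tie-breaker so the heap never compares nodes
    frontier = [(heuristic(start), next(tie), 0.0, start, [start])]
    best_g = {}
    while frontier:
        _, _, g, node, path = heapq.heappop(frontier)
        if is_goal(node):
            return path, g
        if best_g.get(node, float("inf")) <= g:
            continue                          # already expanded at lower cost
        best_g[node] = g
        for nxt, cost in successors(node):
            g2 = g + cost
            heapq.heappush(frontier,
                           (g2 + heuristic(nxt), next(tie), g2, nxt, path + [nxt]))
    return None, float("inf")
```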
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
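The paper's constructions are rate-optimal refinements of the standard noisy-gradient template for private convex optimization; the sketch below shows only that baseline template (clip each gradient, add Gaussian noise, step), not the paper's specific mechanisms. The clip, sigma, and seed parameters are illustrative assumptions.

```python
import numpy as np

def dp_sgd(grad, x0, n_steps, lr, clip, sigma, seed=0):
    """Noisy SGD template for private convex optimization: bound each
    gradient's norm by `clip`, then add Gaussian noise scaled by
    sigma * clip before taking a step."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        g = grad(x)
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))  # clip to norm <= clip
        g = g + rng.normal(0.0, sigma * clip, size=g.shape)   # privatizing noise
        x = x - lr * g
    return x
```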
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
- Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $\ell_0$-regularized regression problems.
We propose two classes of scalable algorithms: an exact algorithm that can handle $p \approx 50{,}000$ features in a few minutes, and approximate algorithms that can address instances with $p \approx 10^6$.
In addition, we present new estimation error bounds for a class of $\ell_0$-regularized estimators.
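For contrast with the exact MIP route, the simplest member of the approximate family is iterative hard thresholding for the cardinality-constrained form of the problem. A minimal sketch follows; the 1/L step size is a standard choice, and this is not the paper's algorithm.

```python
import numpy as np

def iht(A, b, k, n_iter=200, step=None):
    """Iterative hard thresholding for min ||Ax - b||^2 s.t. ||x||_0 <= k:
    a gradient step followed by projection onto the set of k-sparse
    vectors. Exact MIP solvers certify global optimality instead."""
    n, p = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L with L = ||A||_2^2
    x = np.zeros(p)
    for _ in range(n_iter):
        x = x + step * (A.T @ (b - A @ x))       # gradient step on the residual
        idx = np.argsort(np.abs(x))[:-k]         # indices of all but the top-k entries
        x[idx] = 0.0                             # hard threshold to k-sparse
    return x
```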
arXiv Detail & Related papers (2020-01-17T18:47:02Z)