When can you trust feature selection? -- I: A condition-based analysis
of LASSO and generalised hardness of approximation
- URL: http://arxiv.org/abs/2312.11425v1
- Date: Mon, 18 Dec 2023 18:29:01 GMT
- Title: When can you trust feature selection? -- I: A condition-based analysis
of LASSO and generalised hardness of approximation
- Authors: Alexander Bastounis, Felipe Cucker, Anders C. Hansen
- Abstract summary: We show that no (randomised) algorithm can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input.
For ill-posed inputs, the algorithm runs forever and hence never produces a wrong answer.
For any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer.
- Score: 49.1574468325115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The arrival of AI techniques in computations, with the potential for
hallucinations and non-robustness, has made the trustworthiness of algorithms a
focal point. However, the trustworthiness of many classical approaches is not
well understood. This is the case for feature selection, a classical problem in
the sciences, statistics, machine learning, etc. Here, the LASSO optimisation
problem is standard. Despite its widespread use, it has not been established
when the output of algorithms attempting to compute support sets of minimisers
of LASSO in order to do feature selection can be trusted. In this paper we
establish that no (randomised) algorithm that works on all inputs can determine
the correct support sets (with probability $> 1/2$) of minimisers of LASSO when
reading approximate input, regardless of precision and computing power.
However, we define a LASSO condition number and design an efficient algorithm
for computing these support sets provided the input data is well-posed (has
finite condition number) in time polynomial in the dimensions and logarithm of
the condition number. For ill-posed inputs the algorithm runs forever and
hence never produces a wrong answer. Furthermore, the algorithm computes an
upper bound for the condition number when this is finite. Finally, for any
algorithm defined on an open set containing a point with infinite condition
number, there is an input for which the algorithm will either run forever or
produce a wrong answer. Our impossibility results stem from generalised
hardness of approximation -- within the Solvability Complexity Index (SCI)
hierarchy framework -- that generalises the classical phenomenon of hardness of
approximation.
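For reference, the LASSO problem whose minimisers are at issue is the standard formulation (stated here in its common unconstrained form; the paper's exact normalisation may differ):

```latex
\min_{x \in \mathbb{R}^N} \; \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1,
\qquad \mathrm{supp}(x) = \{\, i : x_i \neq 0 \,\},
```

so feature selection amounts to computing $\mathrm{supp}(x^*)$ for a minimiser $x^*$. The following minimal sketch (our illustration, not the paper's algorithm) shows the instability that the LASSO condition number quantifies: near an ill-posed input, the computed support set can change under perturbations smaller than any fixed input precision. It uses scikit-learn's `Lasso`; the design matrix, regularisation level, and tolerance are illustrative assumptions.

```python
# Minimal sketch (not the paper's algorithm): near an ill-posed input,
# the support set of a LASSO minimiser can flip under perturbations far
# smaller than any fixed input precision.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, lam = 50, 10, 0.1

A = rng.standard_normal((n, p))
A[:, 1] = A[:, 0] + 1e-8 * rng.standard_normal(n)  # two near-identical features
x_true = np.zeros(p)
x_true[0] = 1.0
b = A @ x_true

def lasso_support(A, b, lam, tol=1e-6):
    # scikit-learn minimises (1/(2n))||Ax - b||_2^2 + lam * ||x||_1.
    coef = Lasso(alpha=lam, fit_intercept=False, max_iter=100_000).fit(A, b).coef_
    return set(np.flatnonzero(np.abs(coef) > tol))

A_pert = A + 1e-7 * rng.standard_normal((n, p))  # tiny input perturbation
print("support(original) :", lasso_support(A, b, lam))
print("support(perturbed):", lasso_support(A_pert, b, lam))
# When the two supports disagree, no algorithm reading only approximate
# input can certify the correct one, which is the instability behind the
# paper's impossibility result.
```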
Related papers
- Approximation Algorithms for Combinatorial Optimization with Predictions [3.7235228254732524]
We use predictions to improve over the approximation guarantees of classic algorithms.
Our algorithms produce optimal solutions when provided with perfect predictions.
Although we show our approach to be optimal for this class of problems as a whole, specific structural properties of individual problems can potentially be exploited to obtain improved bounds.
arXiv Detail & Related papers (2024-11-25T17:31:34Z) - Agnostic Learning of Arbitrary ReLU Activation under Gaussian Marginals [10.850101961203752]
We consider the problem of learning an arbitrarily-biased ReLU activation (or neuron) over Gaussian marginals with the squared loss objective.
Our main result is a polynomial-time statistical query (SQ) algorithm that gives the first constant-factor approximation for arbitrary bias.
Our algorithm presents an interesting departure from existing algorithms, which are all based on gradient descent.
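For context, the squared-loss objective referenced in this entry can be stated as follows (our paraphrase of the standard agnostic setting, not a quotation from the paper):

```latex
\min_{w \in \mathbb{R}^d,\; b \in \mathbb{R}}
\;\; \mathbb{E}_{(x,y)}\big[(\max(0,\, w \cdot x + b) - y)^2\big],
\qquad x \sim \mathcal{N}(0, I_d),
```

where the labels $y$ are arbitrary (no realisability assumption), "arbitrary bias" means $b$ is unrestricted, and a constant-factor approximation achieves loss $O(\mathrm{OPT})$, with $\mathrm{OPT}$ the loss of the best ReLU neuron.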
arXiv Detail & Related papers (2024-11-21T17:43:51Z) - Sum-of-Squares inspired Quantum Metaheuristic for Polynomial Optimization with the Hadamard Test and Approximate Amplitude Constraints [76.53316706600717]
The recently proposed quantum algorithm of arXiv:2206.14999 is based on semidefinite programming (SDP).
We generalize the SDP-inspired quantum algorithm to sum-of-squares.
Our results show that our algorithm is suitable for large problems and approximates the best known classical results.
arXiv Detail & Related papers (2024-08-14T19:04:13Z) - Leveraging modular values in quantum algorithms: the Deutsch-Jozsa algorithm [0.0]
We present a novel approach to quantum algorithms by taking advantage of modular values.
We focus on the problem of ascertaining whether a given function acting on a set of binary values is constant.
The proposed method, relying on the use of modular values, provides a high number of degrees of freedom for optimizing the new algorithm.
arXiv Detail & Related papers (2024-06-10T21:17:07Z) - Evaluating Genetic Algorithms through the Approximability Hierarchy [55.938644481736446]
In this paper, we analyze the usefulness of genetic algorithms depending on the approximation class the problem belongs to.
In particular, we use the standard approximability hierarchy, showing that genetic algorithms are especially useful for the most pessimistic classes of the hierarchy.
arXiv Detail & Related papers (2024-02-01T09:18:34Z) - Computing Star Discrepancies with Numerical Black-Box Optimization
Algorithms [56.08144272945755]
We compare 8 popular numerical black-box optimization algorithms on the $L_\infty$ star discrepancy problem.
We show that all of the tested solvers perform poorly on a large majority of the instances.
We suspect that state-of-the-art numerical black-box optimization techniques fail to capture the global structure of the problem.
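For reference, the benchmark objective here is the standard $L_\infty$ star discrepancy of a point set $P = \{p^{(1)}, \dots, p^{(n)}\} \subset [0,1]^d$ (recalled for context, not taken from the paper):

```latex
D_\infty^*(P) \;=\; \sup_{q \in [0,1]^d}
\left| \frac{\#\{\, j : p^{(j)} \in [0, q) \,\}}{n} \;-\; \prod_{i=1}^{d} q_i \right|,
```

the worst-case gap between the empirical measure of an anchored box $[0, q)$ and its volume. Even evaluating this quantity exactly is computationally hard in higher dimensions, which makes it a demanding black-box test problem.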
arXiv Detail & Related papers (2023-06-29T14:57:56Z) - Accelerated Nonnegative Tensor Completion via Integer Programming [7.3149416054553065]
We present an approach to nonnegative tensor completion based on integer programming.
We explore several variants that can maintain the same theoretical guarantees as this algorithm, but offer potentially faster computation.
arXiv Detail & Related papers (2022-11-28T21:00:25Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
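For context, the problem takes the form $\min_{x \in \mathbb{Z}^p} \|y - Ax\|_2^2$. The brute-force baseline below (our illustration, not the HATS algorithm) makes concrete the exponential search space that tree-search methods prune; the box constraint `bound` is an assumption added only to keep the enumeration finite.

```python
# Naive brute-force solver for a box-constrained integer least-squares
# problem: min over x in {-bound, ..., bound}^p of ||y - A x||^2.
# Enumerates (2*bound + 1)^p candidates, the search space that
# tree-search methods such as memory-bounded A* are designed to prune.
import itertools
import numpy as np

def ils_brute_force(A, y, bound=2):
    best_x, best_cost = None, np.inf
    for cand in itertools.product(range(-bound, bound + 1), repeat=A.shape[1]):
        x = np.asarray(cand)
        cost = float(np.sum((y - A @ x) ** 2))
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x, best_cost

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
x_true = np.array([1, -2, 0])
y = A @ x_true + 0.05 * rng.standard_normal(6)
print(ils_brute_force(A, y))  # typically recovers x_true at low noise
```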
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Computing solutions of Schr\"odinger equations on unbounded domains - On
the brink of numerical algorithms [0.0]
We demonstrate that, in general, no algorithm exists, yielding a substantial classification theory of which problems in quantum mechanics can be computed.
We show that solutions to discrete NLS equations (focusing and defocusing) on an unbounded domain can always be computed with uniform bounds on the runtime of the algorithm.
Our results have implications beyond computational quantum mechanics and are a part of the Solvability Complexity Index (SCI) hierarchy and Smale's program on the foundations of computational mathematics.
arXiv Detail & Related papers (2020-10-30T16:08:32Z) - Refined bounds for algorithm configuration: The knife-edge of dual class
approximability [94.83809668933021]
We investigate how large a training set should be to ensure that a parameter's average performance over the training set is close to its expected future performance.
We show that if this approximation holds under the $L_\infty$ norm, we can provide strong sample complexity bounds.
We empirically evaluate our bounds in the context of integer programming, one of the most powerful tools in computer science.
arXiv Detail & Related papers (2020-06-21T15:32:21Z)