Fast decision tree learning solves hard coding-theoretic problems
- URL: http://arxiv.org/abs/2409.13096v2
- Date: Wed, 25 Sep 2024 21:46:28 GMT
- Title: Fast decision tree learning solves hard coding-theoretic problems
- Authors: Caleb Koch, Carmen Strassle, Li-Yang Tan
- Abstract summary: We show that any improvement of Ehrenfeucht and Haussler's algorithm will yield $O(\log n)$-approximation algorithms for $k$-NCP.
This can be interpreted as a new avenue for designing algorithms for $k$-NCP, or as one for establishing the optimality of Ehrenfeucht and Haussler's algorithm.
- Score: 7.420043502440765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We connect the problem of properly PAC learning decision trees to the parameterized Nearest Codeword Problem ($k$-NCP). Despite significant effort by the respective communities, algorithmic progress on both problems has been stuck: the fastest known algorithm for the former runs in quasipolynomial time (Ehrenfeucht and Haussler 1989) and the best known approximation ratio for the latter is $O(n/\log n)$ (Berman and Karpinski 2002; Alon, Panigrahy, and Yekhanin 2009). Research on both problems has thus far proceeded independently with no known connections. We show that $\textit{any}$ improvement of Ehrenfeucht and Haussler's algorithm will yield $O(\log n)$-approximation algorithms for $k$-NCP, an exponential improvement of the current state of the art. This can be interpreted either as a new avenue for designing algorithms for $k$-NCP, or as one for establishing the optimality of Ehrenfeucht and Haussler's algorithm. Furthermore, our reduction along with existing inapproximability results for $k$-NCP already rule out polynomial-time algorithms for properly learning decision trees. A notable aspect of our hardness results is that they hold even in the setting of $\textit{weak}$ learning whereas prior ones were limited to the setting of strong learning.
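To make the coding-theoretic side concrete, here is a minimal brute-force sketch of the Nearest Codeword Problem over $\mathbb{F}_2$: given a generator matrix and a target vector, find the message whose codeword minimizes Hamming distance to the target. This illustrates only the problem statement, not the paper's reduction; all function names are ours.

```python
# Brute-force illustration of the Nearest Codeword Problem over F_2.
# Names are illustrative; the paper's reduction is not reproduced here.
import itertools
import numpy as np

def nearest_codeword(A: np.ndarray, y: np.ndarray):
    """Search all 2^m messages x; return the x whose codeword Ax (mod 2)
    is closest to y in Hamming distance, together with that distance."""
    n, m = A.shape
    best_x, best_dist = None, n + 1
    for bits in itertools.product((0, 1), repeat=m):
        x = np.array(bits)
        dist = int(np.sum((A @ x) % 2 != y))
        if dist < best_dist:
            best_x, best_dist = x, dist
    return best_x, best_dist

def k_ncp(A: np.ndarray, y: np.ndarray, k: int) -> bool:
    """Decision version: is some codeword within Hamming distance k of y?"""
    return nearest_codeword(A, y)[1] <= k
```

The parameterized variant $k$-NCP asks this decision question with $k$ as the parameter; approximation algorithms must compete with this minimum at scales where exhaustive search is infeasible.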
Related papers
- Perturb-and-Project: Differentially Private Similarities and Marginals [73.98880839337873]
We revisit the input perturbations framework for differential privacy where noise is added to the input $A \in \mathcal{S}$.
We first design novel efficient algorithms to privately release pair-wise cosine similarities.
We derive a novel algorithm to compute $k$-way marginal queries over $n$ features.
arXiv Detail & Related papers (2024-06-07T12:07:16Z)
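A minimal sketch of the perturb-then-answer idea described above: noise is added to the input once, and all pairwise cosine similarities are then computed from the noisy copy. The noise scale `sigma` is a placeholder; calibrating it to a formal $(\epsilon,\delta)$-DP guarantee is what the paper's analysis supplies and is omitted here.

```python
# Hedged sketch of the input-perturbation idea: perturb the data once,
# then answer similarity queries from the noisy copy. The scale `sigma`
# is a placeholder; a real mechanism must calibrate it to (eps, delta)-DP.
import numpy as np

def noisy_cosine_similarities(A: np.ndarray, sigma: float, seed: int = 0):
    rng = np.random.default_rng(seed)
    noisy = A + rng.normal(scale=sigma, size=A.shape)  # perturb the input once
    norms = np.linalg.norm(noisy, axis=1, keepdims=True)
    unit = noisy / np.clip(norms, 1e-12, None)         # normalize noisy rows
    return unit @ unit.T  # all pairwise cosine similarities of noisy rows
```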
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
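For concreteness, a sketch of the learning setup only (not the Schur-polynomial algorithm itself): labels come from $f(x)=\sum_{i=1}^{k} c_i\,\mathrm{ReLU}(\langle w_i, x\rangle)$ with standard Gaussian inputs, and a hypothesis is judged by its square loss against $f$. Dimensions and sample counts below are arbitrary placeholders.

```python
# Sketch of the learning setup only (not the Schur-polynomial algorithm):
# labels come from f(x) = sum_i c_i * ReLU(<w_i, x>) with Gaussian inputs.
import numpy as np

def sample_problem(d=10, k=3, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(k, d))          # hidden-unit weight vectors w_i
    c = rng.normal(size=k)               # combination coefficients c_i
    X = rng.normal(size=(n_samples, d))  # standard Gaussian inputs on R^d
    y = np.maximum(X @ W.T, 0.0) @ c     # labels: sum_i c_i ReLU(<w_i, x>)
    return X, y

def square_loss(y_pred, y_true):
    return float(np.mean((y_pred - y_true) ** 2))
```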
- Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning [9.594432031144716]
Allen's interval algebra is one of the most well-known calculi in qualitative temporal reasoning.
We propose a novel framework for solving NP-hard qualitative reasoning problems.
We obtain a major improvement of $O^*((\frac{cn}{\log n})^{n})$ for Allen's interval algebra.
arXiv Detail & Related papers (2023-05-25T11:45:12Z)
- Choosing the Right Algorithm With Hints From Complexity Theory [16.33500498939925]
We show that the Metropolis algorithm is clearly the best of all algorithms considered for reasonable problem sizes.
An artificial algorithm of this type having an $O(n \log n)$ runtime leads to the result that the significance-based compact genetic algorithm (sig-cGA) can solve the DLB problem in time $O(n \log n)$ with high probability.
arXiv Detail & Related papers (2021-09-14T11:12:32Z)
- Properly learning decision trees in almost polynomial time [25.763690981846125]
We give an $n^{O(\log\log n)}$-time membership query algorithm for properly and agnostically learning decision trees.
Our algorithm shares similarities with practical heuristics for learning decision trees.
We show how every decision tree can be "pruned" so that every variable in the resulting tree is influential.
arXiv Detail & Related papers (2021-09-01T22:12:47Z)
- Solving Infinite-Domain CSPs Using the Patchwork Property [17.101875753030754]
The constraint satisfaction problem (CSP) has important applications in computer science and AI.
A highly successful approach is to bound the treewidth of the underlying primal graph.
We improve this bound to $f(w)\cdot n^{O(1)}$, for CSPs whose basic relations have the patchwork property.
arXiv Detail & Related papers (2021-07-03T13:04:41Z)
- Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $\mathcal{O}(\epsilon^{-1.5})$, which outperforms all existing algorithms by an order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
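A brute-force baseline makes the integer least-squares problem above concrete: minimize $\|y - Ax\|_2$ over integer vectors $x$, here restricted to a small box so enumeration terminates. HATS replaces such exhaustive search with a learned heuristic guiding memory-bounded A*; the sketch below shows only the problem, not the method.

```python
# Brute-force baseline for integer least squares: min ||y - A x||_2 over
# integer x, restricted to a small box [lo, hi]^m for tractability.
# Illustrative only; not the HATS algorithm.
import itertools
import numpy as np

def integer_least_squares(A: np.ndarray, y: np.ndarray, lo=-2, hi=2):
    m = A.shape[1]
    best_x, best_err = None, np.inf
    for cand in itertools.product(range(lo, hi + 1), repeat=m):
        x = np.array(cand)
        err = float(np.linalg.norm(y - A @ x))
        if err < best_err:
            best_x, best_err = x, err
    return best_x, best_err
```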
- Revisiting Modified Greedy Algorithm for Monotone Submodular Maximization with a Knapsack Constraint [75.85952446237599]
We show that a modified greedy algorithm can achieve an approximation factor of $0.305$.
We derive a data-dependent upper bound on the optimum.
It can also be used to significantly improve the efficiency of such algorithms as branch and bound.
arXiv Detail & Related papers (2020-08-12T15:40:21Z)
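The modified greedy analyzed above is, in its standard form, cost-effectiveness greedy plus a comparison with the best single element. The sketch below assumes `f` is a monotone submodular set function and `costs` maps elements to positive costs; details such as tie-breaking are our own choices, not necessarily the paper's.

```python
# Hedged sketch of the modified greedy for submodular maximization under a
# knapsack constraint: grow a set by best marginal-gain-per-cost while the
# budget allows, then return the better of that set and the best singleton.
def modified_greedy(f, costs: dict, budget: float) -> set:
    S, spent = set(), 0.0
    remaining = set(costs)
    while remaining:
        # Pick the most cost-effective remaining element.
        e = max(remaining, key=lambda e: (f(S | {e}) - f(S)) / costs[e])
        remaining.discard(e)
        if spent + costs[e] <= budget and f(S | {e}) > f(S):
            S.add(e)
            spent += costs[e]
    # Compare against the best affordable single element.
    best_single = max((e for e in costs if costs[e] <= budget),
                      key=lambda e: f({e}), default=None)
    if best_single is not None and f({best_single}) > f(S):
        return {best_single}
    return S
```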
- Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds [48.312484940846]
We revisit the problem of online learning with sleeping experts/bandits.
In each time step, only a subset of the actions are available for the algorithm to choose from.
We give an algorithm that provides a no-approximate-regret guarantee for the general sleeping expert/bandit problems.
arXiv Detail & Related papers (2020-03-07T02:13:21Z)
- Optimal Sparse Decision Trees [25.043477914272046]
This work introduces the first practical algorithm for optimal decision trees for binary variables.
The algorithm is a co-design of analytical bounds that reduce the search space and modern systems techniques.
Our experiments highlight advantages in scalability, speed, and proof of optimality.
arXiv Detail & Related papers (2019-04-29T17:56:34Z)
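As a rough illustration of the bound-plus-search co-design mentioned above: optimal sparse decision tree methods of this kind minimize empirical misclassification loss plus a per-leaf penalty $\lambda$, and a partial tree whose accumulated objective already meets the best known value can be discarded. The sketch below is ours and elides the paper's actual analytical bounds.

```python
# Sketch of the regularized objective such branch-and-bound searches optimize,
# and the simplest pruning test: loss already incurred plus the leaf penalty
# is a lower bound on any completion of a partial tree.
def objective(misclassified: int, n_samples: int, n_leaves: int, lam: float) -> float:
    return misclassified / n_samples + lam * n_leaves

def can_prune(partial_misclassified: int, n_samples: int,
              n_leaves_so_far: int, lam: float, best_so_far: float) -> bool:
    lower_bound = objective(partial_misclassified, n_samples,
                            n_leaves_so_far, lam)
    return lower_bound >= best_so_far
```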