Provable guarantees for decision tree induction: the agnostic setting
- URL: http://arxiv.org/abs/2006.00743v1
- Date: Mon, 1 Jun 2020 06:44:07 GMT
- Title: Provable guarantees for decision tree induction: the agnostic setting
- Authors: Guy Blanc and Jane Lange and Li-Yang Tan
- Abstract summary: We give strengthened provable guarantees on the performance of widely employed and empirically successful {\sl top-down decision tree learning heuristics}.
We show that for all monotone functions $f$ and parameters $s \in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$.
We complement our algorithmic guarantee with a near-matching $s^{\tilde{\Omega}(\log s)}$ lower bound.
- Score: 16.784355746717562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give strengthened provable guarantees on the performance of widely
employed and empirically successful {\sl top-down decision tree learning
heuristics}. While prior works have focused on the realizable setting, we
consider the more realistic and challenging {\sl agnostic} setting. We show
that for all monotone functions~$f$ and parameters $s\in \mathbb{N}$, these
heuristics construct a decision tree of size $s^{\tilde{O}((\log
s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$,
where $\mathsf{opt}_s$ denotes the error of the optimal size-$s$ decision tree
for $f$. Previously, such a guarantee was not known to be achievable by any
algorithm, even one that is not based on top-down heuristics. We complement our
algorithmic guarantee with a near-matching $s^{\tilde{\Omega}(\log s)}$ lower
bound.
Related papers
- Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time
Guarantees [3.5509551353363644]
We give the first algorithm that maintains an approximate decision tree over an arbitrary sequence of insertions and deletions of labeled examples.
We provide a deterministic algorithm that maintains an $\epsilon$-approximate tree using $O\!\left(\frac{d\, f(n)}{n}\operatorname{poly}\frac{h}{\epsilon}\right)$ operations per update.
arXiv Detail & Related papers (2023-02-08T11:02:58Z) - Fully-Dynamic Decision Trees [3.0058005235097114]
We develop the first fully dynamic algorithm that maintains a decision tree over an arbitrary sequence of insertions and deletions of labeled examples.
For real-valued features the algorithm has an amortized running time per insertion/deletion of $O\big(\frac{d \log^3 n}{\epsilon^2}\big)$.
We show that any algorithm with similar guarantees uses amortized running time $\Omega(d)$ and space $\tilde{\Omega}(nd)$.
arXiv Detail & Related papers (2022-12-01T18:58:19Z) - Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems.
Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees.
We show our lower bound for learning decision trees can be improved to $n^{\Omega(\log s)}$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z) - Bayesian Optimistic Optimisation with Exponentially Decaying Regret [58.02542541410322]
The current practical BO algorithms have regret bounds ranging from $\mathcal{O}(\frac{\log N}{\sqrt{N}})$ to $\mathcal{O}(e^{-\sqrt{N}})$, where $N$ is the number of evaluations.
This paper explores the possibility of improving the regret bound in the noiseless setting by intertwining concepts from BO and tree-based optimistic optimisation.
We propose the BOO algorithm, a first practical approach which can achieve an exponential regret bound of order $\mathcal{O}(N^{-\sqrt{\cdots}})$
arXiv Detail & Related papers (2021-05-10T13:07:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - On Avoiding the Union Bound When Answering Multiple Differentially
Private Queries [49.453751858361265]
We give an algorithm for this task that achieves an expected $\ell_\infty$ error bound of $O(\frac{1}{\epsilon}\sqrt{k \log \frac{1}{\delta}})$.
On the other hand, the algorithm of Dagan and Kur has the remarkable advantage that the $\ell_\infty$ error bound of $O(\frac{1}{\epsilon}\sqrt{k \log \frac{1}{\delta}})$ holds not only in expectation but always.
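For context, the baseline such bounds improve on answers all $k$ queries with independently calibrated Gaussian noise and controls the $\ell_\infty$ error via a union bound over the $k$ noisy answers. A hedged sketch of that baseline (a generic mechanism, not the paper's algorithm; the sensitivity model is an assumption for illustration):

```python
# Illustrative baseline, not the paper's algorithm: answer k counting queries
# by adding Gaussian noise calibrated to the l2 sensitivity. Assumption for
# illustration: one record changes each of the k counts by at most 1, so the
# l2 sensitivity is sqrt(k).
import math
import random

def gaussian_mechanism(true_answers, eps, delta, rng=random):
    k = len(true_answers)
    # Classic noise calibration for (eps, delta)-differential privacy.
    sigma = math.sqrt(k) * math.sqrt(2.0 * math.log(1.25 / delta)) / eps
    return [a + rng.gauss(0.0, sigma) for a in true_answers]
```

A union bound over the $k$ Gaussian tails then gives $\ell_\infty$ error roughly $O(\frac{1}{\epsilon}\sqrt{k \log \frac{1}{\delta} \cdot \log k})$ with high probability; the quoted result is precisely about removing the extra $\log k$ factor, i.e. avoiding the union bound.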
arXiv Detail & Related papers (2020-12-16T17:58:45Z) - Testing and reconstruction via decision trees [19.304587350775385]
We study sublinear and local computation algorithms for decision trees, focusing on testing and reconstruction.
A tester that runs in $\mathrm{poly}(\log s, 1/\varepsilon)\cdot n\log n$ time makes $\mathrm{poly}(\log s, 1/\varepsilon)\cdot n$ queries to an unknown function.
arXiv Detail & Related papers (2020-12-16T04:18:00Z) - Universal guarantees for decision tree induction via a higher-order
splitting criterion [16.832966312395126]
Our algorithm achieves provable guarantees for all target functions $f: \{-1,1\}^n \to \{-1,1\}$ with respect to the uniform distribution.
The crux of our extension is a new splitting criterion that takes into account the correlations between $f$ and small subsets of its attributes.
Our algorithm satisfies the following guarantee: for all target functions $f : \{-1,1\}^n \to \{-1,1\}$, sizes $s \in \mathbb{N}$, and error parameters $\epsilon$, it constructs a decision
arXiv Detail & Related papers (2020-10-16T21:20:45Z) - An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $\ell$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}}\,(1+\log n)^{\ell-1}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of $M$ such that $\det(\sum_{i \in S} v_i v_i^{\top})$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ samples.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.