Related papers: On the Trade-off between the Number of Nodes and the Number of Trees in a Random Forest

On the Trade-off between the Number of Nodes and the Number of Trees in a Random Forest

URL: http://arxiv.org/abs/2312.11540v2
Date: Sat, 3 Feb 2024 11:26:49 GMT
Title: On the Trade-off between the Number of Nodes and the Number of Trees in a Random Forest
Authors: Tatsuya Akutsu, Avraham A. Melkman, Atsuhiro Takasu
Abstract summary: We show that the majority function of $n$ variables can be represented by a bag of $T$ ($ n$) decision trees each with size if $n-T$ is a constant. A related result on the $k$-out-of-$n$ functions is presented too.
Score: 8.340919965932443
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper, we focus on the prediction phase of a random forest and study the problem of representing a bag of decision trees using a smaller bag of decision trees, where we only consider binary decision problems on the binary domain and simple decision trees in which an internal node is limited to querying the Boolean value of a single variable. As a main result, we show that the majority function of $n$ variables can be represented by a bag of $T$ ($< n$) decision trees each with polynomial size if $n-T$ is a constant, where $n$ and $T$ must be odd (in order to avoid the tie break). We also show that a bag of $n$ decision trees can be represented by a bag of $T$ decision trees each with polynomial size if $n-T$ is a constant and a small classification error is allowed. A related result on the $k$-out-of-$n$ functions is presented too.

Related papers

Fully-Dynamic Approximate Decision Trees With Worst-Case Update Time Guarantees [3.5509551353363644]
We give the first algorithm that maintains an approximate decision tree over an arbitrary sequence of insertions and deletions of labeled examples. We provide a deterministic algorithm that maintains an $epsilon$-approximate tree using $O!left(fracd, f(n)n operatornamepolyfrachepsilonright)$ operations per update.
arXiv Detail & Related papers (2023-02-08T11:02:58Z)
Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems. Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees. We show our lower bound for learning decision trees can be improved to $nOmega(log s)$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z)
The Approximate Degree of DNF and CNF Formulas [95.94432031144716]
For every $delta>0,$ we construct CNF and formulas of size with approximate degree $Omega(n1-delta),$ essentially matching the trivial upper bound of $n. We show that for every $delta>0$, these models require $Omega(n1-delta)$, $Omega(n/4kk2)1-delta$, and $Omega(n/4kk2)1-delta$, respectively.
arXiv Detail & Related papers (2022-09-04T10:01:39Z)
On Computing Probabilistic Explanations for Decision Trees [4.406418914680962]
"sufficient reasons" are a kind of explanation in which given a decision tree $T$ and an instance $x$, one explains the decision $T(x)$. Our paper settles the computational complexity of $delta$-sufficient-reasons over decision trees. We identify structural restrictions of decision trees that make the problem tractable, and show how SAT solvers might be able to tackle these problems in practical settings.
arXiv Detail & Related papers (2022-06-30T21:58:31Z)
Exponential Separation between Quantum and Classical Ordered Binary Decision Diagrams, Reordering Method and Hierarchies [68.93512627479197]
We study quantum Ordered Binary Decision Diagrams($OBDD$) model. We prove lower bounds and upper bounds for OBDD with arbitrary order of input variables. We extend hierarchy for read$k$-times Ordered Binary Decision Diagrams ($k$-OBDD$) of width.
arXiv Detail & Related papers (2022-04-22T12:37:56Z)
On Learning and Testing Decision Tree [0.22843885788439797]
We give a new algorithm that achieves $phi(s,1/epsilon)=2O(log3s)/epsilon3)$. We also study the testability of depth-$d$ decision tree and give a it distribution free tester.
arXiv Detail & Related papers (2021-08-10T11:04:06Z)
Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects. We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions. Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
Convex Polytope Trees [57.56078843831244]
convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary. We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
Decision trees as partitioning machines to characterize their generalization properties [2.370481325034443]
We revisit binary decision trees on real-valued features from the perspective of partitions of the data. We show that the VC dimension of a binary tree structure with $N$ internal nodes is of order $N log(Nell)$. We elaborate a pruning algorithm based on these results that performs better than the CART algorithm on a number of datasets.
arXiv Detail & Related papers (2020-10-14T19:25:58Z)
On Distributed Differential Privacy and Counting Distinct Elements [52.701425652208734]
We study the setup where each of $n$ users holds an element from a discrete set. The goal is to count the number of distinct elements across all users.
arXiv Detail & Related papers (2020-09-21T04:13:34Z)
An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of a given order $ellsqrtbinomdell (1+log n)ell-1,$ sum to at most $cellsqrtbinomdell (1+log n)ell-1,$ where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
On $\ell_p$-norm Robustness of Ensemble Stumps and Trees [83.81523991945018]
We develop an efficient programming based algorithm for sound verification of ensemble stumps. We demonstrate the first certified defense method for training ensemble stumps and trees with respect to $ell_p$ norm perturbations.
arXiv Detail & Related papers (2020-08-20T03:42:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.