Tree Learning: Optimal Algorithms and Sample Complexity
- URL: http://arxiv.org/abs/2302.04492v1
- Date: Thu, 9 Feb 2023 08:35:17 GMT
- Title: Tree Learning: Optimal Algorithms and Sample Complexity
- Authors: Dmitrii Avdiukhin, Grigory Yaroslavtsev, Danny Vainstein, Orr Fischer,
Sauman Das, Faraz Mirza
- Abstract summary: We study the problem of learning a hierarchical tree representation of data from labeled samples taken from an arbitrary distribution.
We present optimal sample complexity bounds for this problem in several learning settings, including (agnostic) PAC learning and online learning.
- Score: 10.638365461509
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning a hierarchical tree representation of data
from labeled samples, taken from an arbitrary (and possibly adversarial)
distribution. Consider a collection of data tuples labeled according to their
hierarchical structure. The smallest number of such tuples required in order to
be able to accurately label subsequent tuples is of interest for data
collection in machine learning. We present optimal sample complexity bounds for
this problem in several learning settings, including (agnostic) PAC learning
and online learning. Our results are based on tight bounds of the Natarajan and
Littlestone dimensions of the associated problem. The corresponding tree
classifiers can be constructed efficiently in near-linear time.
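The exact tuple-labeling model is defined in the paper; as a purely illustrative sketch of the setting (and not the authors' near-linear-time algorithm), one can picture triplet samples (x, y, z) labeled with the pair whose lowest common ancestor in the hidden tree is deepest, together with a naive learner that merges clusters by how often their members were labeled as the closest pair:

```python
# Illustrative sketch only. Assumptions not taken from the paper: samples are
# triplets (x, y, z) labeled with the pair whose lowest common ancestor (LCA) is
# deepest, and the learner is a naive counting heuristic rather than the authors'
# near-linear-time construction.
from collections import Counter
from itertools import combinations

def triplet_label(lca_depth, x, y, z):
    """Label a triplet with the pair that is 'closest' in the hidden tree.
    lca_depth is a caller-supplied function giving the depth of the LCA of two points."""
    return max([(x, y), (x, z), (y, z)], key=lambda pair: lca_depth(*pair))

def naive_tree_from_triplets(points, labeled_triplets):
    """labeled_triplets: iterable of ((x, y, z), closest_pair). Greedily merge the
    two clusters whose members were most often labeled as the closest pair and
    return the resulting nested-tuple dendrogram."""
    closeness = Counter(frozenset(pair) for _, pair in labeled_triplets)
    clusters = {frozenset([p]): p for p in points}
    while len(clusters) > 1:
        def avg_closeness(a, b):
            return sum(closeness[frozenset((u, v))] for u in a for v in b) / (len(a) * len(b))
        a, b = max(combinations(clusters, 2), key=lambda ab: avg_closeness(*ab))
        clusters[a | b] = (clusters.pop(a), clusters.pop(b))
    return next(iter(clusters.values()))
```

The heuristic above says nothing about how many labeled tuples are needed; quantifying that number is the sample complexity question the paper answers.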
Related papers
- Discovering Data Structures: Nearest Neighbor Search and Beyond [18.774836778996544]
We propose a general framework for end-to-end learning of data structures.
Our framework adapts to the underlying data distribution and provides fine-grained control over query and space complexity.
We first apply this framework to the problem of nearest neighbor search.
arXiv Detail & Related papers (2024-11-05T16:50:54Z)
- Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
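As a hedged illustration of designing a decision tree learner from repeated access to data drawn from the same domain (not the authors' method; the hyperparameter grid below is an arbitrary stand-in for the tunable parameters analyzed in the paper), one can pick the configuration with the best average validation accuracy across several related datasets:

```python
# Hedged sketch: data-driven selection of decision-tree hyperparameters using
# several datasets from the same domain. The grid of criteria/depths is an
# arbitrary stand-in for the tunable parameters analyzed in the paper.
from itertools import product
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def tune_on_domain(datasets, criteria=("gini", "entropy"), depths=(2, 4, 8)):
    """datasets: list of (X, y) pairs drawn from the same domain.
    Returns the (criterion, max_depth) pair with the best average validation accuracy."""
    def avg_val_accuracy(criterion, max_depth):
        scores = []
        for X, y in datasets:
            X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
            clf = DecisionTreeClassifier(criterion=criterion, max_depth=max_depth, random_state=0)
            scores.append(clf.fit(X_tr, y_tr).score(X_va, y_va))
        return float(np.mean(scores))
    return max(product(criteria, depths), key=lambda cfg: avg_val_accuracy(*cfg))
```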
- Optimal Sample Complexity of Contrastive Learning [12.108926584457736]
We study the sample complexity of contrastive learning, i.e., the minimum number of labeled tuples sufficient for achieving high generalization accuracy.
We give tight bounds on the sample complexity in a variety of settings, focusing on arbitrary distance functions.
We show that the theoretical bounds on sample complexity obtained via VC/Natarajan can have strong predictive power for experimental results.
arXiv Detail & Related papers (2023-12-01T06:57:11Z)
- Learning bounded-degree polytrees with known skeleton [15.137372516678143]
We establish finite-sample guarantees for efficient proper learning of bounded-degree polytrees.
We provide an efficient algorithm which learns $d$-polytrees in polynomial time and sample complexity for any bounded $d$ when the underlying undirected graph is known.
arXiv Detail & Related papers (2023-10-10T06:03:51Z)
- Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z)
- Hierarchical clustering with dot products recovers hidden tree structure [53.68551192799585]
In this paper we offer a new perspective on the well-established agglomerative clustering algorithm, focusing on recovery of hierarchical structure.
We recommend a simple variant of the standard algorithm, in which clusters are merged by maximum average dot product and not, for example, by minimum distance or within-cluster variance.
We demonstrate that the tree output by this algorithm provides a bona fide estimate of generative hierarchical structure in data, under a generic probabilistic graphical model.
arXiv Detail & Related papers (2023-05-24T11:05:12Z)
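A minimal sketch of the merge rule described in this entry, agglomerative clustering where the two clusters with the highest average pairwise dot product are merged at each step (illustrative only, not the authors' implementation):

```python
# Agglomerative clustering driven by maximum average dot product between
# cluster members, as described above; not the authors' implementation.
from itertools import combinations
import numpy as np

def dot_product_agglomeration(X):
    """X: (n, d) array. Returns a nested-tuple dendrogram over row indices."""
    clusters = {frozenset([i]): i for i in range(len(X))}
    while len(clusters) > 1:
        def avg_dot(a, b):
            return np.mean([X[i] @ X[j] for i in a for j in b])
        a, b = max(combinations(clusters, 2), key=lambda ab: avg_dot(*ab))
        clusters[a | b] = (clusters.pop(a), clusters.pop(b))
    return next(iter(clusters.values()))

# Tiny example: rows 0 and 1 have the largest average dot product and merge first.
print(dot_product_agglomeration(np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])))
```

Swapping avg_dot for negative Euclidean distance or a within-cluster variance criterion recovers more familiar agglomerative variants; the entry's claim is that the dot-product rule is the one that recovers generative hierarchical structure.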
- Dive into Layers: Neural Network Capacity Bounding using Algebraic Geometry [55.57953219617467]
We show that the learnability of a neural network is directly related to its size.
We use Betti numbers to measure the topological geometric complexity of input data and the neural network.
We perform experiments on the real-world MNIST dataset, and the results verify our analysis and conclusions.
arXiv Detail & Related papers (2021-09-03T11:45:51Z)
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
- Probabilistic Label Trees for Extreme Multi-label Classification [8.347190888362194]
Problems of extreme multi-label classification (XMLC) are efficiently handled by organizing labels as a tree.
Probabilistic label trees (PLTs) can be treated as a generalization of hierarchical softmax for multi-label problems.
We introduce the model and discuss training and inference procedures and their computational costs.
We prove a specific equivalence between the fully online algorithm and an algorithm with a tree structure given in advance.
arXiv Detail & Related papers (2020-09-23T15:30:00Z)
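A hedged sketch of the core computation in a probabilistic label tree: the probability that a label is relevant for an instance is the product of node-conditional probabilities along the root-to-leaf path, and branches whose running product is already small can be pruned during inference. The node models and tree layout below are placeholders, not the paper's implementation.

```python
# Hedged PLT sketch: per-label probability = product of node-conditional
# probabilities on the root-to-leaf path; low-probability branches can be pruned.
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Node:
    prob: Callable[..., float]          # estimates P(node relevant | parent relevant, x)
    children: List["Node"] = field(default_factory=list)
    label: Optional[int] = None         # set at leaves only

def label_probabilities(root: Node, x, threshold: float = 0.0):
    """Depth-first traversal accumulating path products; skip branches whose
    running product falls below `threshold` (0.0 disables pruning)."""
    results, stack = {}, [(root, 1.0)]
    while stack:
        node, path_prob = stack.pop()
        path_prob *= node.prob(x)
        if path_prob < threshold:
            continue
        if node.label is not None:
            results[node.label] = path_prob
        stack.extend((child, path_prob) for child in node.children)
    return results

# Toy tree with two labels; each node's "classifier" is a constant for brevity.
root = Node(prob=lambda x: 1.0,
            children=[Node(prob=lambda x: 0.9, label=0), Node(prob=lambda x: 0.2, label=1)])
print(label_probabilities(root, x=None))   # label 0 gets 0.9, label 1 gets 0.2
```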
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax Layer [77.90012156266324]
This paper aims to find a subspace of neural networks that can facilitate a large decision margin.
We propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes.
Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets.
arXiv Detail & Related papers (2020-04-20T02:41:01Z)
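The summary above does not spell out the construction; as an illustrative stand-in (an assumption on our part, not necessarily the paper's exact layer), classification weight vectors can be kept mutually orthogonal throughout training by a fixed block mask that assigns each class a disjoint block of features:

```python
# Illustrative sketch of an orthogonality-by-construction classification layer:
# a fixed 0/1 block mask gives each class a disjoint feature block, so the
# effective weight vectors stay pairwise orthogonal no matter how the underlying
# weights are trained. This is a stand-in, not necessarily the paper's exact OSL.
import numpy as np

def masked_classifier_logits(x, weight, num_classes):
    """x: (d,) features; weight: (num_classes, d) trainable weights."""
    d = weight.shape[1]
    block = d // num_classes
    mask = np.zeros_like(weight)
    for c in range(num_classes):
        mask[c, c * block:(c + 1) * block] = 1.0
    w_eff = weight * mask                           # rows have disjoint supports
    gram = w_eff @ w_eff.T
    assert np.allclose(gram - np.diag(np.diag(gram)), 0.0)   # pairwise orthogonal
    return x @ w_eff.T                              # logits to feed into a softmax

rng = np.random.default_rng(0)
print(masked_classifier_logits(rng.normal(size=8), rng.normal(size=(4, 8)), 4))
```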
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.