Active-LATHE: An Active Learning Algorithm for Boosting the Error
Exponent for Learning Homogeneous Ising Trees
- URL: http://arxiv.org/abs/2110.14341v2
- Date: Thu, 28 Oct 2021 09:47:58 GMT
- Title: Active-LATHE: An Active Learning Algorithm for Boosting the Error
Exponent for Learning Homogeneous Ising Trees
- Authors: Fengzhuo Zhang, Anshoo Tandon, Vincent Y. F. Tan
- Abstract summary: We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
- Score: 75.93186954061943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Chow-Liu algorithm (IEEE Trans.~Inform.~Theory, 1968) has been a mainstay
for the learning of tree-structured graphical models from i.i.d.\ sampled data
vectors. Its theoretical properties have been well-studied and are
well-understood. In this paper, we focus on the class of trees that are
arguably even more fundamental, namely {\em homogeneous} trees in which each
pair of nodes that forms an edge has the same correlation $\rho$. We ask
whether we are able to further reduce the error probability of learning the
structure of the homogeneous tree model when {\em active learning} or {\em
active sampling of nodes or variables} is allowed. Our figure of merit is the
{\em error exponent}, which quantifies the exponential rate of decay of the
error probability with an increasing number of data samples. At first sight, an
improvement in the error exponent seems impossible, as all the edges are
statistically identical. We design and analyze an algorithm, Active Learning
Algorithm for Trees with Homogeneous Edge (Active-LATHE), which surprisingly
boosts the error exponent by at least 40\% when $\rho$ is at least $0.8$. For
all other values of $\rho$, we also observe commensurate, but more modest,
improvements in the error exponent. Our analysis hinges on judiciously
exploiting the minute but detectable statistical variation of the samples to
allocate more data to parts of the graph in which we are less confident of
being correct.
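For concreteness, the error exponent referred to above is the rate $K$ at which the probability of recovering the wrong tree decays, i.e.\ $K = \lim_{n\to\infty} -\frac{1}{n}\log \mathbb{P}(\hat{\mathcal{T}}_n \neq \mathcal{T})$ for $n$ i.i.d.\ samples. The passive baseline against which Active-LATHE is compared is the Chow-Liu algorithm. The sketch below is a minimal illustration of that baseline for $\pm 1$-valued (Ising) samples, not the authors' implementation; the function names and the use of NumPy are illustrative assumptions. It weights every pair of variables by its empirical mutual information and returns a maximum-weight spanning tree.

```python
# Minimal Chow-Liu sketch for +/-1-valued samples (illustrative, not the paper's code).
import itertools
import numpy as np

def empirical_mi(x, y):
    """Plug-in mutual information estimate for two +/-1-valued sample vectors."""
    mi = 0.0
    for a, b in itertools.product([-1, 1], repeat=2):
        p_xy = np.mean((x == a) & (y == b))
        p_x, p_y = np.mean(x == a), np.mean(y == b)
        if p_xy > 0:
            mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

def chow_liu_tree(samples):
    """samples: (n, p) array of +/-1 entries; returns the edges of a max-weight spanning tree."""
    _, p = samples.shape
    # Weight every candidate edge by its empirical mutual information.
    weights = {(i, j): empirical_mi(samples[:, i], samples[:, j])
               for i in range(p) for j in range(i + 1, p)}
    # Kruskal's algorithm with a simple union-find over the p nodes.
    parent = list(range(p))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
        if len(tree) == p - 1:
            break
    return tree
```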
Related papers
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node
Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - How Does Pseudo-Labeling Affect the Generalization Error of the
Semi-Supervised Gibbs Algorithm? [73.80001705134147]
We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm.
The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset.
arXiv Detail & Related papers (2022-10-15T04:11:56Z) - Simple Stochastic and Online Gradient Descent Algorithms for Pairwise
Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances.
Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning.
In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning (a toy OGD update for a pairwise loss is sketched after this list).
arXiv Detail & Related papers (2021-11-23T18:10:48Z) - A cautionary tale on fitting decision trees to data from additive
models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z) - Decentralized Learning of Tree-Structured Gaussian Graphical Models from
Noisy Data [1.52292571922932]
This paper studies the decentralized learning of tree-structured Gaussian graphical models (GGMs) from noisy data.
GGMs are widely used to model complex networks such as gene regulatory networks and social networks.
The proposed decentralized learning uses the Chow-Liu algorithm for estimating the tree-structured GGM.
arXiv Detail & Related papers (2021-09-22T10:41:18Z) - Chow-Liu++: Optimal Prediction-Centric Learning of Tree Ising Models [30.62276301751904]
We introduce a new algorithm that combines elements of the Chow-Liu algorithm with tree metric reconstruction methods.
Our algorithm is robust to model misspecification and adversarial corruptions.
arXiv Detail & Related papers (2021-06-07T21:09:29Z) - Outlier-Robust Learning of Ising Models Under Dobrushin's Condition [57.89518300699042]
We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted.
Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees.
arXiv Detail & Related papers (2021-02-03T18:00:57Z) - SGA: A Robust Algorithm for Partial Recovery of Tree-Structured
Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z) - Sample-Optimal and Efficient Learning of Tree Ising models [24.201827888085944]
We show that $n$-variable tree-structured Ising models can be learned computationally efficiently to within total variation distance $\epsilon$ from an optimal $O(n \ln n/\epsilon^2)$ samples.
Our guarantees do not follow from known results for the Chow-Liu algorithm.
arXiv Detail & Related papers (2020-10-28T10:17:48Z)
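As a concrete illustration of the pairwise-learning setting in the entry on simple stochastic and online gradient descent above, here is a toy OGD update: each arriving example is paired with the previously buffered one and the model takes a gradient step on a pairwise logistic loss. This is my own sketch under assumed conventions, not the algorithm from that paper; the function names, the buffer-of-one pairing rule, and the $1/\sqrt{t}$ step size are illustrative assumptions.

```python
# Toy online gradient descent for a pairwise (ranking/AUC-style) logistic loss.
import numpy as np

def pairwise_logistic_grad(w, x, y, x_prev, y_prev):
    """Gradient w.r.t. w of log(1 + exp(-(y - y_prev) * w.(x - x_prev)))."""
    diff = x - x_prev
    margin = (y - y_prev) * np.dot(w, diff)
    return -(y - y_prev) * diff / (1.0 + np.exp(margin))

def ogd_pairwise(stream, dim, step=0.1):
    """stream: iterable of (x, y) pairs; pairs each arrival with the previous example."""
    w = np.zeros(dim)
    prev = None
    for t, (x, y) in enumerate(stream, start=1):
        # Only informative pairs (different labels/targets) produce an update.
        if prev is not None and y != prev[1]:
            w -= (step / np.sqrt(t)) * pairwise_logistic_grad(w, x, y, *prev)
        prev = (x, y)
    return w
```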
This list is automatically generated from the titles and abstracts of the papers on this site.