Structure learning in polynomial time: Greedy algorithms, Bregman
information, and exponential families
- URL: http://arxiv.org/abs/2110.04719v1
- Date: Sun, 10 Oct 2021 06:37:51 GMT
- Title: Structure learning in polynomial time: Greedy algorithms, Bregman
information, and exponential families
- Authors: Goutham Rajendran, Bohdan Kivva, Ming Gao, Bryon Aragam
- Abstract summary: We study a general greedy score-based algorithm for learning DAGs.
We show how recent algorithm-time algorithms for learning DAG models are a special case of this algorithm.
This observation suggests new score functions and optimality conditions based on the duality between Bregman divergences and exponential families.
- Score: 12.936601424755944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Greedy algorithms have long been a workhorse for learning graphical models,
and more broadly for learning statistical models with sparse structure. In the
context of learning directed acyclic graphs, greedy algorithms are popular
despite their worst-case exponential runtime. In practice, however, they are
very efficient. We provide new insight into this phenomenon by studying a
general greedy score-based algorithm for learning DAGs. Unlike edge-greedy
algorithms such as the popular GES and hill-climbing algorithms, our approach
is vertex-greedy and requires at most a polynomial number of score evaluations.
We then show how recent polynomial-time algorithms for learning DAG models are
a special case of this algorithm, thereby illustrating how these order-based
algorithms can be rigourously interpreted as score-based algorithms. This
observation suggests new score functions and optimality conditions based on the
duality between Bregman divergences and exponential families, which we explore
in detail. Explicit sample and computational complexity bounds are derived.
Finally, we provide extensive experiments suggesting that this algorithm indeed
optimizes the score in a variety of settings.
Related papers
- A Differentially Private Clustering Algorithm for Well-Clustered Graphs [6.523602840064548]
We provide an efficient ($epsilon,$delta$)-DP algorithm tailored specifically for such graphs.
Our algorithm works for well-clustered graphs with $k$ nearly-balanced clusters.
arXiv Detail & Related papers (2024-03-21T11:57:16Z) - Dual Algorithmic Reasoning [9.701208207491879]
We propose to learn algorithms by exploiting duality of the underlying algorithmic problem.
We demonstrate that simultaneously learning the dual definition of these optimisation problems in algorithmic learning allows for better learning.
We then validate the real-world utility of our dual algorithmic reasoner by deploying it on a challenging brain vessel classification task.
arXiv Detail & Related papers (2023-02-09T08:46:23Z) - Differentially-Private Hierarchical Clustering with Provable
Approximation Guarantees [79.59010418610625]
We study differentially private approximation algorithms for hierarchical clustering.
We show strong lower bounds for the problem: that any $epsilon$-DP algorithm must exhibit $O(|V|2/ epsilon)$-additive error for an input dataset.
We propose a private $1+o(1)$ approximation algorithm which also recovers the blocks exactly.
arXiv Detail & Related papers (2023-01-31T19:14:30Z) - A Pragmatic Look at Deep Imitation Learning [0.3626013617212666]
We re-implement 6 different adversarial imitation learning algorithms.
We evaluate them on a widely-used expert trajectory dataset.
GAIL consistently performs well across a range of sample sizes.
arXiv Detail & Related papers (2021-08-04T06:33:10Z) - The Bayesian Learning Rule [14.141964578853262]
We show that many machine-learning algorithms are specific instances of a single algorithm called the emphBayesian learning rule
The rule, derived from Bayesian principles, yields a wide-range of algorithms from fields such as optimization, deep learning, and graphical models.
arXiv Detail & Related papers (2021-07-09T17:28:55Z) - Provably Faster Algorithms for Bilevel Optimization [54.83583213812667]
Bilevel optimization has been widely applied in many important machine learning applications.
We propose two new algorithms for bilevel optimization.
We show that both algorithms achieve the complexity of $mathcalO(epsilon-1.5)$, which outperforms all existing algorithms by the order of magnitude.
arXiv Detail & Related papers (2021-06-08T21:05:30Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem which estimates signals integer from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z) - Efficient Permutation Discovery in Causal DAGs [9.22466799504763]
We introduce an efficient algorithm for finding sparse permutations of a directed acyclic graph.
We show that our method with depth $w$ runs in $O(pw+3)$ time.
We also compare our algorithm to provably consistent causal structure learning algorithms, such as the PC algorithm, GES, and GSP, and show that our method achieves comparable performance with a shorter runtime.
arXiv Detail & Related papers (2020-11-06T21:56:41Z) - Efficient Computation of Expectations under Spanning Tree Distributions [67.71280539312536]
We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree models.
Our algorithms exploit a fundamental connection between gradients and expectations, which allows us to derive efficient algorithms.
arXiv Detail & Related papers (2020-08-29T14:58:26Z) - Run2Survive: A Decision-theoretic Approach to Algorithm Selection based
on Survival Analysis [75.64261155172856]
survival analysis (SA) naturally supports censored data and offers appropriate ways to use such data for learning distributional models of algorithm runtime.
We leverage such models as a basis of a sophisticated decision-theoretic approach to algorithm selection, which we dub Run2Survive.
In an extensive experimental study with the standard benchmark ASlib, our approach is shown to be highly competitive and in many cases even superior to state-of-the-art AS approaches.
arXiv Detail & Related papers (2020-07-06T15:20:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.