On the Parameterized Complexity of Polytree Learning
- URL: http://arxiv.org/abs/2105.09675v1
- Date: Thu, 20 May 2021 11:29:12 GMT
- Title: On the Parameterized Complexity of Polytree Learning
- Authors: Niels Grüttemeier, Christian Komusiewicz, Nils Morawietz
- Abstract summary: A fundamental task in data science is to learn a Bayesian network from observed data.
We show that \textsc{Polytree Learning} can be solved in $3^n \cdot |I|^{\mathcal{O}(1)}$ time.
- Score: 4.060731229044571
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A Bayesian network is a directed acyclic graph that represents statistical
dependencies between variables of a joint probability distribution. A
fundamental task in data science is to learn a Bayesian network from observed
data. \textsc{Polytree Learning} is the problem of learning an optimal Bayesian
network that fulfills the additional property that its underlying undirected
graph is a forest. In this work, we revisit the complexity of \textsc{Polytree
Learning}. We show that \textsc{Polytree Learning} can be solved in $3^n \cdot
|I|^{\mathcal{O}(1)}$ time where $n$ is the number of variables and $|I|$ is
the total instance size. Moreover, we consider the influence of the number of
variables $d$ that might receive a nonempty parent set in the final DAG on the
complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree
Learning} has no $f(d)\cdot |I|^{\mathcal{O}(1)}$-time algorithm, unlike
Bayesian network learning which can be solved in $2^d \cdot
|I|^{\mathcal{O}(1)}$ time. We show that, in contrast, if $d$ and the maximum
parent set size are bounded, then we can obtain efficient algorithms.
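To make the search space concrete, the following Python sketch spells out the problem definition: pick one candidate parent set per variable so that the underlying undirected graph of the resulting DAG is a forest, and maximize the sum of the local scores. The local-score table is hypothetical and the brute-force enumeration is for toy instances only; it is not the $3^n \cdot |I|^{\mathcal{O}(1)}$ algorithm established in the paper.

```python
from itertools import product


def is_forest(n, edges):
    """Union-find check: an undirected graph on n nodes is a forest
    iff no edge ever joins two already-connected nodes."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


def best_polytree(n, local_scores):
    """Exhaustive search for an optimal polytree.

    local_scores[v] maps candidate parent sets (frozensets of variables)
    to their local score; the network score is the sum of the chosen
    local scores. Brute force over all parent-set combinations -- for
    illustration only, not the paper's 3^n * |I|^O(1) algorithm.
    """
    best_val, best_parents = float("-inf"), None
    for choice in product(*(local_scores[v].items() for v in range(n))):
        parent_sets = [ps for ps, _ in choice]
        value = sum(score for _, score in choice)
        # directed edges p -> v; requiring the undirected version to be a
        # forest also rules out directed cycles (a 2-cycle would add the
        # same undirected edge twice, a longer cycle an undirected cycle)
        edges = [(p, v) for v in range(n) for p in parent_sets[v]]
        if is_forest(n, edges) and value > best_val:
            best_val, best_parents = value, parent_sets
    return best_val, best_parents


# hypothetical toy instance with three variables
scores = {
    0: {frozenset(): 0.0},
    1: {frozenset(): 0.0, frozenset({0}): 2.5},
    2: {frozenset(): 0.0, frozenset({1}): 2.0, frozenset({0, 1}): 3.0},
}
print(best_polytree(3, scores))  # optimal value 4.5: parents {0} for var 1, {1} for var 2
```

In this toy instance the two individually best parent sets, $\{0\}$ for variable 1 and $\{0,1\}$ for variable 2, would create a triangle in the underlying undirected graph, so the optimal polytree settles for $\{0\}$ and $\{1\}$ with total score 4.5.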
Related papers
- Learning Networks from Wide-Sense Stationary Stochastic Processes [7.59499154221528]
A key inference problem here is to learn edge connectivity from node outputs (potentials).
We use Whittle's maximum likelihood estimator (MLE) to learn the support of $L^\ast$ from temporally correlated samples.
We show that the MLE problem is strictly convex, admitting a unique solution.
arXiv Detail & Related papers (2024-12-04T23:14:00Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Most Neural Networks Are Almost Learnable [52.40331776572531]
We show that for any fixed $\epsilon>0$ and depth $i$, there is a poly-time algorithm that learns random Xavier networks of depth $i$.
The algorithm runs in time and sample complexity of $(\bar{d})^{\mathrm{poly}(\epsilon^{-1})}$, where $\bar{d}$ is the size of the network.
For some cases of sigmoid and ReLU-like activations the bound can be improved to $(\bar{d})^{\mathrm{polylog}(\epsilon^{-1})}$.
arXiv Detail & Related papers (2023-05-25T22:27:42Z) - Learning and Covering Sums of Independent Random Variables with
Unbounded Support [4.458210211781738]
We study the problem of covering and learning sums $X = X_1 + \cdots + X_n$ of independent integer-valued random variables $X_i$ with unbounded, or even infinite, support.
We show that the maximum value of the collective support of $X_i$'s necessarily appears in the sample complexity of learning $X$.
arXiv Detail & Related papers (2022-10-24T15:03:55Z) - Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems.
Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees.
We show that our lower bound for learning decision trees can be improved to $n^{\Omega(\log s)}$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Active-LATHE: An Active Learning Algorithm for Boosting the Error
Exponent for Learning Homogeneous Ising Trees [75.93186954061943]
We design and analyze an algorithm that boosts the error exponent by at least 40% when $\rho$ is at least $0.8$.
Our analysis hinges on judiciously exploiting the minute but detectable statistical variation of the samples to allocate more data to parts of the graph.
arXiv Detail & Related papers (2021-10-27T10:45:21Z) - Testing network correlation efficiently via counting trees [19.165942326142538]
We propose a new procedure for testing whether two networks are edge-correlated through some latent correspondence.
The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees.
arXiv Detail & Related papers (2021-10-22T14:47:20Z) - Small Covers for Near-Zero Sets of Polynomials and Learning Latent
Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$.
Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z) - An Algorithm for Learning Smaller Representations of Models With Scarce
Data [0.0]
We present a greedy algorithm for solving binary classification problems in situations where the dataset is too small or not fully representative.
It relies on a trained model with loose accuracy constraints, an iterative hyperparameter pruning procedure, and a function used to generate new data.
arXiv Detail & Related papers (2020-10-15T19:17:51Z) - Average-case Complexity of Teaching Convex Polytopes via Halfspace
Queries [55.28642461328172]
We show that the average-case teaching complexity is $\Theta(d)$, which is in sharp contrast to the worst-case teaching complexity of $\Theta(n)$.
Our insights allow us to establish a tight bound on the average-case complexity for $\phi$-separable dichotomies.
arXiv Detail & Related papers (2020-06-25T19:59:24Z) - Learning Bayesian Networks Under Sparsity Constraints: A Parameterized
Complexity Analysis [7.99536002595393]
We study the problem of learning the structure of an optimal Bayesian network when additional constraints are posed on the network or on its moralized graph.
We show that learning an optimal network with at most $k$ edges in the moralized graph presumably has no $f(k)\cdot |I|^{\mathcal{O}(1)}$-time algorithm.
arXiv Detail & Related papers (2020-04-30T12:31:13Z)