Hierarchical Infinite Relational Model
- URL: http://arxiv.org/abs/2108.07208v1
- Date: Mon, 16 Aug 2021 16:32:13 GMT
- Title: Hierarchical Infinite Relational Model
- Authors: Feras A. Saad, Vikash K. Mansinghka
- Abstract summary: The hierarchical infinite relational model (HIRM) is a new probabilistic generative model for noisy, sparse, and heterogeneous relational data.
We present new algorithms for fully Bayesian posterior inference via Gibbs sampling.
- Score: 3.731168012111833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes the hierarchical infinite relational model (HIRM), a new
probabilistic generative model for noisy, sparse, and heterogeneous relational
data. Given a set of relations defined over a collection of domains, the model
first infers multiple non-overlapping clusters of relations using a top-level
Chinese restaurant process. Within each cluster of relations, a Dirichlet
process mixture is then used to partition the domain entities and model the
probability distribution of relation values. The HIRM generalizes the standard
infinite relational model and can be used for a variety of data analysis tasks
including dependence detection, clustering, and density estimation. We present
new algorithms for fully Bayesian posterior inference via Gibbs sampling. We
illustrate the efficacy of the method on a density estimation benchmark of
twenty object-attribute datasets with up to 18 million cells and use it to
discover relational structure in real-world datasets from politics and
genomics.
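To make the two-level structure in the abstract concrete, the following is a minimal sketch of the generative process for binary relations, written in Python/NumPy: a top-level Chinese restaurant process partitions the relations, each relation cluster receives its own CRP partitions of the participating domains, and cell values are drawn from per-block Bernoulli weights with a Beta prior. The function names, hyperparameters, and the fixed Beta/Bernoulli likelihood are illustrative assumptions, not the paper's implementation, which uses Dirichlet process mixtures to model more general relation values.

```python
# Minimal sketch of an HIRM-style generative process (binary relations only).
# Priors and names are illustrative assumptions, not the paper's interface.
import numpy as np

rng = np.random.default_rng(0)

def crp_partition(n, alpha):
    """Sample a partition of n items from a Chinese restaurant process."""
    assignment, counts = [], []
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(0)  # seat the item at a new table
        counts[k] += 1
        assignment.append(k)
    return assignment

def sample_hirm(relations, domain_sizes, alpha=1.0, a=1.0, b=1.0):
    """relations: dict mapping relation name -> tuple of domain names."""
    names = list(relations)
    # Top level: a CRP partitions the relations into non-overlapping clusters.
    rel_cluster = dict(zip(names, crp_partition(len(names), alpha)))
    data = {}
    for c in set(rel_cluster.values()):
        members = [r for r in names if rel_cluster[r] == c]
        # Within a cluster, each participating domain gets its own CRP
        # partition of entities (the infinite-relational-model component).
        doms = {d for r in members for d in relations[r]}
        ent_cluster = {d: crp_partition(domain_sizes[d], alpha) for d in doms}
        for r in members:
            dims = relations[r]
            table = np.empty([domain_sizes[d] for d in dims], dtype=int)
            theta = {}  # one Bernoulli weight per block of entity clusters
            for idx in np.ndindex(*table.shape):
                block = tuple(ent_cluster[d][i] for d, i in zip(dims, idx))
                if block not in theta:
                    theta[block] = rng.beta(a, b)
                table[idx] = rng.binomial(1, theta[block])
            data[r] = table
    return rel_cluster, data

# Example: two relations over two domains.
rel_cluster, data = sample_hirm(
    {"likes": ("person", "movie"), "knows": ("person", "person")},
    {"person": 5, "movie": 4},
)
print(rel_cluster)
print(data["likes"])
```

Posterior inference inverts this process: the Gibbs sampler described in the paper resamples both the assignment of relations to clusters and the per-domain entity partitions. A generic sketch of a single partition move of this kind appears after the related-papers list below.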
Related papers
- Empirical Density Estimation based on Spline Quasi-Interpolation with applications to Copulas clustering modeling [0.0]
Density estimation is a fundamental technique employed in various fields to model and to understand the underlying distribution of data.
In this paper we propose the mono-variate approximation of the density using quasi-interpolation.
The presented algorithm is validated on artificial and real datasets.
arXiv Detail & Related papers (2024-02-18T11:49:38Z)
- Optimal Heterogeneous Collaborative Linear Regression and Contextual Bandits [34.121889149071684]
We study collaborative linear regression and contextual bandits, where each instance's associated parameters are equal to a global parameter plus a sparse instance-specific term.
We propose a novel two-stage estimator called MOLAR that leverages this structure by first constructing an entry-wise median of the instances' linear regression estimates, and then shrinking the instance-specific estimates towards the median.
We then apply MOLAR to develop methods for sparsely heterogeneous collaborative contextual bandits, which lead to improved regret guarantees compared to independent bandit methods.
arXiv Detail & Related papers (2023-06-09T22:48:13Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to logistic regression, can be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Scalable Bayesian Network Structure Learning with Splines [2.741266294612776]
A Bayesian Network (BN) is a probabilistic graphical model whose structure is a directed acyclic graph (DAG).
We present a novel approach capable of learning the global DAG structure of a BN and modelling linear and non-linear local relationships between variables.
arXiv Detail & Related papers (2021-10-27T17:54:53Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Clustering-based Unsupervised Generative Relation Extraction [3.342376225738321]
We propose a Clustering-based Unsupervised generative Relation Extraction framework (CURE).
We use an "Encoder-Decoder" architecture to perform self-supervised learning so the encoder can extract relation information.
Our model performs better than state-of-the-art models on both New York Times (NYT) and United Nations Parallel Corpus (UNPC) standard datasets.
arXiv Detail & Related papers (2020-09-26T20:36:40Z)
- Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
- Bayesian Sparse Factor Analysis with Kernelized Observations [67.60224656603823]
Multi-view problems can be addressed with latent variable models.
High-dimensionality and non-linear issues are traditionally handled by kernel methods.
We propose merging both approaches into a single model.
arXiv Detail & Related papers (2020-06-01T14:25:38Z)
- Automated extraction of mutual independence patterns using Bayesian comparison of partition models [7.6146285961466]
Mutual independence is a key concept in statistics that characterizes the structural relationships between variables.
Existing methods to investigate mutual independence rely on the definition of two competing models.
We propose a general Markov chain Monte Carlo (MCMC) algorithm to numerically approximate the posterior distribution on the space of all patterns of mutual independence (see the sketch after this list).
arXiv Detail & Related papers (2020-01-15T16:21:48Z)
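Both the HIRM abstract above (Gibbs sampling over relation clusters and entity partitions) and the last related entry (MCMC over patterns of mutual independence) rest on sampling moves over partitions. The following is a generic, minimal sketch of one collapsed-Gibbs reassignment under a CRP prior; the `log_marginal` callback and the toy Beta-Bernoulli score are placeholder assumptions, not the algorithm from either paper.

```python
# Generic sketch of one collapsed-Gibbs move over a CRP-distributed partition.
# The model-specific likelihood is passed in as a `log_marginal` callback.
import math
import random

def gibbs_reassign(item, assignment, alpha, log_marginal):
    """Reassign `item` to an existing or new block, with probability
    proportional to the CRP prior term times the change in the block's
    marginal likelihood when `item` is added."""
    assignment.pop(item)
    blocks = {}
    for other, b in assignment.items():
        blocks.setdefault(b, []).append(other)
    new_block = max(blocks, default=-1) + 1
    options, log_weights = [], []
    for b, members in blocks.items():
        options.append(b)
        log_weights.append(math.log(len(members))
                           + log_marginal(members + [item])
                           - log_marginal(members))
    options.append(new_block)  # open a new block with CRP mass alpha
    log_weights.append(math.log(alpha) + log_marginal([item]))
    m = max(log_weights)
    weights = [math.exp(w - m) for w in log_weights]  # normalize in log space
    assignment[item] = random.choices(options, weights=weights, k=1)[0]
    return assignment

# Toy usage: three binary observations scored with a Beta(1,1)-Bernoulli marginal.
obs = {"a": 1, "b": 1, "c": 0}

def beta_bernoulli_log_marginal(items):
    ones = sum(obs[i] for i in items)
    n = len(items)
    return (math.lgamma(1 + ones) + math.lgamma(1 + n - ones)
            + math.lgamma(2) - math.lgamma(2 + n))

assignment = {"a": 0, "b": 0, "c": 0}
print(gibbs_reassign("c", assignment, alpha=1.0,
                     log_marginal=beta_bernoulli_log_marginal))
```

Sweeping such moves over all items, and in the HIRM case over relations as well as the entities in each domain, yields a Markov chain whose stationary distribution is the posterior over partitions.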
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.