BELIEF in Dependence: Leveraging Atomic Linearity in Data Bits for
Rethinking Generalized Linear Models
- URL: http://arxiv.org/abs/2210.10852v2
- Date: Mon, 4 Dec 2023 13:33:52 GMT
- Title: BELIEF in Dependence: Leveraging Atomic Linearity in Data Bits for
Rethinking Generalized Linear Models
- Authors: Benjamin Brown, Kai Zhang, Xiao-Li Meng
- Abstract summary: We develop a framework called binary expansion linear effect (BELIEF) for understanding arbitrary relationships with a binary outcome.
Models from the BELIEF framework are easily interpretable because they describe the association of binary variables in the language of linear models.
- Score: 6.435660232678891
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two linearly uncorrelated binary variables must be also independent because
non-linear dependence cannot manifest with only two possible states. This
inherent linearity is the atom of dependency constituting any complex form of
relationship. Inspired by this observation, we develop a framework called
binary expansion linear effect (BELIEF) for understanding arbitrary
relationships with a binary outcome. Models from the BELIEF framework are
easily interpretable because they describe the association of binary variables
in the language of linear models, yielding convenient theoretical insight and
striking Gaussian parallels. With BELIEF, one may study generalized linear
models (GLM) through transparent linear models, providing insight into how the
choice of link affects modeling. For example, setting a GLM interaction
coefficient to zero does not necessarily lead to the kind of no-interaction
model assumption as understood under their linear model counterparts.
Furthermore, for a binary response, maximum likelihood estimation for GLMs
paradoxically fails under complete separation, when the data are most
discriminative, whereas BELIEF estimation automatically reveals the perfect
predictor in the data that is responsible for complete separation. We explore
these phenomena and provide related theoretical results. We also provide
preliminary empirical demonstration of some theoretical results.
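The opening claim of the abstract can be checked numerically. The following is a minimal sketch, not code from the paper: the helper names `binary_joint`, `covariance`, and `is_independent` are hypothetical and chosen only for illustration. It builds the joint distribution of two {0, 1}-valued variables from their marginals and covariance, confirms that zero covariance forces the joint to factorize, and contrasts this with a three-state variable where zero covariance does not imply independence.

```python
def binary_joint(p, q, cov):
    """Joint pmf of binary X, Y in {0, 1} with P(X=1)=p, P(Y=1)=q, Cov(X, Y)=cov.

    For binary variables the three numbers (p, q, cov) determine the whole
    joint distribution, because E[XY] = P(X=1, Y=1) = p*q + cov.
    (Hypothetical helper, for illustration only.)
    """
    p11 = p * q + cov
    p10 = p - p11
    p01 = q - p11
    p00 = 1.0 - p - q + p11
    return {(0, 0): p00, (0, 1): p01, (1, 0): p10, (1, 1): p11}


def covariance(joint):
    """Cov(X, Y) for a joint pmf given as {(x, y): probability}."""
    ex = sum(x * pr for (x, _), pr in joint.items())
    ey = sum(y * pr for (_, y), pr in joint.items())
    exy = sum(x * y * pr for (x, y), pr in joint.items())
    return exy - ex * ey


def is_independent(joint, tol=1e-12):
    """True if every cell equals the product of its marginals (within tol)."""
    px, py = {}, {}
    for (x, y), pr in joint.items():
        px[x] = px.get(x, 0.0) + pr
        py[y] = py.get(y, 0.0) + pr
    return all(
        abs(joint.get((x, y), 0.0) - px[x] * py[y]) <= tol
        for x in px
        for y in py
    )


# Binary case: zero covariance leaves no room for dependence.
uncorrelated_binary = binary_joint(0.3, 0.6, 0.0)

# Binary case with nonzero covariance: dependent, as expected.
correlated_binary = binary_joint(0.3, 0.6, 0.05)

# Three-state contrast: X uniform on {-1, 0, 1} and Y = X**2 are
# uncorrelated, yet Y is a deterministic function of X.
three_state = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

if __name__ == "__main__":
    for name, joint in [
        ("uncorrelated binary", uncorrelated_binary),
        ("correlated binary", correlated_binary),
        ("three-state, Y = X^2", three_state),
    ]:
        print(name, "cov =", round(covariance(joint), 12),
              "independent =", is_independent(joint))
```

The three-state example is included only as a contrast: once a variable has more than two states, a non-linear relationship such as Y = X^2 can hide from the covariance, which is the situation the abstract says cannot arise for a pair of binary variables.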
Related papers
- Graph-based Unsupervised Disentangled Representation Learning via Multimodal Large Language Models [42.17166746027585]
We introduce a bidirectional weighted graph-based framework to learn factorized attributes and their interrelations within complex data.
Specifically, we propose a $\beta$-VAE based module to extract factors as the initial nodes of the graph.
By integrating these complementary modules, our model successfully achieves fine-grained, practical and unsupervised disentanglement.
arXiv Detail & Related papers (2024-07-26T15:32:21Z)
- Weight-based Decomposition: A Case for Bilinear MLPs [0.0]
Gated Linear Units (GLUs) have become a common building block in modern foundation models.
Bilinear layers drop the non-linearity in the "gate" but still have comparable performance to other GLUs.
We develop a method to decompose the bilinear tensor into a set of interacting eigenvectors.
arXiv Detail & Related papers (2024-06-06T10:46:51Z)
- On the Origins of Linear Representations in Large Language Models [51.88404605700344]
We introduce a simple latent variable model to formalize the concept dynamics of the next token prediction.
Experiments show that linear representations emerge when learning from data matching the latent variable model.
We additionally confirm some predictions of the theory using the LLaMA-2 large language model.
arXiv Detail & Related papers (2024-03-06T17:17:36Z)
- SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling [3.988614978933934]
Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions using observational data.
Path models, Structural Equation Models (SEMs) and Directed Acyclic Graphs (DAGs) provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon.
We propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles.
arXiv Detail & Related papers (2023-08-08T16:04:42Z)
- Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
- Hessian Eigenspectra of More Realistic Nonlinear Models [73.31363313577941]
We make a precise characterization of the Hessian eigenspectra for a broad family of nonlinear models.
Our analysis takes a step forward to identify the origin of many striking features observed in more complex machine learning models.
arXiv Detail & Related papers (2021-03-02T06:59:52Z)
- Disentangling Observed Causal Effects from Latent Confounders using Method of Moments [67.27068846108047]
We provide guarantees on identifiability and learnability under mild assumptions.
We develop efficient algorithms based on coupled tensor decomposition with linear constraints to obtain scalable and guaranteed solutions.
arXiv Detail & Related papers (2021-01-17T07:48:45Z)
- LowFER: Low-rank Bilinear Pooling for Link Prediction [4.110108749051657]
We propose a factorized bilinear pooling model, commonly used in multi-modal learning, for better fusion of entities and relations.
Our model naturally generalizes the Tucker decomposition based TuckER model, which has been shown to generalize other models.
We evaluate on real-world datasets, reaching on par or state-of-the-art performance.
arXiv Detail & Related papers (2020-08-25T07:33:52Z)
- Non-parametric Models for Non-negative Functions [48.7576911714538]
We provide the first model for non-negative functions that benefits from the same good properties as linear models.
We prove that it admits a representer theorem and provide an efficient dual formulation for convex problems.
arXiv Detail & Related papers (2020-07-08T07:17:28Z)
- ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA [11.919315372249802]
We consider the identifiability theory of probabilistic models.
We show that our model can be used for the estimation of the components in the framework of Independently Modulated Component Analysis.
arXiv Detail & Related papers (2020-02-26T14:43:30Z)
- Learning Bijective Feature Maps for Linear ICA [73.85904548374575]
We show that existing probabilistic deep generative models (DGMs), which are tailor-made for image data, underperform on non-linear ICA tasks.
To address this, we propose a DGM which combines bijective feature maps with a linear ICA model to learn interpretable latent structures for high-dimensional data.
We create models that converge quickly, are easy to train, and achieve better unsupervised latent factor discovery than flow-based models, linear ICA, and Variational Autoencoders on images.
arXiv Detail & Related papers (2020-02-18T17:58:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.