Joints in Random Forests
- URL: http://arxiv.org/abs/2006.14937v3
- Date: Thu, 19 Nov 2020 16:15:10 GMT
- Title: Joints in Random Forests
- Authors: Alvaro H. C. Correia, Robert Peharz, Cassio de Campos
- Abstract summary: Decision Trees (DTs) and Random Forests (RFs) are powerful discriminative learners and tools of central importance to the everyday machine learning practitioner and data scientist.
We show that DTs and RFs can naturally be interpreted as generative models, by drawing a connection to Probabilistic Circuits.
This reinterpretation equips them with a full joint distribution over the feature space and leads to Generative Decision Trees (GeDTs) and Generative Forests (GeFs).
- Score: 13.096855747795303
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Decision Trees (DTs) and Random Forests (RFs) are powerful discriminative
learners and tools of central importance to the everyday machine learning
practitioner and data scientist. Due to their discriminative nature, however,
they lack principled methods to process inputs with missing features or to
detect outliers, which requires pairing them with imputation techniques or a
separate generative model. In this paper, we demonstrate that DTs and RFs can
naturally be interpreted as generative models, by drawing a connection to
Probabilistic Circuits, a prominent class of tractable probabilistic models.
This reinterpretation equips them with a full joint distribution over the
feature space and leads to Generative Decision Trees (GeDTs) and Generative
Forests (GeFs), a family of novel hybrid generative-discriminative models. This
family of models retains the overall characteristics of DTs and RFs while
additionally being able to handle missing features by means of marginalisation.
Under certain assumptions, frequently made for Bayes consistency results, we
show that consistency of GeDTs and GeFs extends to any pattern of missing input
features, if missing at random. Empirically, we show that our models often
outperform common routines to treat missing data, such as K-nearest neighbour
imputation, and moreover, that our models can naturally detect outliers by
monitoring the marginal probability of input features.
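The two mechanisms the abstract describes, handling missing features by marginalisation and flagging outliers via the marginal probability of the inputs, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a single hypothetical "generative leaf" per class, with each feature modelled by an independent univariate Gaussian, so marginalising a missing feature simply drops its factor from the product.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def leaf_density(x, means, variances):
    """Joint density at a leaf; entries of x that are None are marginalised out."""
    p = 1.0
    for xi, m, v in zip(x, means, variances):
        if xi is not None:  # None marks a missing feature: drop its factor
            p *= gaussian_pdf(xi, m, v)
    return p

# Hypothetical class-conditional leaves p(x | class) and class priors
leaf_params = {
    0: ([0.0, 0.0], [1.0, 1.0]),
    1: ([3.0, 3.0], [1.0, 1.0]),
}
priors = {0: 0.5, 1: 0.5}

def classify(x):
    """Bayes rule over leaves; works for any pattern of missing features."""
    scores = {c: priors[c] * leaf_density(x, *leaf_params[c]) for c in priors}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def marginal_log_prob(x):
    """Marginal probability of the inputs: unusually low values flag outliers."""
    return math.log(sum(priors[c] * leaf_density(x, *leaf_params[c]) for c in priors))

print(classify([2.9, None]))  # second feature missing: marginalised, not imputed
print(marginal_log_prob([0.0, 0.0]), marginal_log_prob([10.0, 10.0]))
```

Note that no imputation step is needed: a missing value never enters the computation, which is the key contrast with KNN-style imputation pipelines mentioned in the abstract.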
Related papers
- Gaussian Mixture Models for Affordance Learning using Bayesian Networks [50.18477618198277]
Affordances are fundamental descriptors of relationships between actions, objects and effects.
This paper approaches the problem of an embodied agent exploring the world and learning these affordances autonomously from its sensory experiences.
arXiv Detail & Related papers (2024-02-08T22:05:45Z)
- Accurate generation of stochastic dynamics based on multi-model Generative Adversarial Networks [0.0]
Generative Adversarial Networks (GANs) have shown immense potential in fields such as text and image generation.
Here we quantitatively test this approach by applying it to a prototypical process on a lattice.
Importantly, the discreteness of the model is retained despite the noise.
arXiv Detail & Related papers (2023-05-25T10:41:02Z)
- Bayesian Networks for the robust and unbiased prediction of depression and its symptoms utilizing speech and multimodal data [65.28160163774274]
We apply a Bayesian framework to capture the relationships between depression, depression symptoms, and features derived from speech, facial expression and cognitive game data collected at thymia.
arXiv Detail & Related papers (2022-11-09T14:48:13Z)
- Relational Neural Markov Random Fields [29.43155380361715]
We introduce Relational Neural Markov Random Fields (RN-MRFs), which allow handling of complex hybrid domains.
We propose a maximum pseudolikelihood estimation-based learning algorithm with importance sampling for training the potential parameters.
arXiv Detail & Related papers (2021-10-18T22:52:54Z)
- Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that the resulting Gaussian Mixture Replay (GMR) approach achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
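The pseudo-rehearsal idea described above can be sketched briefly: samples drawn from a generative model of previously seen classes are mixed with data from the new task before retraining. The GMM parameters and shapes below are made up for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GMM summarising previously learned classes
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [4.0, 4.0]])

def sample_gmm(n):
    """Draw n pseudo-rehearsal samples: pick a component, add unit-variance noise."""
    comps = rng.choice(len(weights), size=n, p=weights)
    return means[comps] + rng.standard_normal((n, 2))

replay = sample_gmm(100)                                  # stand-ins for old data
new_task = rng.standard_normal((50, 2)) + np.array([8.0, 0.0])
train_set = np.vstack([replay, new_task])                 # train the next model on the union
```

Because the generator is the same probabilistic model used for classification, no raw data from earlier tasks needs to be stored.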
arXiv Detail & Related papers (2021-04-19T12:26:26Z)
- Dense open-set recognition with synthetic outliers generated by Real NVP [1.278093617645299]
We consider an outlier detection approach based on discriminative training with jointly learned synthetic outliers.
We show that this approach can be adapted for simultaneous semantic segmentation and dense outlier detection.
Our models perform competitively with respect to the state of the art despite producing predictions with only one forward pass.
arXiv Detail & Related papers (2020-11-22T19:40:26Z)
- Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Expert concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z)
- Open Set Recognition with Conditional Probabilistic Generative Models [51.40872765917125]
We propose Conditional Probabilistic Generative Models (CPGM) for open set recognition.
CPGM can not only detect unknown samples but also classify known classes by forcing different latent features to approximate conditional Gaussian distributions.
Experiment results on multiple benchmark datasets reveal that the proposed method significantly outperforms the baselines.
arXiv Detail & Related papers (2020-08-12T06:23:49Z)
- Towards Robust Classification with Deep Generative Forests [13.096855747795303]
Decision Trees and Random Forests are among the most widely used machine learning models.
Being primarily discriminative models, they lack principled methods to manipulate the uncertainty of predictions.
We exploit Generative Forests (GeFs) to extend Random Forests to generative models representing the full joint distribution over the feature space.
arXiv Detail & Related papers (2020-07-11T08:57:52Z)
- GANs with Conditional Independence Graphs: On Subadditivity of Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.