Interpretable transformed ANOVA approximation on the example of the
prevention of forest fires
- URL: http://arxiv.org/abs/2110.07353v1
- Date: Thu, 14 Oct 2021 13:39:05 GMT
- Title: Interpretable transformed ANOVA approximation on the example of the
prevention of forest fires
- Authors: Daniel Potts and Michael Schmischke
- Abstract summary: In this paper, we apply transformation ideas in order to design a complete orthonormal system in the $\mathrm{L}_2$ space of functions.
We are able to apply the explainable ANOVA approximation for this basis and use Z-score transformed data in the method.
We demonstrate the applicability of this procedure on the well-known forest fires data set from the UCI machine learning repository.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The distribution of data points is a key component in machine learning. In
most cases, one uses min-max normalization to obtain nodes in $[0,1]$ or
Z-score normalization for standard normally distributed data. In this paper, we
apply transformation ideas in order to design a complete orthonormal system in
the $\mathrm{L}_2$ space of functions with the standard normal distribution as
integration weight. Subsequently, we are able to apply the explainable ANOVA
approximation for this basis and use Z-score transformed data in the method. We
demonstrate the applicability of this procedure on the well-known forest fires
data set from the UCI machine learning repository. The attribute ranking
obtained from the ANOVA approximation provides us with crucial information
about which variables in the data set are the most important for the detection
of fires.
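To make the pipeline in the abstract concrete, the following is a minimal sketch (not the authors' implementation): Z-score normalize the attributes, expand each variable in an orthonormal basis of the $\mathrm{L}_2$ space with standard normal integration weight, fit a truncated first-order ANOVA-style model by least squares, and rank the attributes by the variance their terms explain. The normalized probabilists' Hermite polynomials used here are one classical orthonormal system for this weight; the paper constructs its system via transformation ideas, so this basis, the first-order truncation, the synthetic stand-in data, and the ranking rule are all illustrative assumptions.

```python
# Minimal sketch only: Z-score data, Hermite-function features (orthonormal
# w.r.t. the standard normal weight), least-squares fit, ANOVA-style ranking.
# Basis choice, truncation degree, and toy data are illustrative assumptions,
# not taken from the paper.
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermevander

def zscore(X):
    """Z-score normalization: zero mean and unit variance per column."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def hermite_features(X, degree):
    """Per column, evaluate He_k / sqrt(k!) for k = 1..degree, i.e. the
    probabilists' Hermite polynomials normalized to be orthonormal in
    L2 with the standard normal density as integration weight."""
    norms = np.sqrt([factorial(k) for k in range(degree + 1)])
    blocks = []
    for j in range(X.shape[1]):
        V = hermevander(X[:, j], degree) / norms  # columns He_0 .. He_degree
        blocks.append(V[:, 1:])                   # drop the constant term
    return np.hstack(blocks)                      # first-order ANOVA terms only

# Toy stand-in data (517 samples, 4 features), not the forest fires set.
rng = np.random.default_rng(0)
X = rng.normal(size=(517, 4))
y = 2.0 * X[:, 0] + 0.5 * X[:, 2] ** 2 + 0.1 * rng.normal(size=517)

degree = 4
Phi = hermite_features(zscore(X), degree)
coef, *_ = np.linalg.lstsq(Phi, y - y.mean(), rcond=None)

# Orthonormality makes the sum of squared coefficients per variable an
# estimate of the variance that variable contributes to the model.
scores = (coef.reshape(X.shape[1], degree) ** 2).sum(axis=1)
print("attribute ranking (most to least important):", np.argsort(scores)[::-1])
```

Because the basis functions are orthonormal with respect to the standard normal density, the squared coefficients attached to a variable estimate the variance contributed by that variable, which is what makes the resulting attribute ranking interpretable.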
Related papers
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to a logistic regression, may be learned from aggregated data only by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z) - Approximate sampling and estimation of partition functions using neural networks [0.0]
We show how variational autoencoders (VAEs) can be applied to this task.
We invert the usual logic and train the VAE to fit a simple, tractable distribution, while assuming a complex, intractable latent distribution that is specified only up to normalization.
This procedure constructs approximations without the use of training data or Markov chain Monte Carlo sampling.
arXiv Detail & Related papers (2022-09-21T15:16:45Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Missing Data Imputation and Acquisition with Deep Hierarchical Models and Hamiltonian Monte Carlo [2.666288135543677]
We present HH-VAEM, a Hierarchical VAE model for mixed-type incomplete data.
Our experiments show that HH-VAEM outperforms existing baselines in the tasks of missing data imputation, supervised learning and outlier identification.
We also present a sampling-based approach for efficiently computing the information gain when missing features are to be acquired with HH-VAEM.
arXiv Detail & Related papers (2022-02-09T17:50:52Z) - Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian Processes [8.4159776055506]
We propose a novel approach for aggregating the Gaussian experts' predictions with a Gaussian graphical model (GGM).
We first estimate the joint distribution of latent and observed variables using the Expectation-Maximization (EM) algorithm.
Our new method outperforms other state-of-the-art DGP approaches.
arXiv Detail & Related papers (2022-02-07T15:22:56Z) - Adapting deep generative approaches for getting synthetic data with realistic marginal distributions [0.0]
Deep generative models, such as variational autoencoders (VAEs), are a popular approach for creating such synthetic datasets from original data.
We propose a novel method, pre-transformation variational autoencoders (PTVAEs), to specifically address bimodal and skewed data.
The results show that the PTVAE approach can outperform others in both bimodal and skewed data generation.
arXiv Detail & Related papers (2021-05-14T15:47:20Z) - Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD in which each sample in the mini-batch is assigned an individual importance weight (see the sketch after this list).
ABSGD is flexible enough to combine with other robust losses without any additional cost.
arXiv Detail & Related papers (2020-12-13T03:41:52Z) - One-shot Distributed Algorithm for Generalized Eigenvalue Problem [23.9525986377055]
Generalized eigenvalue problem (GEP) plays a vital role in a large family of high-dimensional statistical models.
Here we propose a general distributed framework for GEP with one-shot communication.
arXiv Detail & Related papers (2020-10-22T11:43:16Z) - Variational Hyper-Encoding Networks [62.74164588885455]
We propose a framework called HyperVAE for encoding distributions of neural network parameters theta.
We predict the posterior distribution of the latent code, then use a matrix-network decoder to generate a posterior distribution q(theta).
arXiv Detail & Related papers (2020-05-18T06:46:09Z) - Unshuffling Data for Improved Generalization [65.57124325257409]
Generalization beyond the training distribution is a core challenge in machine learning.
We show that partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple training environments can guide the learning of models with better out-of-distribution generalization.
arXiv Detail & Related papers (2020-02-27T03:07:41Z)
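The ABSGD entry above describes assigning an individual importance weight to each sample in the mini-batch; the following is a generic sketch of such a per-sample weighted momentum-SGD step. The softmax-over-losses weighting rule, the linear least-squares model, and all hyperparameters are illustrative assumptions, not the weighting scheme of that paper.

```python
# Generic sketch of per-sample importance weighting inside a momentum-SGD
# step. The softmax-of-losses weighting rule and the linear model are
# assumptions for illustration only, not the ABSGD scheme itself.
import numpy as np

def weighted_momentum_sgd_step(w, v, X_batch, y_batch, lr=0.01, beta=0.9, tau=1.0):
    """One momentum-SGD step on a linear least-squares model, with the
    mini-batch gradient averaged under per-sample importance weights."""
    residuals = X_batch @ w - y_batch
    losses = 0.5 * residuals ** 2
    # Per-sample weights: softmax over the individual losses, so samples
    # with larger loss receive larger weight (numerically stabilized).
    weights = np.exp((losses - losses.max()) / tau)
    weights /= weights.sum()
    grad = X_batch.T @ (weights * residuals)  # weighted instead of uniform average
    v = beta * v + grad
    return w - lr * v, v

# Toy usage on random data.
rng = np.random.default_rng(1)
X, y = rng.normal(size=(32, 5)), rng.normal(size=32)
w, v = np.zeros(5), np.zeros(5)
for _ in range(200):
    w, v = weighted_momentum_sgd_step(w, v, X, y)
```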
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.