Visual Neural Decomposition to Explain Multivariate Data Sets
- URL: http://arxiv.org/abs/2009.05502v1
- Date: Fri, 11 Sep 2020 15:53:37 GMT
- Title: Visual Neural Decomposition to Explain Multivariate Data Sets
- Authors: Johannes Knittel, Andres Lalama, Steffen Koch, and Thomas Ertl
- Abstract summary: Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
- Score: 13.117139248511783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Investigating relationships between variables in multi-dimensional data sets
is a common task for data analysts and engineers. More specifically, it is
often valuable to understand which ranges of which input variables lead to
particular values of a given target variable. Unfortunately, with an increasing
number of independent variables, this process may become cumbersome and
time-consuming due to the many possible combinations that have to be explored.
In this paper, we propose a novel approach to visualize correlations between
input variables and a target output variable that scales to hundreds of
variables. We developed a visual model based on neural networks that can be
explored in a guided way to help analysts find and understand such
correlations. First, we train a neural network to predict the target from the
input variables. Then, we visualize the inner workings of the resulting model
to help understand relations within the data set. We further introduce a new
regularization term for the backpropagation algorithm that encourages the
neural network to learn representations that are easier to interpret visually.
We apply our method to artificial and real-world data sets to show its utility.
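As an illustration of the described pipeline, here is a minimal PyTorch sketch: a small network is trained to predict the target from the input variables, and a regularization term is added to the loss to favor representations that are easier to inspect visually. The L1 penalty on the first-layer weights is our stand-in assumption; the paper introduces its own regularization term for backpropagation.
```python
import torch
import torch.nn as nn

# Minimal sketch of the described pipeline (not the authors' code):
# 1) train a network to predict the target from the input variables,
# 2) add a regularization term that pushes the model toward
#    visually interpretable representations. Here an L1 penalty on the
#    first-layer weights stands in for the paper's regularizer.

class Predictor(nn.Module):
    def __init__(self, n_inputs, n_hidden=32):
        super().__init__()
        self.hidden = nn.Linear(n_inputs, n_hidden)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x)))

def train_step(model, opt, x, y, reg_weight=1e-3):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    # interpretability regularizer (assumed form): sparse input-to-hidden
    # weights, so each hidden node reacts to few input variables
    loss = loss + reg_weight * model.hidden.weight.abs().mean()
    loss.backward()
    opt.step()
    return loss.item()

model = Predictor(n_inputs=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(256, 100), torch.randn(256, 1)
for _ in range(10):
    train_step(model, opt, x, y)
# The learned weights in model.hidden.weight can then be visualized to
# relate ranges of input variables to the target.
```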
Related papers
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
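For reference, exact automorphisms of a small unweighted graph can be enumerated with networkx; this sketch covers only that exact baseline, while the paper targets the harder problem of approximate automorphisms of weighted graphs.
```python
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

# Exact automorphisms (self-isomorphisms) of a small graph as a baseline;
# approximate automorphisms, as studied in the paper, are not covered here.
G = nx.cycle_graph(4)
automorphisms = list(GraphMatcher(G, G).isomorphisms_iter())
print(len(automorphisms))  # 8: the dihedral symmetry group of the 4-cycle
```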
- Structured Latent Variable Models for Articulated Object Interaction [24.97457132614502]
We investigate a scenario in which a robot learns a low-dimensional representation of a door given a video of the door opening or closing.
This representation can be used to infer door-related parameters and predict the outcomes of interacting with the door.
arXiv Detail & Related papers (2023-05-26T01:22:35Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Detection of Interacting Variables for Generalized Linear Models via Neural Networks [0.0]
We present an approach to automating the process of finding interactions that should be added to generalized linear models (GLMs).
Our approach relies on neural networks and a model-specific interaction detection method, which is computationally faster than traditionally used methods such as the Friedman H-statistic or SHAP values.
In numerical studies, we provide the results of our approach on artificially generated data as well as open-source data.
arXiv Detail & Related papers (2022-09-16T16:16:45Z)
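A common neural-network-based way to score a pairwise interaction is via the mixed partial derivatives of the model output with respect to the two features. The PyTorch sketch below illustrates that general idea; it is our illustration, not the paper's specific detection method.
```python
import torch
import torch.nn as nn

# Gradient-based interaction score between features i and j:
# average |d^2 f / dx_i dx_j| over the data. A smooth activation (Tanh)
# is used so that second derivatives are non-trivial.
model = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))
x = torch.randn(128, 5, requires_grad=True)

def interaction_score(model, x, i, j):
    out = model(x).sum()
    grad_i = torch.autograd.grad(out, x, create_graph=True)[0][:, i]
    # differentiate the i-th partial derivative with respect to x_j
    grad_ij = torch.autograd.grad(grad_i.sum(), x)[0][:, j]
    return grad_ij.abs().mean().item()

print(interaction_score(model, x, 0, 1))
```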
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with classical results on causal inference provides an effective, practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Triplot: model agnostic measures and visualisations for variable importance in predictive models that take into account the hierarchical correlation structure [3.0036519884678894]
We propose new methods to support model analysis by exploiting the information about the correlation between variables.
We show how to analyze groups of variables (aspects) both when they are proposed by the user and when they should be determined automatically.
We also present a new type of model visualisation, triplot, which exploits the hierarchical structure of variable grouping to produce a high-information-density model visualisation.
arXiv Detail & Related papers (2021-04-07T21:29:03Z)
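The variable-grouping step can be sketched with scipy by hierarchically clustering variables on a correlation-based distance; the triplot visualization built on top of that hierarchy is not reproduced here.
```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Sketch of grouping variables (aspects) by the strength of their
# pairwise correlation; strongly correlated variables get a small distance.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=500)  # inject a correlated pair

corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
print(fcluster(Z, t=0.5, criterion="distance"))  # variable group labels
```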
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)
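A graph learning module of this kind can be sketched as two trainable node-embedding matrices whose antisymmetric combination yields a uni-directed adjacency matrix. This is a simplified reading of the construction, not the authors' exact module.
```python
import torch
import torch.nn as nn

class GraphLearner(nn.Module):
    """Learns a uni-directed adjacency matrix from node embeddings.
    Simplified sketch of a graph learning module, trainable end to end."""
    def __init__(self, n_nodes, dim=16):
        super().__init__()
        self.emb1 = nn.Parameter(torch.randn(n_nodes, dim))
        self.emb2 = nn.Parameter(torch.randn(n_nodes, dim))

    def forward(self):
        m1, m2 = torch.tanh(self.emb1), torch.tanh(self.emb2)
        # subtracting the transposed product makes the score matrix
        # antisymmetric, so at most one direction per pair survives ReLU
        return torch.relu(torch.tanh(m1 @ m2.T - m2 @ m1.T))

adj = GraphLearner(n_nodes=10)()
print(adj.shape)  # torch.Size([10, 10])
```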
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Learning Latent Causal Structures with a Redundant Input Neural Network [9.044150926401574]
In this setting, inputs are assumed to cause outputs, and these causal relationships are encoded by a causal network among a set of latent variables.
We develop a deep learning model, which we call a redundant input neural network (RINN), with a modified architecture and a regularized objective function.
A series of simulation experiments provide support that the RINN method can successfully recover latent causal structure between input and output variables.
arXiv Detail & Related papers (2020-03-29T20:52:35Z)
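One way to read "redundant input" is an architecture that feeds the input vector into every hidden layer alongside the previous layer's activations, trained with a sparsity-regularized objective. The sketch below is our interpretation under that assumption, not the authors' exact RINN.
```python
import torch
import torch.nn as nn

class RINN(nn.Module):
    """Sketch of a redundant input network: the input x is concatenated
    into every hidden layer; L1 regularization on the layer weights decides
    at which depth each input connects, hinting at latent structure."""
    def __init__(self, n_in, n_out, n_hidden=16, n_layers=3):
        super().__init__()
        self.inp = nn.Linear(n_in, n_hidden)
        self.layers = nn.ModuleList(
            nn.Linear(n_hidden + n_in, n_hidden) for _ in range(n_layers))
        self.out = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        h = torch.relu(self.inp(x))
        for layer in self.layers:
            h = torch.relu(layer(torch.cat([h, x], dim=1)))  # redundant input
        return self.out(h)

    def l1_penalty(self):
        return sum(layer.weight.abs().sum() for layer in self.layers)

model = RINN(n_in=8, n_out=2)
x, y = torch.randn(64, 8), torch.randn(64, 2)
loss = nn.functional.mse_loss(model(x), y) + 1e-4 * model.l1_penalty()
loss.backward()
```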
This list is automatically generated from the titles and abstracts of the papers on this site.