Visual Neural Decomposition to Explain Multivariate Data Sets
- URL: http://arxiv.org/abs/2009.05502v1
- Date: Fri, 11 Sep 2020 15:53:37 GMT
- Title: Visual Neural Decomposition to Explain Multivariate Data Sets
- Authors: Johannes Knittel, Andres Lalama, Steffen Koch, and Thomas Ertl
- Abstract summary: Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers.
We propose a novel approach to visualize correlations between input variables and a target output variable that scales to hundreds of variables.
- Score: 13.117139248511783
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Investigating relationships between variables in multi-dimensional data sets
is a common task for data analysts and engineers. More specifically, it is
often valuable to understand which ranges of which input variables lead to
particular values of a given target variable. Unfortunately, with an increasing
number of independent variables, this process may become cumbersome and
time-consuming due to the many possible combinations that have to be explored.
In this paper, we propose a novel approach to visualize correlations between
input variables and a target output variable that scales to hundreds of
variables. We developed a visual model based on neural networks that can be
explored in a guided way to help analysts find and understand such
correlations. First, we train a neural network to predict the target from the
input variables. Then, we visualize the inner workings of the resulting model
to help understand relations within the data set. We further introduce a new
regularization term for the backpropagation algorithm that encourages the
neural network to learn representations that are easier to interpret visually.
We apply our method to artificial and real-world data sets to show its utility.
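As an illustration of the described pipeline, here is a minimal PyTorch sketch: a small network is trained to predict the target from the input variables, and a regularization term is added to the loss to favor representations that are easier to inspect visually. The L1 penalty on the first-layer weights is our stand-in assumption; the paper introduces its own regularization term for backpropagation.
```python
import torch
import torch.nn as nn

# Minimal sketch of the described pipeline (not the authors' code):
# 1) train a network to predict the target from the input variables,
# 2) add a regularization term that pushes the model toward
#    visually interpretable representations. Here an L1 penalty on the
#    first-layer weights stands in for the paper's regularizer.

class Predictor(nn.Module):
    def __init__(self, n_inputs, n_hidden=32):
        super().__init__()
        self.hidden = nn.Linear(n_inputs, n_hidden)
        self.out = nn.Linear(n_hidden, 1)

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x)))

def train_step(model, opt, x, y, reg_weight=1e-3):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    # interpretability regularizer (assumed form): sparse input-to-hidden
    # weights, so each hidden node reacts to few input variables
    loss = loss + reg_weight * model.hidden.weight.abs().mean()
    loss.backward()
    opt.step()
    return loss.item()

model = Predictor(n_inputs=100)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(256, 100), torch.randn(256, 1)
for _ in range(10):
    train_step(model, opt, x, y)
# The learned weights in model.hidden.weight can then be visualized to
# relate ranges of input variables to the target.
```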
Related papers
- Unsupervised Learning of Invariance Transformations [105.54048699217668]
We develop an algorithmic framework for finding approximate graph automorphisms.
We discuss how this framework can be used to find approximate automorphisms in weighted graphs in general.
arXiv Detail & Related papers (2023-07-24T17:03:28Z)
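For reference, exact automorphisms of a small unweighted graph can be enumerated with networkx; this sketch covers only that exact baseline, while the paper targets the harder problem of approximate automorphisms of weighted graphs.
```python
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

# Exact automorphisms (self-isomorphisms) of a small graph as a baseline;
# approximate automorphisms, as studied in the paper, are not covered here.
G = nx.cycle_graph(4)
automorphisms = list(GraphMatcher(G, G).isomorphisms_iter())
print(len(automorphisms))  # 8: the dihedral symmetry group of the 4-cycle
```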
- Structured Latent Variable Models for Articulated Object Interaction [24.97457132614502]
We investigate a scenario in which a robot learns a low-dimensional representation of a door given a video of the door opening or closing.
This representation can be used to infer door-related parameters and predict the outcomes of interacting with the door.
arXiv Detail & Related papers (2023-05-26T01:22:35Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Detection of Interacting Variables for Generalized Linear Models via Neural Networks [0.0]
We present an approach to automating the process of finding interactions that should be added to generalized linear models (GLMs).
Our approach relies on neural networks and a model-specific interaction detection method, which is computationally faster than traditionally used methods such as the Friedman H-statistic or SHAP values.
In numerical studies, we provide the results of our approach on artificially generated data as well as open-source data.
arXiv Detail & Related papers (2022-09-16T16:16:45Z)
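A common neural-network-based way to score a pairwise interaction is via the mixed partial derivatives of the model output with respect to the two features. The PyTorch sketch below illustrates that general idea; it is our illustration, not the paper's specific detection method.
```python
import torch
import torch.nn as nn

# Gradient-based interaction score between features i and j:
# average |d^2 f / dx_i dx_j| over the data. A smooth activation (Tanh)
# is used so that second derivatives are non-trivial.
model = nn.Sequential(nn.Linear(5, 16), nn.Tanh(), nn.Linear(16, 1))
x = torch.randn(128, 5, requires_grad=True)

def interaction_score(model, x, i, j):
    out = model(x).sum()
    grad_i = torch.autograd.grad(out, x, create_graph=True)[0][:, i]
    # differentiate the i-th partial derivative with respect to x_j
    grad_ij = torch.autograd.grad(grad_i.sum(), x)[0][:, j]
    return grad_ij.abs().mean().item()

print(interaction_score(model, x, 0, 1))
```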
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how combining recent results on equivariant representation learning over structured spaces with classical results on causal inference provides an effective, practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Triplot: model agnostic measures and visualisations for variable importance in predictive models that take into account the hierarchical correlation structure [3.0036519884678894]
We propose new methods to support model analysis by exploiting the information about the correlation between variables.
We show how to analyze groups of variables (aspects) both when they are proposed by the user and when they should be determined automatically.
We also present a new type of model visualisation, triplot, which exploits the hierarchical structure of variable grouping to produce a high-information-density model visualisation.
arXiv Detail & Related papers (2021-04-07T21:29:03Z)
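The variable-grouping step can be sketched with scipy by hierarchically clustering variables on a correlation-based distance; the triplot visualization built on top of that hierarchy is not reproduced here.
```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Sketch of grouping variables (aspects) by the strength of their
# pairwise correlation; strongly correlated variables get a small distance.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=500)  # inject a correlated pair

corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
print(fcluster(Z, t=0.5, criterion="distance"))  # variable group labels
```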
- Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way that the transformation outcome is predictable by an auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z)
- Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View [0.0]
We study how homeomorphism affects the learned representation of a malware traffic dataset.
Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same.
arXiv Detail & Related papers (2020-09-16T15:37:44Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
arXiv Detail & Related papers (2020-05-24T04:02:18Z)
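A graph learning module of this kind can be sketched as two trainable node-embedding matrices whose antisymmetric combination yields a uni-directed adjacency matrix. This is a simplified reading of the construction, not the authors' exact module.
```python
import torch
import torch.nn as nn

class GraphLearner(nn.Module):
    """Learns a uni-directed adjacency matrix from node embeddings.
    Simplified sketch of a graph learning module, trainable end to end."""
    def __init__(self, n_nodes, dim=16):
        super().__init__()
        self.emb1 = nn.Parameter(torch.randn(n_nodes, dim))
        self.emb2 = nn.Parameter(torch.randn(n_nodes, dim))

    def forward(self):
        m1, m2 = torch.tanh(self.emb1), torch.tanh(self.emb2)
        # subtracting the transposed product makes the score matrix
        # antisymmetric, so at most one direction per pair survives ReLU
        return torch.relu(torch.tanh(m1 @ m2.T - m2 @ m1.T))

adj = GraphLearner(n_nodes=10)()
print(adj.shape)  # torch.Size([10, 10])
```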
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Learning Latent Causal Structures with a Redundant Input Neural Network [9.044150926401574]
In this setting, inputs are assumed to cause outputs, and these causal relationships are encoded by a causal network among a set of latent variables.
We develop a deep learning model, which we call a redundant input neural network (RINN), with a modified architecture and a regularized objective function.
A series of simulation experiments provide support that the RINN method can successfully recover latent causal structure between input and output variables.
arXiv Detail & Related papers (2020-03-29T20:52:35Z)
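One way to read "redundant input" is an architecture that feeds the input vector into every hidden layer alongside the previous layer's activations, trained with a sparsity-regularized objective. The sketch below is our interpretation under that assumption, not the authors' exact RINN.
```python
import torch
import torch.nn as nn

class RINN(nn.Module):
    """Sketch of a redundant input network: the input x is concatenated
    into every hidden layer; L1 regularization on the layer weights decides
    at which depth each input connects, hinting at latent structure."""
    def __init__(self, n_in, n_out, n_hidden=16, n_layers=3):
        super().__init__()
        self.inp = nn.Linear(n_in, n_hidden)
        self.layers = nn.ModuleList(
            nn.Linear(n_hidden + n_in, n_hidden) for _ in range(n_layers))
        self.out = nn.Linear(n_hidden, n_out)

    def forward(self, x):
        h = torch.relu(self.inp(x))
        for layer in self.layers:
            h = torch.relu(layer(torch.cat([h, x], dim=1)))  # redundant input
        return self.out(h)

    def l1_penalty(self):
        return sum(layer.weight.abs().sum() for layer in self.layers)

model = RINN(n_in=8, n_out=2)
x, y = torch.randn(64, 8), torch.randn(64, 2)
loss = nn.functional.mse_loss(model(x), y) + 1e-4 * model.l1_penalty()
loss.backward()
```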
This list is automatically generated from the titles and abstracts of the papers on this site.