Attributions Beyond Neural Networks: The Linear Program Case
- URL: http://arxiv.org/abs/2206.07203v1
- Date: Tue, 14 Jun 2022 23:08:43 GMT
- Title: Attributions Beyond Neural Networks: The Linear Program Case
- Authors: Florian Peter Busch and Matej Zečević and Kristian Kersting and
Devendra Singh Dhami
- Abstract summary: Linear programs (LPs) have been one of the building blocks in machine learning and have championed recent strides in differentiable optimizers for learning systems.
We introduce an approach where we consider neural encodings for LPs that justify the application of attribution methods from explainable artificial intelligence (XAI) designed for neural learning systems.
- Score: 17.103787431518683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Linear Programs (LPs) have been one of the building blocks in machine
learning and have championed recent strides in differentiable optimizers for
learning systems. While there exist solvers for even high-dimensional LPs,
understanding said high-dimensional solutions poses an orthogonal and
unresolved problem. We introduce an approach where we consider neural encodings
for LPs that justify the application of attribution methods from explainable
artificial intelligence (XAI) designed for neural learning systems. The several
encoding functions we propose take into account aspects such as feasibility of
the decision space, the cost attached to each input, or the distance to special
points of interest. We investigate the mathematical consequences of several XAI
methods on said neural LP encodings. We empirically show that the attribution
methods Saliency and LIME reveal indistinguishable results up to perturbation
levels, and we propose the property of Directedness as the main discriminative
criterion between Saliency and LIME on one hand, and a perturbation-based
Feature Permutation approach on the other hand. Directedness indicates whether
an attribution method gives feature attributions with respect to an increase of
that feature. We further notice the baseline selection problem beyond the
classical computer vision setting for Integrated Gradients.
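The contrast the abstract draws between gradient-based and perturbation-based attributions can be sketched on a toy cost encoding of an LP. This is an illustrative assumption, not the paper's actual encoding functions: `cost_encoding` scores a cost vector against one fixed feasible point, Saliency differentiates that score, and Feature Permutation measures the output change when an input is replaced by a baseline. Directedness shows up in the signs: the gradient attributes with respect to an *increase* of each feature, while the permutation score need not agree in sign.

```python
# Hypothetical sketch: attribution on a neural-style encoding of an LP.
# All names (cost_encoding, saliency, feature_permutation) and the fixed
# solution x_opt are illustrative, not the paper's actual API.

def cost_encoding(c, x):
    """Encode an LP by the cost of a fixed feasible point x under costs c."""
    return sum(ci * xi for ci, xi in zip(c, x))

def saliency(f, c, eps=1e-6):
    """Gradient-based attribution: df/dc_i via central differences."""
    grads = []
    for i in range(len(c)):
        up = list(c); up[i] += eps
        dn = list(c); dn[i] -= eps
        grads.append((f(up) - f(dn)) / (2 * eps))
    return grads

def feature_permutation(f, c, baseline=0.0):
    """Perturbation-based attribution: output change when one input is replaced."""
    ref = f(c)
    out = []
    for i in range(len(c)):
        perturbed = list(c); perturbed[i] = baseline
        out.append(ref - f(perturbed))
    return out

x_opt = [2.0, 0.0, 1.0]            # fixed solution of some LP (assumed)
f = lambda c: cost_encoding(c, x_opt)
costs = [3.0, 5.0, -1.0]
print(saliency(f, costs))           # ~ [2.0, 0.0, 1.0]: signed, "directed"
print(feature_permutation(f, costs))  # [6.0, 0.0, -1.0]: sign can disagree
```

Note how the third feature gets a positive saliency (raising its cost raises the encoding) but a negative permutation score, which is exactly the kind of disagreement a Directedness criterion can separate.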
Related papers
- Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution [38.53065398127086]
This study investigates the potential of feature attribution methods to filter out uninformative features in input data for regression problems.
We introduce a feature selection pipeline that combines Integrated Gradients with k-means clustering to select an optimal set of variables from the initial data space.
To validate the effectiveness of this approach, we apply it to a real-world industrial problem - blade vibration analysis in the development process of turbo machinery.
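A minimal sketch of the pipeline idea above: score each feature with Integrated Gradients, then split the scores into an "informative" and an "uninformative" group with a two-cluster 1-D k-means. The regression model, the baseline, and the tiny hand-rolled k-means are all illustrative assumptions, not the study's actual setup.

```python
# Toy sketch: Integrated Gradients feature scores + 1-D k-means selection.
# The model f, baseline, and helper names are made up for illustration.

def f(x):
    # a regression model where only the first two inputs matter
    return 3.0 * x[0] - 2.0 * x[1] + 0.0 * x[2] + 0.0 * x[3]

def integrated_gradients(f, x, baseline, steps=50, eps=1e-6):
    """IG_i = (x_i - b_i) * average of df/dx_i along the path b -> x."""
    attrs = [0.0] * len(x)
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(len(x)):
            up = list(point); up[i] += eps
            dn = list(point); dn[i] -= eps
            attrs[i] += (f(up) - f(dn)) / (2 * eps)
    return [(xi - b) * a / steps for xi, b, a in zip(x, baseline, attrs)]

def kmeans_1d(values, iters=20):
    """Two-cluster 1-D k-means; returns indices of the high-score cluster."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        groups = [[], []]
        for i, v in enumerate(values):
            groups[abs(v - hi) < abs(v - lo)].append(i)
        lo = sum(values[i] for i in groups[0]) / max(len(groups[0]), 1)
        hi = sum(values[i] for i in groups[1]) / max(len(groups[1]), 1)
    return groups[1]

x = [1.0, 1.0, 1.0, 1.0]
scores = [abs(a) for a in integrated_gradients(f, x, [0.0] * 4)]
print(sorted(kmeans_1d(scores)))  # -> [0, 1], the informative features
```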
arXiv Detail & Related papers (2024-09-25T09:50:51Z) - ODE Discovery for Longitudinal Heterogeneous Treatment Effects Inference [69.24516189971929]
In this paper, we introduce a new type of solution in the longitudinal setting: a closed-form ordinary differential equation (ODE).
While we still rely on continuous optimization to learn an ODE, the resulting inference machine is no longer a neural network.
arXiv Detail & Related papers (2024-03-16T02:07:45Z) - Optimizing Solution-Samplers for Combinatorial Problems: The Landscape
of Policy-Gradient Methods [52.0617030129699]
We introduce a novel theoretical framework for analyzing the effectiveness of DeepMatching Networks and Reinforcement Learning methods.
Our main contribution holds for a broad class of problems including Max- and Min-Cut, Max-$k$-Bipartite-Bi, Maximum-Weight-Bipartite-Bi, and the Traveling Salesman Problem.
As a byproduct of our analysis we introduce a novel regularization process over vanilla descent and provide theoretical and experimental evidence that it helps address vanishing-gradient issues and escape bad stationary points.
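The summary above only says that a regularization process is added on top of vanilla descent. Purely as an illustration of that general idea, and *not* the paper's construction, the sketch below runs gradient ascent on the expected reward of a softmax solution-sampler with an entropy bonus, which keeps the sampler from collapsing onto an initially favored but suboptimal option:

```python
# Illustrative only: entropy-regularized ascent on a softmax sampler over
# two candidate solutions; this stands in for "a regularization process over
# vanilla descent" and is NOT the paper's actual regularizer.
import math

def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    z = sum(e)
    return [x / z for x in e]

def ascent_step(theta, rewards, lam, lr=0.1):
    """One gradient-ascent step on E_p[r] + lam * H(p), p = softmax(theta)."""
    p = softmax(theta)
    exp_r = sum(pi * ri for pi, ri in zip(p, rewards))
    ent = [-math.log(pi) - 1.0 for pi in p]          # d(entropy)/dp_i
    exp_ent = sum(pi * ti for pi, ti in zip(p, ent))
    grad = [pi * ((ri - exp_r) + lam * (ti - exp_ent))
            for pi, ri, ti in zip(p, rewards, ent)]
    return [t + lr * g for t, g in zip(theta, grad)]

theta = [5.0, 0.0]        # sampler starts nearly deterministic on option 0
rewards = [1.0, 2.0]      # but option 1 has the higher reward
for _ in range(2000):
    theta = ascent_step(theta, rewards, lam=0.5)
p = softmax(theta)
print(p[1] > p[0])        # entropy keeps enough mass on option 1 to find it
```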
arXiv Detail & Related papers (2023-10-08T23:39:38Z) - Unsupervised Legendre-Galerkin Neural Network for Stiff Partial
Differential Equations [9.659504024299896]
We propose an unsupervised machine learning algorithm based on the Legendre-Galerkin neural network to find an accurate approximation to the solution of different types of PDEs.
The proposed neural network is applied to general 1D and 2D PDEs as well as singularly perturbed PDEs that exhibit boundary-layer behavior.
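The unsupervised idea can be sketched in miniature: expand the solution in a polynomial (Legendre-type) basis that satisfies the boundary conditions by construction, and fit the coefficients by minimizing the PDE residual alone, with no labeled solution data. The problem, basis, and collocation points below are made-up assumptions, far simpler than the paper's networks:

```python
# Toy sketch: solve u'' = -2 on [-1, 1] with u(-1) = u(1) = 0
# (exact solution u(x) = 1 - x^2) by residual-only training of basis
# coefficients. Basis and problem are illustrative assumptions.

def u(c, x):
    # both basis functions vanish at x = +-1, so the BCs hold by construction
    return c[0] * (1 - x * x) + c[1] * x * (1 - x * x)

def u_xx(c, x):
    return c[0] * (-2.0) + c[1] * (-6.0 * x)

xs = [-0.8, -0.4, 0.0, 0.4, 0.8]   # collocation points
c = [0.0, 0.0]
lr = 0.01
for _ in range(500):
    # residual loss sum_x (u_xx(x) - (-2))^2, minimized by gradient descent
    g0 = g1 = 0.0
    for x in xs:
        r = u_xx(c, x) + 2.0
        g0 += 2 * r * (-2.0)
        g1 += 2 * r * (-6.0 * x)
    c = [c[0] - lr * g0, c[1] - lr * g1]
print([round(ci, 3) for ci in c])  # -> [1.0, 0.0], i.e. u(x) = 1 - x^2
print(round(u(c, 0.0), 3))         # -> 1.0, the exact value at x = 0
```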
arXiv Detail & Related papers (2022-07-21T00:47:47Z) - Stabilizing Q-learning with Linear Architectures for Provably Efficient
Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation.
We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
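For context, the linear-architecture setting above means Q(s, a) is a dot product of a weight vector with state features. The sketch below shows only that basic TD update on a made-up two-state MDP (the paper's exploration variant and error analysis are not reproduced here):

```python
# Minimal sketch of Q-learning with linear function approximation.
# The MDP, features, and schedule are illustrative assumptions; the paper's
# exploration scheme is omitted.
import random

def phi(s):
    """One-hot state features; linear Q then reduces to tabular Q."""
    return [1.0 if i == s else 0.0 for i in range(2)]

def q_value(w, s, a):
    return sum(wi * fi for wi, fi in zip(w[a], phi(s)))

def td_update(w, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """w_a <- w_a + alpha * (r + gamma * max_b Q(s',b) - Q(s,a)) * phi(s)."""
    target = r + gamma * max(q_value(w, s_next, b) for b in range(2))
    delta = target - q_value(w, s, a)
    w[a] = [wi + alpha * delta * fi for wi, fi in zip(w[a], phi(s))]

# Two states, two actions: action 1 in state 0 moves to state 1 with reward 1;
# everything else self-loops with reward 0.
random.seed(0)
w = [[0.0, 0.0], [0.0, 0.0]]   # one weight vector per action
s = 0
for _ in range(500):
    a = random.randrange(2)
    s_next, r = (1, 1.0) if (s == 0 and a == 1) else (s, 0.0)
    td_update(w, s, a, r, s_next)
    s = s_next if random.random() < 0.5 else 0   # occasional reset to s = 0
print(round(q_value(w, 0, 1), 2))  # -> 1.0, the optimal value of (s=0, a=1)
```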
arXiv Detail & Related papers (2022-06-01T23:26:51Z) - Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing all heuristically designed components in the graph with backprop-optimized neural function approximators.
We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes.
We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
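The claim that message passing solvers representationally contain finite differences can be made concrete on a toy case: one explicit Euler step of the 1-D heat equation u_t = u_xx, written once as a message-passing update on a chain graph and once as the classical stencil. The grid values are illustrative; both formulations produce the same node values.

```python
# Sketch: a finite-difference step expressed as message passing on a chain.

def message_passing_heat_step(u, dt, dx):
    """Each interior node aggregates messages (u_j - u_i) from its two
    neighbors, reproducing the second-order finite-difference stencil."""
    new_u = list(u)
    for i in range(1, len(u) - 1):
        msgs = (u[i - 1] - u[i]) + (u[i + 1] - u[i])   # neighbor messages
        new_u[i] = u[i] + dt * msgs / dx ** 2           # node update
    return new_u

def finite_difference_heat_step(u, dt, dx):
    """Classical stencil: u_i + dt * (u_{i-1} - 2 u_i + u_{i+1}) / dx^2."""
    return [u[i] if i in (0, len(u) - 1)
            else u[i] + dt * (u[i - 1] - 2 * u[i] + u[i + 1]) / dx ** 2
            for i in range(len(u))]

u0 = [0.0, 1.0, 4.0, 1.0, 0.0]
a = message_passing_heat_step(u0, dt=0.1, dx=1.0)
b = finite_difference_heat_step(u0, dt=0.1, dx=1.0)
print(a)  # the two formulations agree node by node
```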
arXiv Detail & Related papers (2022-02-07T17:47:46Z) - Towards the One Learning Algorithm Hypothesis: A System-theoretic
Approach [0.0]
The existence of a universal learning architecture in human cognition is a widespread conjecture supported by experimental findings from neuroscience.
We develop a closed-loop system with three main components: (i) a multi-resolution analysis pre-processor, (ii) a group-invariant feature extractor, and (iii) a progressive knowledge-based learning module.
We introduce a novel learning algorithm that constructs progressively growing knowledge representations in multiple resolutions.
arXiv Detail & Related papers (2021-12-04T05:54:33Z) - Stochastic Mutual Information Gradient Estimation for Dimensionality
Reduction Networks [11.634729459989996]
We introduce emerging information-theoretic feature transformation protocols as an end-to-end neural network training approach.
We present a dimensionality reduction network (MMINet) training procedure based on the estimate of the mutual information gradient.
We experimentally evaluate our method with applications to high-dimensional biological data sets, and relate it to conventional feature selection algorithms.
arXiv Detail & Related papers (2021-05-01T08:20:04Z) - Hierarchical Deep CNN Feature Set-Based Representation Learning for
Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we desire to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z) - Physical System for Non Time Sequence Data [0.0]
We propose a novel approach that connects machine learning to causal structure learning via the Jacobian matrix of a neural network with respect to its input variables.
arXiv Detail & Related papers (2020-10-07T06:27:15Z) - Towards Efficient Processing and Learning with Spikes: New Approaches
for Multi-Spike Learning [59.249322621035056]
We propose two new multi-spike learning rules which demonstrate better performance over other baselines on various tasks.
In the feature detection task, we re-examine the ability of unsupervised STDP and present its limitations.
Our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied.
arXiv Detail & Related papers (2020-05-02T06:41:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.