Related papers: On Modifying a Neural Network's Perception

On Modifying a Neural Network's Perception

URL: http://arxiv.org/abs/2303.02655v1
Date: Sun, 5 Mar 2023 12:09:37 GMT
Title: On Modifying a Neural Network's Perception
Authors: Manuel de Sousa Ribeiro and Jo\~ao Leite
Abstract summary: We propose a method which allows one to modify what an artificial neural network is perceiving regarding specific human-defined concepts. We test the proposed method on different models, assessing whether the performed manipulations are well interpreted by the models, and analyzing how they react to them.
Score: 3.42658286826597
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Artificial neural networks have proven to be extremely useful models that have allowed for multiple recent breakthroughs in the field of Artificial Intelligence and many others. However, they are typically regarded as black boxes, given how difficult it is for humans to interpret how these models reach their results. In this work, we propose a method which allows one to modify what an artificial neural network is perceiving regarding specific human-defined concepts, enabling the generation of hypothetical scenarios that could help understand and even debug the neural network model. Through empirical evaluation, in a synthetic dataset and in the ImageNet dataset, we test the proposed method on different models, assessing whether the performed manipulations are well interpreted by the models, and analyzing how they react to them.

Related papers

Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
From Neurons to Neutrons: A Case Study in Interpretability [5.242869847419834]
We argue that high-dimensional neural networks can learn low-dimensional representations of their training data that are useful beyond simply making good predictions. This indicates that such approaches to interpretability can be useful for deriving a new understanding of a problem from models trained to solve it.
arXiv Detail & Related papers (2024-05-27T17:59:35Z)
Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters. Our approach enables a single model to encode neural computational graphs with diverse architectures. We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process. We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
A Spectral Theory of Neural Prediction and Alignment [8.65717258105897]
We use a recent theoretical framework that relates the generalization error from regression to the spectral properties of the model and the target. We test a large number of deep neural networks that predict visual cortical activity and show that there are multiple types of geometries that result in low neural prediction error as measured via regression.
arXiv Detail & Related papers (2023-09-22T12:24:06Z)
EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce a new class of physics-informed neural networks-EINN-crafted for epidemic forecasting. We investigate how to leverage both the theoretical flexibility provided by mechanistic models as well as the data-driven expressability afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
Creating Powerful and Interpretable Models withRegression Networks [2.2049183478692584]
We propose a novel architecture, Regression Networks, which combines the power of neural networks with the understandability of regression analysis. We demonstrate that the models exceed the state-of-the-art performance of interpretable models on several benchmark datasets.
arXiv Detail & Related papers (2021-07-30T03:37:00Z)
Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples. We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
Human-Understandable Decision Making for Visual Recognition [30.30163407674527]
We propose a new framework to train a deep neural network by incorporating the prior of human perception into the model learning process. The effectiveness of our proposed model is evaluated on two classical visual recognition tasks.
arXiv Detail & Related papers (2021-03-05T02:07:33Z)
The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain. In a similar way, artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions matched reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
On the Transferability of Adversarial Attacksagainst Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models. We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models. We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.