A Survey on Knowledge Editing of Neural Networks
- URL: http://arxiv.org/abs/2310.19704v2
- Date: Thu, 14 Dec 2023 09:16:36 GMT
- Title: A Survey on Knowledge Editing of Neural Networks
- Authors: Vittorio Mazzia, Alessandro Pedrani, Andrea Caciolai, Kay Rottmann,
Davide Bernardi
- Abstract summary: Even the largest artificial neural networks make mistakes, and once-correct predictions can become invalid as the world progresses in time.
Knowledge editing is emerging as a novel area of research that aims to enable reliable, data-efficient, and fast changes to a pre-trained target model.
We first introduce the problem of editing neural networks, formalize it in a common framework, and differentiate it from better-known branches of research such as continual learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are becoming increasingly pervasive in academia and
industry, matching and surpassing human performance in a wide variety of fields
and related tasks. However, just like humans, even the largest artificial neural
networks make mistakes, and once-correct predictions can become invalid as the
world changes over time. Augmenting datasets with samples that account for
mistakes or up-to-date information has become a common workaround in practical
applications. However, the well-known phenomenon of catastrophic forgetting
poses a challenge in achieving precise changes in the implicitly memorized
knowledge of neural network parameters, often requiring a full model
re-training to achieve desired behaviors. That is expensive, unreliable, and
incompatible with the current trend of large self-supervised pre-training,
making it necessary to find more efficient and effective methods for adapting
neural network models to changing data. To address this need, knowledge editing
is emerging as a novel area of research that aims to enable reliable,
data-efficient, and fast changes to a pre-trained target model, without
affecting model behaviors on previously learned tasks. In this survey, we
provide a brief review of this emerging field of artificial intelligence
research. We first introduce the problem of editing neural networks, formalize
it in a common framework, and differentiate it from better-known branches of
research such as continual learning. Next, we provide a review of the most
relevant knowledge editing approaches and datasets proposed so far, grouping
works under four different families: regularization techniques, meta-learning,
direct model editing, and architectural strategies. Finally, we outline some
intersections with other fields of research and potential directions for future
works.
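To make the regularization family above concrete, the sketch below edits a toy one-parameter linear model: gradient descent pulls the model toward a new target for a single input, while an L2 penalty toward the original weight discourages drift on previously learned behavior. The model, function name, and parameter values are all hypothetical illustrations, not the survey's own formulation.

```python
# Toy sketch of regularization-based knowledge editing on a 1-parameter
# linear model y = w * x. We minimize:
#     (w * x_edit - y_new)^2  +  lam * (w - w_old)^2
# i.e., the edit loss plus an L2 penalty anchoring w to its pre-edit value.

def edit_knowledge(w_old, x_edit, y_new, lam=1.0, lr=0.01, steps=500):
    """Gradient descent on the regularized edit objective above."""
    w = w_old
    for _ in range(steps):
        # d/dw of (w*x - y)^2 is 2*(w*x - y)*x; d/dw of lam*(w - w0)^2 is 2*lam*(w - w0)
        grad = 2.0 * (w * x_edit - y_new) * x_edit + 2.0 * lam * (w - w_old)
        w -= lr * grad
    return w

# Original model maps 2 -> 4 (w = 2); we want it to map 2 -> 6 instead.
# With lam = 0.5 the edited weight lands between the old value (2) and the
# unregularized solution (3), trading edit accuracy against parameter drift.
w_edited = edit_knowledge(w_old=2.0, x_edit=2.0, y_new=6.0, lam=0.5)
```

Shrinking `lam` toward zero recovers the exact edit (here `w = 3`), while a large `lam` keeps the parameters close to the original model; real knowledge-editing methods in this family apply the same trade-off over millions of parameters and a set of retained examples.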
Related papers
- State-Space Modeling in Long Sequence Processing: A Survey on Recurrence in the Transformer Era [59.279784235147254]
This survey provides an in-depth summary of the latest approaches that are based on recurrent models for sequential data processing.
The emerging picture suggests there is room for novel routes, built on learning algorithms that depart from standard Backpropagation Through Time.
arXiv Detail & Related papers (2024-06-13T12:51:22Z) - Meta-Learning in Spiking Neural Networks with Reward-Modulated STDP [2.179313476241343]
We propose a bio-plausible meta-learning model inspired by the hippocampus and the prefrontal cortex.
Our new model can easily be applied to spike-based neuromorphic devices and enables fast learning in neuromorphic hardware.
arXiv Detail & Related papers (2023-06-07T13:08:46Z) - Cooperative data-driven modeling [44.99833362998488]
Data-driven modeling in mechanics is evolving rapidly based on recent machine learning advances.
New data and models created by different groups become available, opening possibilities for cooperative modeling.
Artificial neural networks suffer from catastrophic forgetting, i.e., they forget how to perform an old task when trained on a new one.
This hinders cooperation because adapting an existing model for a new task affects the performance on a previous task trained by someone else.
arXiv Detail & Related papers (2022-11-23T14:27:25Z) - Augmented Bilinear Network for Incremental Multi-Stock Time-Series
Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
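The frozen-plus-augmented scheme described in the summary above can be sketched on a toy one-parameter model (a hypothetical stand-in for the paper's bilinear network): the pre-trained weight is kept fixed, and only a newly added augmented weight is fitted to the new data.

```python
# Toy sketch of incremental adaptation via augmented connections.
# The model's prediction for input x is (w_fixed + w_aug) * x, where
# w_fixed encodes prior knowledge and stays frozen; only w_aug is trained.

def train_augmented(w_fixed, xs, ys, lr=0.05, steps=200):
    """Fit only the augmented weight via per-sample gradient steps on squared error."""
    w_aug = 0.0  # augmented connection starts out contributing nothing
    for _ in range(steps):
        for x, y in zip(xs, ys):
            pred = (w_fixed + w_aug) * x
            w_aug -= lr * 2.0 * (pred - y) * x  # update w_aug only; w_fixed untouched
    return w_aug

# New data behaves like y = 3x, while the frozen model encodes y = 1x,
# so the augmented weight should converge to roughly 2.
w_aug = train_augmented(w_fixed=1.0, xs=[1.0, 2.0], ys=[3.0, 6.0])
```

Because the original parameters are never updated, behavior on the old data is preserved by construction; the cost is extra capacity for the augmented connections, which is the design choice the paper's approach makes explicit.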
arXiv Detail & Related papers (2022-07-23T18:54:10Z) - Causal Discovery and Knowledge Injection for Contestable Neural Networks
(with Appendices) [10.616061367794385]
We propose a two-way interaction whereby neural-network-empowered machines can expose the underpinning learnt causal graphs.
We show that our method improves predictive performance by up to 2.4x while producing parsimonious networks, with input layers up to 7x smaller.
arXiv Detail & Related papers (2022-05-19T18:21:12Z) - Neural Architecture Search for Dense Prediction Tasks in Computer Vision [74.9839082859151]
Deep learning has led to a rising demand for neural network architecture engineering.
Neural architecture search (NAS) aims to design neural network architectures automatically, in a data-driven manner, rather than manually.
NAS has become applicable to a much wider range of problems in computer vision.
arXiv Detail & Related papers (2022-02-15T08:06:50Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Being Friends Instead of Adversaries: Deep Networks Learn from Data
Simplified by Other Networks [23.886422706697882]
A different idea, named Friendly Training, has recently been proposed; it consists of altering the input data by adding an automatically estimated perturbation.
We revisit and extend this idea, inspired by the effectiveness of neural generators in the context of Adversarial Machine Learning.
We propose an auxiliary multi-layer network responsible for altering the input data so that the classifier can handle it more easily.
arXiv Detail & Related papers (2021-12-18T16:59:35Z) - Explainable Adversarial Attacks in Deep Neural Networks Using Activation
Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z) - A Survey on Self-supervised Pre-training for Sequential Transfer
Learning in Neural Networks [1.1802674324027231]
Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data.
We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains.
arXiv Detail & Related papers (2020-07-01T22:55:48Z)