Explaining Deep Models through Forgettable Learning Dynamics
- URL: http://arxiv.org/abs/2301.04221v1
- Date: Tue, 10 Jan 2023 21:59:20 GMT
- Title: Explaining Deep Models through Forgettable Learning Dynamics
- Authors: Ryan Benkert, Oluwaseun Joseph Aribido, and Ghassan AlRegib
- Abstract summary: We visualize the learning behaviour during training by tracking how often samples are learned and forgotten in subsequent training epochs.
Inspired by this phenomenon, we present a novel segmentation method that actively uses this information to alter the data representation within the model.
- Score: 12.653673008542155
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Even though deep neural networks have shown tremendous success in countless
applications, explaining model behaviour or predictions is an open research
problem. In this paper, we address this issue by employing a simple yet
effective method by analysing the learning dynamics of deep neural networks in
semantic segmentation tasks. Specifically, we visualize the learning behaviour
during training by tracking how often samples are learned and forgotten in
subsequent training epochs. This further allows us to derive important
information about the proximity to the class decision boundary and identify
regions that pose a particular challenge to the model. Inspired by this
phenomenon, we present a novel segmentation method that actively uses this
information to alter the data representation within the model by increasing the
variety of difficult regions. Finally, we show that our method consistently
reduces the number of regions that are forgotten frequently. We further
evaluate our method in terms of segmentation performance.
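The per-epoch bookkeeping the abstract describes (tracking how often each sample is learned and then forgotten in subsequent epochs) can be sketched as follows. This is a minimal classification-level sketch in the spirit of example forgetting; the paper applies the idea per region in segmentation, and the function name and toy history below are illustrative assumptions, not from the paper.

```python
import numpy as np

def count_forgetting_events(correct_per_epoch):
    """Given a (num_epochs, num_samples) boolean array where entry [t, i]
    is True if sample i was predicted correctly at epoch t, count how many
    times each sample transitions from learned (True) to forgotten (False)."""
    correct = np.asarray(correct_per_epoch, dtype=bool)
    # A forgetting event: correct at epoch t-1, incorrect at epoch t.
    forgotten = correct[:-1] & ~correct[1:]
    return forgotten.sum(axis=0)

# Toy prediction history for 3 samples over 5 epochs.
history = [
    [True,  False, True],   # epoch 0
    [True,  True,  False],  # epoch 1: sample 2 forgotten
    [False, True,  True],   # epoch 2: sample 0 forgotten
    [True,  True,  False],  # epoch 3: sample 2 forgotten again
    [True,  True,  True],   # epoch 4
]
print(count_forgetting_events(history))  # -> [1 0 2]
```

Samples with high forgetting counts sit near the class decision boundary, which is the signal the segmentation method then exploits.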
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Masked Modeling for Self-supervised Representation Learning on Vision and Beyond [69.64364187449773]
Masked modeling has emerged as a distinctive approach that involves predicting parts of the original data that are proportionally masked during training.
We elaborate on the details of techniques within masked modeling, including diverse masking strategies, recovering targets, network architectures, and more.
We conclude by discussing the limitations of current techniques and point out several potential avenues for advancing masked modeling research.
arXiv Detail & Related papers (2023-12-31T12:03:21Z)
- Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation [3.2340528215722553]
A systematic task formulation of continual neural information retrieval is presented.
A comprehensive continual neural information retrieval framework is proposed.
Empirical evaluations illustrate that the proposed framework can successfully prevent catastrophic forgetting in neural information retrieval.
arXiv Detail & Related papers (2023-08-16T14:01:25Z)
- Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains.
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
- Example Forgetting: A Novel Approach to Explain and Interpret Deep Neural Networks in Seismic Interpretation [12.653673008542155]
Deep neural networks are an attractive component for the common interpretation pipeline.
However, deep neural networks are frequently met with distrust due to their property of producing semantically incorrect outputs when exposed to sections the model was not trained on.
We introduce a method that effectively relates semantically malfunctioning predictions to their respective positions within the neural network representation manifold.
arXiv Detail & Related papers (2023-02-24T19:19:22Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Extracting Global Dynamics of Loss Landscape in Deep Learning Models [0.0]
We present a toolkit for the Dynamical Organization Of Deep Learning Loss Landscapes, or DOODL3.
DOODL3 formulates the training of neural networks as a dynamical system, analyzes the learning process, and presents an interpretable global view of trajectories in the loss landscape.
arXiv Detail & Related papers (2021-06-14T18:07:05Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these elements can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- A Visual Analytics Framework for Explaining and Diagnosing Transfer Learning Processes [42.57604833160855]
We present a visual analytics framework for the multi-level exploration of the transfer learning processes when training deep neural networks.
Our framework establishes a multi-aspect design to explain how the learned knowledge from the existing model is transferred into the new learning task when training deep neural networks.
arXiv Detail & Related papers (2020-09-15T05:59:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.