CHALLENGER: Training with Attribution Maps
- URL: http://arxiv.org/abs/2205.15094v1
- Date: Mon, 30 May 2022 13:34:46 GMT
- Title: CHALLENGER: Training with Attribution Maps
- Authors: Christian Tomani and Daniel Cremers
- Abstract summary: We show that utilizing attribution maps for training neural networks can improve regularization of models and thus increase performance.
In particular, we show that our generic domain-independent approach yields state-of-the-art results in vision, natural language processing and on time series tasks.
- Score: 63.736435657236505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show that utilizing attribution maps for training neural networks can
improve regularization of models and thus increase performance. Regularization
is key in deep learning, especially when training complex models on relatively
small datasets. In order to understand the inner workings of neural networks,
attribution methods such as Layer-wise Relevance Propagation (LRP) have been
extensively studied, particularly for interpreting the relevance of input
features. We introduce Challenger, a module that leverages the explanatory
power of attribution maps in order to manipulate particularly relevant input
patterns, thereby exposing and subsequently resolving regions of ambiguity in
separating classes on the ground-truth data manifold, an issue that arises
particularly when training models on rather small datasets. Our
Challenger module increases model performance through building more diverse
filters within the network and can be applied to any input data domain. We
demonstrate that our approach results in substantially better classification as
well as calibration performance on datasets with only a few samples up to
datasets with thousands of samples. In particular, we show that our generic
domain-independent approach yields state-of-the-art results in vision, natural
language processing and on time series tasks.
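The core idea admits a minimal sketch. The code below is not the authors' implementation: it approximates attribution with gradient-times-input for a linear scorer (for linear layers, LRP reduces to a similar per-feature relevance) and "challenges" the model by masking its most relevant input features; all names and values are illustrative assumptions.

```python
# Hedged sketch of training with attribution maps: compute a per-feature
# relevance, then perturb the most relevant features so the model must
# resolve ambiguous regions rather than rely on a few dominant inputs.

def attribution(weights, x):
    """Per-feature relevance for a linear scorer: contribution w_i * x_i."""
    return [w * xi for w, xi in zip(weights, x)]

def challenge(x, relevance, k=1):
    """Zero out the k most relevant features to create a challenging sample."""
    ranked = sorted(range(len(x)), key=lambda i: abs(relevance[i]), reverse=True)
    x_new = list(x)
    for i in ranked[:k]:
        x_new[i] = 0.0
    return x_new

weights = [0.5, -2.0, 0.1]
x = [1.0, 1.0, 1.0]
rel = attribution(weights, x)          # [0.5, -2.0, 0.1]
x_challenged = challenge(x, rel, k=1)  # feature 1 is most relevant -> masked
print(x_challenged)                    # [1.0, 0.0, 1.0]
```

Training on such perturbed samples alongside the originals is one plausible way to encourage the more diverse filters the abstract describes.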
Related papers
- Hierarchical Multi-Label Classification with Missing Information for Benthic Habitat Imagery [1.6492989697868894]
We show the capacity to conduct hierarchical multi-label (HML) training in scenarios where multiple levels of annotation information are missing.
We find that, when using smaller one-hot image label datasets typical of local or regional scale benthic science projects, models pre-trained with self-supervision on a larger collection of in-domain benthic data outperform models pre-trained on ImageNet.
arXiv Detail & Related papers (2024-09-10T16:15:01Z)
- Generative Expansion of Small Datasets: An Expansive Graph Approach [13.053285552524052]
We introduce an Expansive Synthesis model generating large-scale, information-rich datasets from minimal samples.
An autoencoder with self-attention layers and optimal transport refines distributional consistency.
Results show comparable performance, demonstrating the model's potential to augment training data effectively.
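As a toy illustration of dataset expansion (not the paper's model, which uses an attention-based autoencoder with optimal-transport refinement), one simple scheme is to interpolate between samples in a latent space and treat the interpolants as synthetic data; every function and value below is an assumption for illustration.

```python
# Hedged sketch: expand a tiny set of latent codes by linear interpolation
# between consecutive pairs. A real model would decode these latents back
# to data space with a learned decoder.

def interpolate(a, b, alpha):
    """Convex combination of two latent vectors."""
    return [(1 - alpha) * x + alpha * y for x, y in zip(a, b)]

def expand(latents, n_per_pair=3):
    """Generate n_per_pair synthetic latents between every consecutive pair."""
    out = []
    for a, b in zip(latents, latents[1:]):
        for k in range(1, n_per_pair + 1):
            out.append(interpolate(a, b, k / (n_per_pair + 1)))
    return out

synthetic = expand([[0.0, 0.0], [1.0, 1.0]], n_per_pair=1)
print(synthetic)  # [[0.5, 0.5]]
```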
arXiv Detail & Related papers (2024-06-25T02:59:02Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
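The co-finetuning idea can be sketched as a batch schedule that interleaves tasks in one training run instead of pretraining and finetuning sequentially; the dataset names and round-robin order below are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: a single shared model would consume batches in this
# interleaved order, so gradient updates from upstream and downstream
# tasks are mixed throughout training.

from itertools import cycle, islice

def cofinetune_schedule(datasets, steps):
    """Round-robin over task names to produce a per-step batch schedule."""
    return list(islice(cycle(datasets), steps))

schedule = cofinetune_schedule(["upstream_A", "upstream_B", "downstream"], 6)
print(schedule)
```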
arXiv Detail & Related papers (2022-07-08T10:25:47Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
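The per-iteration class elimination can be sketched as follows; this is not the paper's code, and the label layout and "ignore" value are common-convention assumptions.

```python
# Hedged sketch: each iteration, suppress supervision for one randomly
# chosen class so features for the remaining classes cannot come to
# depend on its presence.

import random

IGNORE = -1  # label value excluded from the loss, a common convention

def drop_random_class(labels, classes, rng):
    """Replace all occurrences of one randomly chosen class with IGNORE."""
    dropped = rng.choice(classes)
    return [IGNORE if y == dropped else y for y in labels], dropped

rng = random.Random(0)
labels = [0, 1, 2, 1, 0, 2]
masked, dropped = drop_random_class(labels, [0, 1, 2], rng)
print(dropped, masked)
```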
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
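The message-passing step of the upper model admits a minimal sketch: embed an unseen feature by aggregating the embeddings of the samples it connects to in the feature-data graph. A real GNN learns this aggregation; the mean-pooling and toy values below are assumptions for illustration.

```python
# Hedged sketch: one round of mean-aggregation message passing over a
# bipartite feature-data graph (features connect to samples in which
# they appear) to extrapolate an embedding for a new feature.

def extrapolate_feature(new_feature_samples, sample_embeddings):
    """Embed a new feature as the mean of its neighboring samples' embeddings."""
    neighbors = [sample_embeddings[s] for s in new_feature_samples]
    dim = len(neighbors[0])
    return [sum(v[d] for v in neighbors) / len(neighbors) for d in range(dim)]

sample_embeddings = {"s1": [1.0, 0.0], "s2": [0.0, 1.0]}
emb = extrapolate_feature(["s1", "s2"], sample_embeddings)
print(emb)  # [0.5, 0.5]
```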
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Multimodal Prototypical Networks for Few-shot Learning [20.100480009813953]
A cross-modal feature generation framework is used to enrich the under-populated embedding space in few-shot scenarios.
We show that in such cases nearest neighbor classification is a viable approach and outperforms state-of-the-art single-modal and multimodal few-shot learning methods.
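Nearest-neighbor classification against class prototypes (mean embeddings) can be sketched in a few lines; the embeddings and class names below are toy assumptions, not the paper's data.

```python
# Hedged sketch of prototypical-network-style classification: build one
# prototype per class as the mean support embedding, then assign a query
# to the class with the closest prototype.

def prototype(embeddings):
    """Mean of a class's support embeddings."""
    dim = len(embeddings[0])
    return [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]

def classify(query, prototypes):
    """Assign the class whose prototype is nearest in squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda c: dist(query, prototypes[c]))

protos = {
    "cat": prototype([[1.0, 0.0], [0.9, 0.1]]),
    "dog": prototype([[0.0, 1.0], [0.1, 0.9]]),
}
label = classify([0.8, 0.2], protos)
print(label)  # cat
```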
arXiv Detail & Related papers (2020-11-17T19:32:59Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.