When less is more: Simplifying inputs aids neural network understanding
- URL: http://arxiv.org/abs/2201.05610v1
- Date: Fri, 14 Jan 2022 18:58:36 GMT
- Title: When less is more: Simplifying inputs aids neural network understanding
- Authors: Robin Tibor Schirrmeister, Rosanne Liu, Sara Hooker, Tonio Ball
- Abstract summary: In this work, we measure simplicity with the encoding bit size given by a pretrained generative model.
We investigate the effect of such simplification in several scenarios: conventional training, dataset condensation and post-hoc explanations.
- Score: 12.73748893809092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How do neural network image classifiers respond to simpler and simpler
inputs? And what do such responses reveal about the learning process? To answer
these questions, we need a clear measure of input simplicity (or inversely,
complexity), an optimization objective that correlates with simplification, and
a framework to incorporate such an objective into training and inference. Lastly,
we need a variety of testbeds on which to experiment and evaluate the impact of such
simplification on learning. In this work, we measure simplicity with the
encoding bit size given by a pretrained generative model, and minimize the bit
size to simplify inputs in training and inference. We investigate the effect of
such simplification in several scenarios: conventional training, dataset
condensation and post-hoc explanations. In all settings, inputs are simplified
along with the original classification task, and we investigate the trade-off
between input simplicity and task performance. For images with injected
distractors, such simplification naturally removes superfluous information. For
dataset condensation, we find that inputs can be simplified with almost no
accuracy degradation. When used in post-hoc explanation, our learning-based
simplification approach offers a valuable new tool to explore the basis of
network decisions.
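As a rough illustration of the trade-off described in the abstract, the sketch below jointly minimizes a classification loss and the encoding bit size of a simplified copy of each input. This is a minimal sketch, not the authors' released implementation: `classifier` stands for any image classifier returning logits, `bits_per_dim` stands in for the encoding cost (negative log-likelihood in bits per dimension) assigned by a pretrained generative model, and the weighting `lam`, step count, and per-image optimization loop are illustrative choices.
```python
import torch
import torch.nn.functional as F

def simplify_inputs(x, y, classifier, bits_per_dim, lam=0.1, steps=100, lr=0.05):
    """Optimize a simplified copy of `x` that remains classifiable as `y`
    while costing fewer bits under a pretrained generative model."""
    x_simple = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_simple], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        task_loss = F.cross_entropy(classifier(x_simple), y)
        # Encoding bit size is the simplicity measure; `lam` trades off
        # input simplicity against task performance.
        simplicity_loss = bits_per_dim(x_simple).mean()
        (task_loss + lam * simplicity_loss).backward()
        opt.step()
    return x_simple.detach()
```
Under these assumptions, such a routine could be run per batch during training or applied to individual inputs of a trained classifier for post-hoc explanation.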
Related papers
- Pretraining with Random Noise for Fast and Robust Learning without Weight Transport [6.916179672407521]
We show that pretraining neural networks with random noise increases learning efficiency as well as generalization ability without weight transport.
Sequential training with both random noise and data brings the weights closer to the synaptic feedback than training solely with data does.
This pre-regularization allows the network to learn simple solutions of a low rank, reducing the generalization loss during subsequent training.
arXiv Detail & Related papers (2024-05-27T00:12:51Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- Is Cross-modal Information Retrieval Possible without Training? [4.616703548353372]
We take a simple mapping computed via least squares and the singular value decomposition (SVD) as a solution to the Procrustes problem.
That is, given information in one modality such as text, the mapping helps us locate a semantically equivalent data item in another modality such as an image.
Using off-the-shelf pretrained deep learning models, we have experimented with this simple cross-modal mapping on text-to-image and image-to-text retrieval tasks (see the sketch after this list).
arXiv Detail & Related papers (2023-04-20T02:36:18Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging: a large number of weights must be fine-tuned, and as a result the encoders forget information about previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes (a toy sketch appears after this list).
arXiv Detail & Related papers (2022-07-22T17:52:30Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to equip deep neural networks with the ability to learn sample relationships from each mini-batch.
BatchFormer is applied along the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Towards Sample-efficient Overparameterized Meta-learning [37.676063120293044]
An overarching goal in machine learning is to build a generalizable model with few samples.
This paper aims to demystify overparameterization for meta-learning.
We show that learning the optimal representation coincides with the problem of designing a task-aware regularization.
arXiv Detail & Related papers (2022-01-16T21:57:17Z)
- Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning.
arXiv Detail & Related papers (2021-07-19T13:57:13Z)
- Robust Cell-Load Learning with a Small Sample Set [35.07023055409166]
Learning of the cell-load in radio access networks (RANs) has to be performed within a short time period.
We propose a learning framework that is robust against uncertainties resulting from the need for learning based on a relatively small training sample set.
arXiv Detail & Related papers (2021-03-21T19:17:01Z)
- Predicting What You Already Know Helps: Provable Self-Supervised Learning [60.27658820909876]
Self-supervised representation learning solves auxiliary prediction tasks (known as pretext tasks) without requiring labeled data.
We show a mechanism exploiting the statistical connections between certain reconstruction-based pretext tasks that guarantee learning a good representation.
We prove that the linear layer yields a small approximation error even for complex ground-truth function classes.
arXiv Detail & Related papers (2020-08-03T17:56:13Z)
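For "Is Cross-modal Information Retrieval Possible without Training?", here is a hedged sketch of the kind of training-free mapping its summary describes: solve the orthogonal Procrustes problem between paired text and image embeddings with a single SVD, then retrieve by nearest neighbour in the mapped space. The embedding variables and the cosine-similarity retrieval step are illustrative assumptions, not necessarily the paper's exact setup.
```python
import numpy as np

def procrustes_map(text_emb, image_emb):
    """Orthogonal W minimizing ||text_emb @ W - image_emb||_F over paired rows."""
    u, _, vt = np.linalg.svd(text_emb.T @ image_emb, full_matrices=False)
    return u @ vt

def retrieve_image(query_text_emb, W, image_emb):
    """Index of the image embedding most similar to the mapped text query."""
    mapped = query_text_emb @ W
    sims = image_emb @ mapped / (
        np.linalg.norm(image_emb, axis=1) * np.linalg.norm(mapped) + 1e-8)
    return int(np.argmax(sims))
```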
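For "Discrete Key-Value Bottleneck", the following toy sketch illustrates the mechanism the summary names: encoder features are snapped to the nearest learnable key and the paired value vector is passed downstream. The class name, shapes, and single-head design are simplifying assumptions rather than the paper's exact architecture.
```python
import torch
import torch.nn as nn

class DiscreteKeyValueBottleneck(nn.Module):
    """Toy single-head bottleneck: quantize each feature vector to its
    nearest learnable key and emit the paired learnable value."""
    def __init__(self, dim, num_codes):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(num_codes, dim))
        self.values = nn.Parameter(torch.randn(num_codes, dim))

    def forward(self, z):                  # z: (batch, dim) encoder features
        dists = torch.cdist(z, self.keys)  # (batch, num_codes) distances
        idx = dists.argmin(dim=1)          # nearest key per example
        return self.values[idx]            # paired values go to the task head
```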
This list is automatically generated from the titles and abstracts of the papers on this site.