Depth and Representation in Vision Models
- URL: http://arxiv.org/abs/2211.06496v1
- Date: Fri, 11 Nov 2022 22:16:40 GMT
- Title: Depth and Representation in Vision Models
- Authors: Benjamin L. Badger
- Abstract summary: We find that the deeper the layer, the less accurate that layer's representation of the input is before training.
This work provides support for the theory that the tasks of image recognition and input generation are inseparable even for models trained exclusively to classify.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models develop successive representations of their input in
sequential layers, the last of which maps the final representation to the
output. Here we investigate the informational content of these representations
by observing the ability of convolutional image classification models to
autoencode the model's input using embeddings existing in various layers. We
find that the deeper the layer, the less accurate that layer's representation
of the input is before training. Inaccurate representation results from
non-uniqueness in which various distinct inputs give approximately the same
embedding. Non-unique representation is a consequence of both exact and
approximate non-invertibility of transformations present in the forward pass.
Learning to classify natural images leads to an increase in representation
clarity for early but not late layers, which instead form abstract images.
Rather than simply selecting for features present in the input necessary for
classification, deep layer representations are found to transform the input so
that it matches representations of the training data such that arbitrary inputs
are mapped to manifolds learned during training. This work provides support for
the theory that the tasks of image recognition and input generation are
inseparable even for models trained exclusively to classify.
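The non-uniqueness described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method or code: it assumes a single hypothetical linear + ReLU "layer" (`embed`) instead of a convolutional model, and it assumes gradient descent on the input (`reconstruct`) as the way to recover an input from an embedding. All names, the matrix `W`, and the starting point are made up for illustration.

```python
# Toy illustration only: a single linear + ReLU "layer", not a convolutional model.

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def embed(W, x):
    """Hypothetical layer: linear map followed by ReLU."""
    return [max(0.0, p) for p in matvec(W, x)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def reconstruct(W, target, x0, steps=200, lr=0.1):
    """Gradient descent on the input so that embed(W, x) approaches target."""
    x = list(x0)
    for _ in range(steps):
        pre = matvec(W, x)
        # Gradient of the squared error w.r.t. the pre-activations;
        # ReLU passes gradient only where the pre-activation is positive.
        g = [2.0 * (max(0.0, p) - t) if p > 0 else 0.0 for p, t in zip(pre, target)]
        # Chain rule through the linear map: grad_x = W^T g.
        grad = [sum(W[i][j] * g[i] for i in range(len(W))) for j in range(len(x))]
        x = [xj - lr * gj for xj, gj in zip(x, grad)]
    return x

# A 3 -> 2 map: fewer outputs than inputs, so it cannot be inverted uniquely.
W = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]

original = [0.5, 0.2, 0.3]
target = embed(W, original)
recovered = reconstruct(W, target, x0=[0.1, 0.1, 0.1])

print("embedding error:", sq_dist(embed(W, recovered), target))
print("input error:    ", sq_dist(recovered, original))

# Exact non-invertibility from ReLU: distinct inputs collapse to the same zero vector.
print(embed(W, [-1.0, -1.0, -1.0]) == embed(W, [-2.0, -3.0, -1.0]))
```

In this toy setting the recovered input should reproduce the target embedding almost exactly while remaining clearly different from the original input, which is the sense in which an embedding can be an inaccurate representation of its input.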
Related papers
- Data Attribution for Text-to-Image Models by Unlearning Synthesized Images [71.23012718682634]
The goal of data attribution for text-to-image models is to identify the training images that most influence the generation of a new image.
We propose a new approach that efficiently identifies highly-influential images.
(2024-06-13T17:59:44Z)
- Premonition: Using Generative Models to Preempt Future Data Changes in Continual Learning [63.850451635362425]
Continual learning requires a model to adapt to ongoing changes in the data distribution.
We show that the combination of a large language model and an image generation model can similarly provide useful premonitions.
We find that the backbone of our pre-trained networks can learn representations useful for the downstream continual learning problem.
(2024-03-12T06:29:54Z)
- Discriminative Class Tokens for Text-to-Image Diffusion Models [107.98436819341592]
We propose a non-invasive fine-tuning technique that capitalizes on the expressive potential of free-form text.
Our method is fast compared to prior fine-tuning methods and does not require a collection of in-class images.
We evaluate our method extensively, showing that the generated images (i) are more accurate and of higher quality than those of standard diffusion models, (ii) can be used to augment training data in a low-resource setting, and (iii) reveal information about the data used to train the guiding classifier.
(2023-03-30T05:25:20Z)
- Explaining Image Classifiers Using Contrastive Counterfactuals in Generative Latent Spaces [12.514483749037998]
We introduce a novel method to generate causal and yet interpretable counterfactual explanations for image classifiers.
We use this framework to obtain contrastive and causal sufficiency and necessity scores as global explanations for black-box classifiers.
(2022-06-10T17:54:46Z)
- Robust Training Using Natural Transformation [19.455666609149567]
We present NaTra, an adversarial training scheme to improve robustness of image classification algorithms.
We target attributes of the input images that are independent of the class identification, and manipulate those attributes to mimic real-world natural transformations.
We demonstrate the efficacy of our scheme by utilizing the disentangled latent representations derived from well-trained GANs.
(2021-05-10T01:56:03Z)
- Understanding invariance via feedforward inversion of discriminatively trained classifiers [30.23199531528357]
Past research has discovered that some extraneous visual detail remains in the output logits.
We develop a feedforward inversion model that produces remarkably high fidelity reconstructions.
Our approach is based on BigGAN, with conditioning on logits instead of one-hot class labels.
(2021-03-15T17:56:06Z)
- Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
(2020-07-31T06:11:06Z)
- Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases [34.02639091680309]
Recent gains in performance come from training instance classification models, treating each image and its augmented versions as samples of a single class.
We demonstrate that approaches like MOCO and PIRL learn occlusion-invariant representations.
Second, we demonstrate that these approaches obtain further gains from access to a clean object-centric training dataset like Imagenet.
(2020-07-28T00:11:31Z)
- Autoregressive Unsupervised Image Segmentation [8.894935073145252]
We propose a new unsupervised image segmentation approach based on mutual information between different views constructed of the inputs.
The proposed method outperforms current state-of-the-art on unsupervised image segmentation.
(2020-07-16T10:47:40Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
(2020-04-14T16:29:42Z)
- Memory-Efficient Incremental Learning Through Feature Adaptation [71.1449769528535]
We introduce an approach for incremental learning that preserves feature descriptors of training images from previously learned classes.
Keeping the much lower-dimensional feature embeddings of images reduces the memory footprint significantly.
Experimental results show that our method achieves state-of-the-art classification accuracy in incremental learning benchmarks.
(2020-04-01T21:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.