Improving Fractal Pre-training
- URL: http://arxiv.org/abs/2110.03091v1
- Date: Wed, 6 Oct 2021 22:39:51 GMT
- Title: Improving Fractal Pre-training
- Authors: Connor Anderson and Ryan Farrell
- Abstract summary: We propose an improved pre-training dataset based on dynamically-generated fractal images.
Our experiments demonstrate that fine-tuning a network pre-trained using fractals attains 92.7-98.1% of the accuracy of an ImageNet pre-trained network.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The deep neural networks used in modern computer vision systems require
enormous image datasets to train them. These carefully-curated datasets
typically have a million or more images, across a thousand or more distinct
categories. The process of creating and curating such a dataset is a monumental
undertaking, demanding extensive effort and labelling expense and necessitating
careful navigation of technical and social issues such as label accuracy,
copyright ownership, and content bias.
What if we had a way to harness the power of large image datasets but with
few or none of the major issues and concerns currently faced? This paper
extends the recent work of Kataoka et al. (2020), proposing an improved
pre-training dataset based on dynamically-generated fractal images. Challenging
issues with large-scale image datasets become points of elegance for fractal
pre-training: perfect label accuracy at zero cost; no need to store/transmit
large image archives; no privacy/demographic bias/concerns of inappropriate
content, as no humans are pictured; limitless supply and diversity of images;
and the images are free/open-source. Perhaps surprisingly, avoiding these
difficulties imposes only a small penalty in performance. Leveraging a
newly-proposed pre-training task -- multi-instance prediction -- our
experiments demonstrate that fine-tuning a network pre-trained using fractals
attains 92.7-98.1% of the accuracy of an ImageNet pre-trained network.
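The abstract refers to "dynamically-generated fractal images" but does not spell out the generation procedure. As a rough illustration only, the sketch below renders a fractal with the classic chaos game over a randomly sampled affine iterated function system (IFS), the family of fractals used in the FractalDB line of work this paper extends; the map count, contraction factor, iteration budget, and rendering are illustrative assumptions rather than the authors' exact pipeline.

```python
import numpy as np

def sample_ifs(n_maps=4, rng=np.random.default_rng(0)):
    """Sample a random affine IFS: a set of contractive maps x -> A @ x + b."""
    maps = []
    for _ in range(n_maps):
        A = rng.uniform(-1.0, 1.0, size=(2, 2))
        A *= 0.8 / max(np.linalg.norm(A, 2), 1e-8)  # rescale so each map contracts
        b = rng.uniform(-0.5, 0.5, size=2)
        maps.append((A, b))
    return maps

def render_fractal(ifs, n_points=100_000, size=256, rng=np.random.default_rng(1)):
    """Chaos game: iterate randomly chosen maps and rasterize the visited points."""
    x = np.zeros(2)
    pts = np.empty((n_points, 2))
    for i in range(n_points):
        A, b = ifs[rng.integers(len(ifs))]
        x = A @ x + b
        pts[i] = x
    pts -= pts.min(axis=0)                       # normalize into the image square
    pts /= pts.max(axis=0) + 1e-8
    idx = np.clip((pts * (size - 1)).astype(int), 0, size - 1)
    img = np.zeros((size, size), dtype=np.uint8)
    img[idx[:, 1], idx[:, 0]] = 255
    return img

# One sampled IFS can act as a "class"; re-rendering it (with jittered parameters
# or fresh randomness) yields an effectively unlimited stream of training images.
image = render_fractal(sample_ifs())
```

Because each image is produced on the fly from a handful of IFS parameters, nothing needs to be stored or transmitted beyond the generating code, which is what makes the "no large image archives" claim in the abstract possible.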
Related papers
- Scaling Backwards: Minimal Synthetic Pre-training? [52.78699562832907]
We show that pre-training is effective even with minimal synthetic images.
We find that substantially reducing the number of synthetic images, from 1k down to a single image, can even improve pre-training performance.
We extend our method from synthetic images to real images to see if a single real image can show similar pre-training effect.
arXiv Detail & Related papers (2024-08-01T16:20:02Z)
- Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years.
Deep learning models require large amounts of labeled data for training.
We use state-of-the-art image composition deep learning models to generate spliced images close to the quality of real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
- PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks [83.08625720856445]
Deep learning tasks require annotations that are too time consuming for human operators.
In this paper, we introduce PromptMix, a method for artificially boosting the size of existing datasets.
We show that PromptMix can significantly increase the performance of lightweight networks by up to 26%.
arXiv Detail & Related papers (2023-01-30T14:15:47Z)
- Procedural Image Programs for Representation Learning [62.557911005179946]
We propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images.
These programs are short code snippets, which are easy to modify and fast to execute.
The proposed dataset can be used for both supervised and unsupervised representation learning, and reduces the gap between pre-training with real and procedurally generated images by 38%.
arXiv Detail & Related papers (2022-11-29T17:34:22Z)
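To make the idea of short image-generating programs in the entry above concrete, here is a toy example of what such a snippet can look like: a handful of random parameters in, one synthetic image out. It is purely illustrative and is not one of the twenty-one thousand programs from the paper's dataset.

```python
import numpy as np

def toy_program(seed, size=128):
    """A toy procedural image program: a few layered random sinusoidal gratings.

    Illustrative only; the paper's programs differ, but share the shape
    "small set of random parameters -> one synthetic image".
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:size, 0:size] / size
    img = np.zeros((size, size, 3))
    for _ in range(rng.integers(3, 8)):                    # a few random layers
        freq = rng.uniform(2, 40)
        angle = rng.uniform(0, np.pi)
        phase = rng.uniform(0, 2 * np.pi)
        wave = np.sin(freq * (np.cos(angle) * xs + np.sin(angle) * ys) + phase)
        img += wave[..., None] * rng.uniform(0, 1, size=3)  # random per-channel weight
    img -= img.min()
    img /= img.max() + 1e-8
    return (img * 255).astype(np.uint8)

samples = [toy_program(seed) for seed in range(4)]          # same program, distinct images
```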
- Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z)
- Inferring Offensiveness In Images From Natural Language Supervision [20.294073012815854]
Large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images.
We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets.
arXiv Detail & Related papers (2021-10-08T16:19:21Z)
- See through Gradients: Image Batch Recovery via GradInversion [103.26922860665039]
We introduce GradInversion, with which input images from a large batch can be recovered even for large networks such as ResNets (50 layers).
We show that gradients encode a surprisingly large amount of information, such that all the individual images can be recovered with high fidelity via GradInversion, even for complex datasets, deep networks, and large batch sizes.
arXiv Detail & Related papers (2021-04-15T16:43:17Z)
- Leveraging Self-Supervision for Cross-Domain Crowd Counting [71.75102529797549]
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to distinguish upside-down real images from upright ones and incorporate into it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting methods without any extra computation at inference time.
arXiv Detail & Related papers (2021-03-30T12:37:55Z)
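The crowd-counting entry above leans on a simple self-supervised signal, telling upside-down images from upright ones, which needs no annotations on the target domain. Below is a minimal sketch of how such flip labels might be constructed for an auxiliary classification head; the uncertainty prediction and loss weighting mentioned in the summary are omitted, and all names are illustrative.

```python
import numpy as np

def make_flip_batch(images, rng=np.random.default_rng(0)):
    """Turn roughly half the images upside down and return binary flip labels.

    images: array of shape (N, H, W, C); no crowd annotations are required.
    Returns (augmented_images, labels) with label 1 = upside down, 0 = upright,
    suitable for an auxiliary classification head trained on target-domain data.
    """
    flipped = rng.integers(0, 2, size=len(images)).astype(bool)
    out = images.copy()
    out[flipped] = out[flipped][:, ::-1]        # reverse the height axis
    return out, flipped.astype(np.int64)

# Usage: unlabeled target-domain images provide free supervision for the flip
# head, trained jointly with the (labeled, source-domain) density estimation loss.
unlabeled = np.zeros((8, 256, 256, 3), dtype=np.float32)
x_aux, y_aux = make_flip_batch(unlabeled)
```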
- Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers [39.95214171175713]
We build upon an insight from image classification that performance can be improved by increasing the network bias towards object shapes.
Our basic idea is to alpha-blend a portion of the RGB training images with faked images, where each class-label is given a fixed, randomly chosen color.
We demonstrate the effectiveness of our training schema for DeepLabv3+ with various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it on the Cityscapes dataset.
arXiv Detail & Related papers (2020-10-12T07:42:39Z)
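As a concrete illustration of the Painting-by-Numbers augmentation described above, the sketch below alpha-blends one training image with a "color-by-number" image in which every class label is painted in a fixed, randomly chosen color; the alpha value and palette sampling are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def painting_by_numbers(image, label_map, palette, alpha):
    """Alpha-blend an RGB image with a color-by-number rendering of its labels.

    image:     float array (H, W, 3) in [0, 1]
    label_map: int array (H, W) with one class index per pixel
    palette:   float array (n_classes, 3), a fixed random color per class
    alpha:     blend weight given to the fake color image
    """
    fake = palette[label_map]                    # (H, W, 3) color-by-number image
    return (1.0 - alpha) * image + alpha * fake

# Usage: blend a random subset of training images (labels stay unchanged) to
# push the segmentation network towards shape cues rather than texture cues.
rng = np.random.default_rng(0)
palette = rng.uniform(0.0, 1.0, size=(19, 3))    # e.g. the 19 Cityscapes classes
img = rng.random((64, 64, 3)).astype(np.float32)
lbl = rng.integers(0, 19, size=(64, 64))
augmented = painting_by_numbers(img, lbl, palette, alpha=0.5)
```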
- SlideImages: A Dataset for Educational Image Classification [8.607440622310904]
We present SlideImages, a dataset for the task of classifying educational illustrations.
We have reserved all the actual educational images as a test dataset.
We present a baseline system using a standard deep neural architecture.
arXiv Detail & Related papers (2020-01-19T13:11:55Z)