Improving Fractal Pre-training
- URL: http://arxiv.org/abs/2110.03091v1
- Date: Wed, 6 Oct 2021 22:39:51 GMT
- Title: Improving Fractal Pre-training
- Authors: Connor Anderson and Ryan Farrell
- Abstract summary: We propose an improved pre-training dataset based on dynamically-generated fractal images.
Our experiments demonstrate that fine-tuning a network pre-trained using fractals attains 92.7-98.1% of the accuracy of an ImageNet pre-trained network.
- Score: 0.76146285961466
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The deep neural networks used in modern computer vision systems require
enormous image datasets to train them. These carefully-curated datasets
typically have a million or more images, across a thousand or more distinct
categories. The process of creating and curating such a dataset is a monumental
undertaking, demanding extensive effort and labelling expense and necessitating
careful navigation of technical and social issues such as label accuracy,
copyright ownership, and content bias.
What if we had a way to harness the power of large image datasets but with
few or none of the major issues and concerns currently faced? This paper
extends the recent work of Kataoka et al. (2020), proposing an improved
pre-training dataset based on dynamically-generated fractal images. Challenging
issues with large-scale image datasets become points of elegance for fractal
pre-training: perfect label accuracy at zero cost; no need to store/transmit
large image archives; no privacy/demographic bias/concerns of inappropriate
content, as no humans are pictured; limitless supply and diversity of images;
and the images are free/open-source. Perhaps surprisingly, avoiding these
difficulties imposes only a small penalty in performance. Leveraging a
newly-proposed pre-training task -- multi-instance prediction -- our
experiments demonstrate that fine-tuning a network pre-trained using fractals
attains 92.7-98.1% of the accuracy of an ImageNet pre-trained network.
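The abstract refers to "dynamically-generated fractal images" but does not spell out the generation procedure. As a rough illustration only, the sketch below renders a fractal with the classic chaos game over a randomly sampled affine iterated function system (IFS), the family of fractals used in the FractalDB line of work this paper extends; the map count, contraction factor, iteration budget, and rendering are illustrative assumptions rather than the authors' exact pipeline.

```python
import numpy as np

def sample_ifs(n_maps=4, rng=np.random.default_rng(0)):
    """Sample a random affine IFS: a set of contractive maps x -> A @ x + b."""
    maps = []
    for _ in range(n_maps):
        A = rng.uniform(-1.0, 1.0, size=(2, 2))
        A *= 0.8 / max(np.linalg.norm(A, 2), 1e-8)  # rescale so each map contracts
        b = rng.uniform(-0.5, 0.5, size=2)
        maps.append((A, b))
    return maps

def render_fractal(ifs, n_points=100_000, size=256, rng=np.random.default_rng(1)):
    """Chaos game: iterate randomly chosen maps and rasterize the visited points."""
    x = np.zeros(2)
    pts = np.empty((n_points, 2))
    for i in range(n_points):
        A, b = ifs[rng.integers(len(ifs))]
        x = A @ x + b
        pts[i] = x
    pts -= pts.min(axis=0)                       # normalize into the image square
    pts /= pts.max(axis=0) + 1e-8
    idx = np.clip((pts * (size - 1)).astype(int), 0, size - 1)
    img = np.zeros((size, size), dtype=np.uint8)
    img[idx[:, 1], idx[:, 0]] = 255
    return img

# One sampled IFS can act as a "class"; re-rendering it (with jittered parameters
# or fresh randomness) yields an effectively unlimited stream of training images.
image = render_fractal(sample_ifs())
```

Because each image is produced on the fly from a handful of IFS parameters, nothing needs to be stored or transmitted beyond the generating code, which is what makes the "no large image archives" claim in the abstract possible.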
Related papers
- Scaling Backwards: Minimal Synthetic Pre-training? [52.78699562832907]
We show that pre-training is effective even with minimal synthetic images.
We find that substantially reducing the number of synthetic images, from 1k down to a single image, can even improve pre-training performance.
We extend our method from synthetic images to real images to see if a single real image can show similar pre-training effect.
arXiv Detail & Related papers (2024-08-01T16:20:02Z)
- Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years.
Deep learning models require large amounts of labeled data for training.
We use state-of-the-art image composition deep learning models to generate spliced images close to the quality of real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
- PromptMix: Text-to-image diffusion models enhance the performance of lightweight networks [83.08625720856445]
Deep learning tasks require annotations that are too time consuming for human operators.
In this paper, we introduce PromptMix, a method for artificially boosting the size of existing datasets.
We show that PromptMix can significantly increase the performance of lightweight networks by up to 26%.
arXiv Detail & Related papers (2023-01-30T14:15:47Z)
- Procedural Image Programs for Representation Learning [62.557911005179946]
We propose training with a large dataset of twenty-one thousand programs, each one generating a diverse set of synthetic images.
These programs are short code snippets, which are easy to modify and fast to execute.
The proposed dataset can be used for both supervised and unsupervised representation learning, and reduces the gap between pre-training with real and procedurally generated images by 38%.
arXiv Detail & Related papers (2022-11-29T17:34:22Z)
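To make the idea of short image-generating programs in the entry above concrete, here is a toy example of what such a snippet can look like: a handful of random parameters in, one synthetic image out. It is purely illustrative and is not one of the twenty-one thousand programs from the paper's dataset.

```python
import numpy as np

def toy_program(seed, size=128):
    """A toy procedural image program: a few layered random sinusoidal gratings.

    Illustrative only; the paper's programs differ, but share the shape
    "small set of random parameters -> one synthetic image".
    """
    rng = np.random.default_rng(seed)
    ys, xs = np.mgrid[0:size, 0:size] / size
    img = np.zeros((size, size, 3))
    for _ in range(rng.integers(3, 8)):                    # a few random layers
        freq = rng.uniform(2, 40)
        angle = rng.uniform(0, np.pi)
        phase = rng.uniform(0, 2 * np.pi)
        wave = np.sin(freq * (np.cos(angle) * xs + np.sin(angle) * ys) + phase)
        img += wave[..., None] * rng.uniform(0, 1, size=3)  # random per-channel weight
    img -= img.min()
    img /= img.max() + 1e-8
    return (img * 255).astype(np.uint8)

samples = [toy_program(seed) for seed in range(4)]          # same program, distinct images
```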
- Is Deep Image Prior in Need of a Good Education? [57.3399060347311]
Deep image prior was introduced as an effective prior for image reconstruction.
Despite its impressive reconstructive properties, the approach is slow when compared to learned or traditional reconstruction techniques.
We develop a two-stage learning paradigm to address the computational challenge.
arXiv Detail & Related papers (2021-11-23T15:08:26Z)
- Inferring Offensiveness In Images From Natural Language Supervision [20.294073012815854]
Large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images.
We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets.
arXiv Detail & Related papers (2021-10-08T16:19:21Z)
- See through Gradients: Image Batch Recovery via GradInversion [103.26922860665039]
We introduce GradInversion, with which input images from a large batch can be recovered even for large networks such as ResNets (50 layers).
We show that gradients encode a surprisingly large amount of information, such that all the individual images can be recovered with high fidelity via GradInversion, even for complex datasets, deep networks, and large batch sizes.
arXiv Detail & Related papers (2021-04-15T16:43:17Z)
- Leveraging Self-Supervision for Cross-Domain Crowd Counting [71.75102529797549]
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to distinguish upside-down real images from upright ones and incorporate into it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting methods without any extra computation at inference time.
arXiv Detail & Related papers (2021-03-30T12:37:55Z)
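The crowd-counting entry above leans on a simple self-supervised signal, telling upside-down images from upright ones, which needs no annotations on the target domain. Below is a minimal sketch of how such flip labels might be constructed for an auxiliary classification head; the uncertainty prediction and loss weighting mentioned in the summary are omitted, and all names are illustrative.

```python
import numpy as np

def make_flip_batch(images, rng=np.random.default_rng(0)):
    """Turn roughly half the images upside down and return binary flip labels.

    images: array of shape (N, H, W, C); no crowd annotations are required.
    Returns (augmented_images, labels) with label 1 = upside down, 0 = upright,
    suitable for an auxiliary classification head trained on target-domain data.
    """
    flipped = rng.integers(0, 2, size=len(images)).astype(bool)
    out = images.copy()
    out[flipped] = out[flipped][:, ::-1]        # reverse the height axis
    return out, flipped.astype(np.int64)

# Usage: unlabeled target-domain images provide free supervision for the flip
# head, trained jointly with the (labeled, source-domain) density estimation loss.
unlabeled = np.zeros((8, 256, 256, 3), dtype=np.float32)
x_aux, y_aux = make_flip_batch(unlabeled)
```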
- Increasing the Robustness of Semantic Segmentation Models with Painting-by-Numbers [39.95214171175713]
We build upon an insight from image classification that performance can be improved by increasing the network bias towards object shapes.
Our basic idea is to alpha-blend a portion of the RGB training images with faked images, where each class-label is given a fixed, randomly chosen color.
We demonstrate the effectiveness of our training schema for DeepLabv3+ with various network backbones, MobileNet-V2, ResNets, and Xception, and evaluate it on the Cityscapes dataset.
arXiv Detail & Related papers (2020-10-12T07:42:39Z)
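As a concrete illustration of the Painting-by-Numbers augmentation described above, the sketch below alpha-blends one training image with a "color-by-number" image in which every class label is painted in a fixed, randomly chosen color; the alpha value and palette sampling are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def painting_by_numbers(image, label_map, palette, alpha):
    """Alpha-blend an RGB image with a color-by-number rendering of its labels.

    image:     float array (H, W, 3) in [0, 1]
    label_map: int array (H, W) with one class index per pixel
    palette:   float array (n_classes, 3), a fixed random color per class
    alpha:     blend weight given to the fake color image
    """
    fake = palette[label_map]                    # (H, W, 3) color-by-number image
    return (1.0 - alpha) * image + alpha * fake

# Usage: blend a random subset of training images (labels stay unchanged) to
# push the segmentation network towards shape cues rather than texture cues.
rng = np.random.default_rng(0)
palette = rng.uniform(0.0, 1.0, size=(19, 3))    # e.g. the 19 Cityscapes classes
img = rng.random((64, 64, 3)).astype(np.float32)
lbl = rng.integers(0, 19, size=(64, 64))
augmented = painting_by_numbers(img, lbl, palette, alpha=0.5)
```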
- SlideImages: A Dataset for Educational Image Classification [8.607440622310904]
We present SlideImages, a dataset for the task of classifying educational illustrations.
We have reserved all the actual educational images as a test dataset.
We present a baseline system using a standard deep neural architecture.
arXiv Detail & Related papers (2020-01-19T13:11:55Z)