Energy-Inspired Self-Supervised Pretraining for Vision Models
- URL: http://arxiv.org/abs/2302.01384v1
- Date: Thu, 2 Feb 2023 19:41:00 GMT
- Title: Energy-Inspired Self-Supervised Pretraining for Vision Models
- Authors: Ze Wang, Jiang Wang, Zicheng Liu, and Qiang Qiu
- Abstract summary: We introduce a self-supervised vision model pretraining framework inspired by energy-based models (EBMs).
In the proposed framework, we model energy estimation and data restoration as the forward and backward passes of a single network.
We show the proposed method delivers comparable or even better performance with remarkably fewer training epochs.
- Score: 36.70550531181131
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the fact that forward and backward passes of a deep network
naturally form symmetric mappings between input and output representations, we
introduce a simple yet effective self-supervised vision model pretraining
framework inspired by energy-based models (EBMs). In the proposed framework, we
model energy estimation and data restoration as the forward and backward passes
of a single network without any auxiliary components, e.g., an extra decoder.
For the forward pass, we fit a network to an energy function that assigns low
energy scores to samples that belong to an unlabeled dataset, and high energy
otherwise. For the backward pass, we restore data from corrupted versions
iteratively using gradient-based optimization along the direction of energy
minimization. In this way, we naturally fold the encoder-decoder architecture
widely used in masked image modeling into the forward and backward passes of a
single vision model. Thus, our framework now accepts a wide range of pretext
tasks with different data corruption methods, and permits models to be
pretrained from masked image modeling, patch sorting, and image restoration,
including super-resolution, denoising, and colorization. We support our
findings with extensive experiments, and show the proposed method delivers
comparable or even better performance with remarkably fewer training epochs
than state-of-the-art self-supervised vision model pretraining
methods. Our findings shed light on further exploring self-supervised vision
model pretraining and pretext tasks beyond masked image modeling.
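The backward pass described above restores a corrupted sample by gradient descent on the learned energy. As a minimal sketch of that idea, the toy example below uses a hand-written quadratic energy E(x) = 0.5 * ||x - mu||^2 (an assumption standing in for the trained energy network, not the paper's model) and iteratively moves a corrupted sample along the direction of energy minimization:

```python
import numpy as np

# Toy stand-in for the paper's learned energy network (forward pass):
# a quadratic energy that assigns low scores near the "data" point mu.
def energy(x, mu):
    return 0.5 * np.sum((x - mu) ** 2)

def energy_grad(x, mu):
    # Analytic gradient of the quadratic energy above.
    return x - mu

def restore(x_corrupted, mu, steps=100, lr=0.1):
    """Backward pass: iteratively descend the energy landscape,
    pulling the corrupted sample back toward low-energy regions."""
    x = x_corrupted.copy()
    for _ in range(steps):
        x -= lr * energy_grad(x, mu)
    return x

mu = np.array([1.0, 2.0, 3.0])             # stand-in for clean data
x_noisy = mu + np.array([0.5, -0.4, 0.3])  # corrupted version
x_restored = restore(x_noisy, mu)
```

In the actual framework the analytic gradient is replaced by backpropagation through the energy network itself, which is what folds the usual encoder-decoder pair into a single model.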
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z) - JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement [69.6035373784027]
Low-light image enhancement (LLIE) has achieved promising performance by employing conditional diffusion models.
Previous methods may neglect the importance of a sufficiently formulated task-specific conditioning strategy.
We propose JoReS-Diff, a novel approach that incorporates Retinex- and semantic-based priors as the additional pre-processing condition.
arXiv Detail & Related papers (2023-12-20T08:05:57Z) - Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative diffusion-model designs for both standard image restoration (IR) and blind/real-world IR.
We propose five potential and challenging directions for the future research of diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z) - A Unified Conditional Framework for Diffusion-based Image Restoration [39.418415473235235]
We present a unified conditional framework based on diffusion models for image restoration.
We leverage a lightweight UNet to predict initial guidance and the diffusion model to learn the residual of the guidance.
To handle high-resolution images, we propose a simple yet effective inter-step patch-splitting strategy.
arXiv Detail & Related papers (2023-05-31T17:22:24Z) - DELAD: Deep Landweber-guided deconvolution with Hessian and sparse prior [0.22940141855172028]
We present a model for non-blind image deconvolution that incorporates the classic iterative method into a deep learning application.
We build our network based on the iterative Landweber deconvolution algorithm, which is integrated with trainable convolutional layers to enhance the recovered image structures and details.
arXiv Detail & Related papers (2022-09-30T11:15:03Z) - Top-KAST: Top-K Always Sparse Training [50.05611544535801]
We propose Top-KAST, a method that preserves constant sparsity throughout training.
We show that it performs comparably to or better than previous works when training models on the established ImageNet benchmark.
In addition to our ImageNet results, we also demonstrate our approach in the domain of language modeling.
arXiv Detail & Related papers (2021-06-07T11:13:05Z) - Pre-Trained Image Processing Transformer [95.93031793337613]
We develop a new pre-trained model, namely, the image processing transformer (IPT).
We propose utilizing the well-known ImageNet benchmark to generate a large number of corrupted image pairs.
The IPT model is trained on these images with multiple heads and tails.
arXiv Detail & Related papers (2020-12-01T09:42:46Z) - A Generative Model for Generic Light Field Reconstruction [15.394019131959096]
We present for the first time a generative model for 4D light field patches using variational autoencoders.
We develop a generative model conditioned on the central view of the light field and incorporate this as a prior in an energy minimization framework.
Our proposed method demonstrates good reconstruction, with performance approaching end-to-end trained networks.
arXiv Detail & Related papers (2020-05-13T18:27:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.