A Deep Drift-Diffusion Model for Image Aesthetic Score Distribution
Prediction
- URL: http://arxiv.org/abs/2010.07661v1
- Date: Thu, 15 Oct 2020 11:01:46 GMT
- Title: A Deep Drift-Diffusion Model for Image Aesthetic Score Distribution
Prediction
- Authors: Xin Jin, Xiqiao Li, Heng Huang, Xiaodong Li, and Xinghui Zhou
- Abstract summary: We propose a Deep Drift-Diffusion model inspired by psychologists to predict aesthetic score distribution from images.
The DDD model can describe the psychological process of aesthetic perception instead of traditional modeling of the results of assessment.
Our novel DDD model is simple but efficient, and it outperforms the state-of-the-art methods in aesthetic score distribution prediction.
- Score: 68.76594695163386
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of aesthetic quality assessment is complicated due to its
subjectivity. In recent years, the target representation of image aesthetic
quality has changed from a one-dimensional binary classification label or
numerical score to a multi-dimensional score distribution. Current methods
regress the ground truth score distributions directly. However, they do not
take the subjectivity of aesthetics into account; that is, they ignore the
psychological processes of human beings, which limits performance on the
task. In this paper, we
propose a Deep Drift-Diffusion (DDD) model inspired by psychologists to predict
aesthetic score distribution from images. The DDD model can describe the
psychological process of aesthetic perception instead of traditional modeling
of the results of assessment. We use deep convolutional neural networks to
regress the parameters of the drift-diffusion model. The experimental results
on large-scale aesthetic image datasets reveal that our novel DDD model is
simple but efficient, and that it outperforms the state-of-the-art methods in
aesthetic score distribution prediction. Besides, different psychological
processes can also be predicted by our model.
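The abstract does not give the model equations, so the following is only a generic sketch of a textbook two-boundary drift-diffusion process, simulated with Euler-Maruyama steps. The hand-picked values of `drift`, `noise`, and `boundary` stand in for the parameters that, according to the abstract, a CNN would regress from an image. The function names (`simulate_ddm`, `times_to_histogram`) are hypothetical, and binning decision times into a normalized histogram is an illustrative assumption, not necessarily how the paper maps the process to a score distribution.

```python
import numpy as np


def simulate_ddm(drift, noise, boundary, start=0.0, dt=0.01,
                 max_steps=2000, n_trials=5000, seed=0):
    """Simulate a generic two-boundary drift-diffusion process.

    Evidence x starts at `start` and evolves as
        x += drift * dt + noise * sqrt(dt) * N(0, 1)
    until it crosses +boundary or -boundary.
    Returns (times, choices): the decision time of each trial (NaN if no
    boundary was reached within max_steps) and the boundary that was hit
    (+1 upper, -1 lower, 0 undecided).
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, start, dtype=float)
    times = np.full(n_trials, np.nan)
    choices = np.zeros(n_trials, dtype=int)
    active = np.ones(n_trials, dtype=bool)

    for step in range(1, max_steps + 1):
        n_active = int(active.sum())
        if n_active == 0:
            break
        # Euler-Maruyama update, applied only to trials still undecided.
        x[active] += (drift * dt
                      + noise * np.sqrt(dt) * rng.standard_normal(n_active))
        hit_upper = active & (x >= boundary)
        hit_lower = active & (x <= -boundary)
        finished = hit_upper | hit_lower
        times[finished] = step * dt
        choices[hit_upper] = 1
        choices[hit_lower] = -1
        active &= ~finished
    return times, choices


def times_to_histogram(times, n_bins=10, t_max=None):
    """Bin finished decision times into a normalized histogram.

    Assumption for illustration only: the bins play the role of
    discrete score levels.
    """
    finished = times[~np.isnan(times)]
    if t_max is None:
        t_max = finished.max()
    counts, _ = np.histogram(finished, bins=n_bins, range=(0.0, t_max))
    return counts / counts.sum()


# Hand-picked parameters; in a DDD-style model these would be network outputs.
times, choices = simulate_ddm(drift=1.0, noise=1.0, boundary=1.0)
hist = times_to_histogram(times, n_bins=10)
print("fraction hitting upper boundary:", np.mean(choices == 1))
print("normalized decision-time histogram:", np.round(hist, 3))
```

Changing `drift` shifts how quickly and how often the upper boundary is reached, while `boundary` controls how much evidence is needed before a decision. This gives an intuition for how a small set of process parameters can shape a whole distribution.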
Related papers
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
- DepthART: Monocular Depth Estimation as Autoregressive Refinement Task [2.3884184860468136]
We introduce the first autoregressive depth estimation model based on the visual autoregressive transformer.
Our primary contribution is DepthART, a novel training method formulated as Depth Autoregressive Refinement Task.
Our experiments demonstrate that the proposed training approach significantly outperforms visual autoregressive modeling via next-scale prediction in the depth estimation task.
arXiv Detail & Related papers (2024-09-23T13:36:34Z)
- pAE: An Efficient Autoencoder Architecture for Modeling the Lateral Geniculate Nucleus by Integrating Feedforward and Feedback Streams in Human Visual System [0.716879432974126]
We introduce a deep convolutional model that closely approximates human visual information processing.
We aim to approximate the function for the lateral geniculate nucleus (LGN) area using a trained shallow convolutional model.
The pAE model achieves a final prediction performance of 99.26% and demonstrates a notable improvement of around 28% over human results in the temporal mode.
arXiv Detail & Related papers (2024-09-20T16:33:01Z)
- Opinion-Unaware Blind Image Quality Assessment using Multi-Scale Deep Feature Statistics [54.08757792080732]
We propose integrating deep features from pre-trained visual models with a statistical analysis model to achieve opinion-unaware BIQA (OU-BIQA).
Our proposed model exhibits superior consistency with human visual perception compared to state-of-the-art BIQA models.
arXiv Detail & Related papers (2024-05-29T06:09:34Z)
- Diffusion Models for Image Restoration and Enhancement -- A Comprehensive Survey [96.99328714941657]
We present a comprehensive review of recent diffusion model-based methods on image restoration.
We classify and emphasize the innovative designs using diffusion models for both IR and blind/real-world IR.
We propose five potential and challenging directions for future research on diffusion model-based IR.
arXiv Detail & Related papers (2023-08-18T08:40:38Z)
- Distinguishing representational geometries with controversial stimuli: Bayesian experimental design and its application to face dissimilarity judgments [0.5735035463793008]
We show that a neural network trained to invert a 3D-face-model graphics is more human-aligned than the same architecture trained on identification, classification, or autoencoding.
arXiv Detail & Related papers (2022-11-28T04:17:35Z)
- Improving Fairness in Image Classification via Sketching [14.154930352612926]
Deep neural networks (DNNs) tend to make unfair predictions when the training data are collected from different sub-populations.
We propose to use sketching to handle this phenomenon.
We evaluate our method through extensive experiments on both a general scene dataset and a medical scene dataset.
arXiv Detail & Related papers (2022-10-31T22:26:32Z)
- Modeling, Quantifying, and Predicting Subjectivity of Image Aesthetics [21.46956783120668]
We propose a novel unified probabilistic framework that can model and quantify subjective aesthetic preference based on subjective logic.
In this framework, the rating distribution is modeled as a beta distribution, from which the probabilities of being definitely pleasing, being definitely unpleasing, and being uncertain can be obtained.
We present a method to learn deep neural networks for prediction of image aesthetics, which is shown to be effective in improving the performance of subjectivity prediction via experiments.
arXiv Detail & Related papers (2022-08-20T12:16:45Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on the key insight that the rectified results of distorted images of the same scene taken with different lenses should be the same.
We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters.
Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T08:23:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.