Aesthetic Attributes Assessment of Images with AMANv2 and DPC-CaptionsV2
- URL: http://arxiv.org/abs/2208.04522v1
- Date: Tue, 9 Aug 2022 03:20:59 GMT
- Title: Aesthetic Attributes Assessment of Images with AMANv2 and DPC-CaptionsV2
- Authors: Xinghui Zhou, Xin Jin, Jianwen Lv, Heng Huang, Ming Mao, Shuai Cui
- Abstract summary: We construct a novel dataset, named DPC-CaptionsV2, in a semi-automatic way.
Images of DPC-CaptionsV2 contain comments on up to 4 aesthetic attributes: composition, lighting, color, and subject.
Our method can predict the comments on 4 aesthetic attributes, which are closer to aesthetic topics than those produced by the previous AMAN model.
- Score: 65.5524793975387
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image aesthetic quality assessment has been popular during the last
decade. Besides numerical assessment, natural language assessment (aesthetic
captioning) has been proposed to describe the general aesthetic impression of
an image. In this paper, we propose aesthetic attribute assessment, i.e.,
aesthetic attribute captioning: assessing aesthetic attributes such as
composition, lighting usage, and color arrangement. Labeling comments on
aesthetic attributes is a non-trivial task, which limits the scale of the
corresponding datasets. We construct a novel dataset, named DPC-CaptionsV2, in
a semi-automatic way: the knowledge is transferred from a small-scale dataset
with full annotations to large-scale professional comments from a photography
website. Images of DPC-CaptionsV2 contain comments on up to 4 aesthetic
attributes: composition, lighting, color, and subject. Then, we propose a new
version of the Aesthetic Multi-Attributes Network (AMANv2), based on the BUTD
model and the VLPSA model. AMANv2 fuses features of a mixture of the
small-scale PCCD dataset with full annotations and the large-scale
DPC-CaptionsV2 dataset. The experimental results on DPC-CaptionsV2 show that
our method can predict comments on the 4 aesthetic attributes that are closer
to aesthetic topics than those produced by the previous AMAN model. Under the
evaluation criteria of image captioning, the specially designed AMANv2 model
outperforms both the CNN-LSTM model and the AMAN model.
Related papers
- ImageInWords: Unlocking Hyper-Detailed Image Descriptions [36.373619800014275]
ImageInWords (IIW) is a human-in-the-loop framework for curating hyper-detailed image descriptions.
We show major gains compared to recent datasets in comprehensiveness, specificity, hallucinations, and more.
We also show that fine-tuning with IIW data improves these metrics by +31% against models trained with prior work, even with only 9k samples.
arXiv Detail & Related papers (2024-05-05T02:15:11Z)
- Predicting Scores of Various Aesthetic Attribute Sets by Learning from Overall Score Labels [54.63611854474985]
In this paper, we propose to replace image attribute labels with feature extractors.
We use networks from different tasks to provide attribute features to our F2S model.
Our method makes it feasible to learn meaningful attribute scores for various aesthetic attribute sets in different types of images with only overall aesthetic scores.
arXiv Detail & Related papers (2023-12-06T01:41:49Z)
- Image Aesthetics Assessment via Learnable Queries [59.313054821874864]
We propose the Image Aesthetics Assessment via Learnable Queries (IAA-LQ) approach.
It adapts learnable queries to extract aesthetic features from pre-trained image features obtained from a frozen image encoder.
Experiments on real-world data demonstrate the advantages of IAA-LQ, beating the best state-of-the-art method by 2.2% and 2.1% in terms of SRCC and PLCC, respectively.
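The query-based feature extraction described above can be sketched in miniature: a set of learnable query vectors attends over frozen image-patch features with single-head scaled dot-product attention, each query pooling the patches it finds relevant. This is a pure-Python sketch under stated assumptions, not the IAA-LQ implementation; all shapes, names, and values are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def query_attention(queries, patch_features):
    """Each learnable query attends over frozen patch features
    (single-head, scaled dot-product) and returns a pooled vector."""
    d = len(patch_features[0])
    pooled = []
    for q in queries:
        weights = softmax([dot(q, p) / math.sqrt(d) for p in patch_features])
        pooled.append([sum(w * p[i] for w, p in zip(weights, patch_features))
                       for i in range(d)])
    return pooled

# Toy example: 2 hypothetical queries over 3 frozen patch features (dim 4).
patches = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 1.0]]
queries = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0]]
features = query_attention(queries, patches)
```

In the full method, the pooled vectors would then feed a regression head that outputs the aesthetic score; only the queries and head are trained, while the image encoder stays frozen.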
arXiv Detail & Related papers (2023-09-06T09:42:16Z)
- VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments, and exploring vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
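The contrastive objective mentioned above can be illustrated with a minimal symmetric InfoNCE loss over a batch of image-comment similarity scores, where matched pairs sit on the diagonal. This pure-Python sketch assumes a CLIP-style formulation and is not VILA's actual code; the temperature value is hypothetical.

```python
import math

def info_nce(sims, temperature=0.1):
    """Symmetric contrastive loss for a batch of image-comment pairs.
    sims[i][j] is the similarity of image i and comment j; the matched
    pair for each image sits on the diagonal (j == i)."""
    n = len(sims)
    loss = 0.0
    for i in range(n):
        # Image-to-comment direction: classify the true comment among all.
        row = [s / temperature for s in sims[i]]
        loss += -math.log(math.exp(row[i]) / sum(math.exp(x) for x in row))
        # Comment-to-image direction: classify the true image among all.
        col = [sims[j][i] / temperature for j in range(n)]
        loss += -math.log(math.exp(col[i]) / sum(math.exp(x) for x in col))
    return loss / (2 * n)

# A well-aligned batch scores lower than one where pairs are indistinguishable.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]])
uniform_loss = info_nce([[1.0, 1.0], [1.0, 1.0]])
```

The generative objective would be a separate captioning loss on the decoder; the two are combined during pretraining so the model learns both alignment and generation without human aesthetic labels.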
arXiv Detail & Related papers (2023-03-24T23:57:28Z)
- Aesthetic Attribute Assessment of Images Numerically on Mixed Multi-attribute Datasets [16.120684660965978]
We construct an image attribute dataset called the aesthetic mixed dataset with attributes (AMD-A) and design external attribute features for fusion.
Our model can achieve aesthetic classification, overall scoring and attribute scoring.
Experimental results, using MindSpore, show that our proposed method can effectively improve the performance of overall aesthetic and attribute assessment.
arXiv Detail & Related papers (2022-07-05T04:42:10Z)
- Personalized Image Aesthetics Assessment with Rich Attributes [35.61053167813472]
We conduct the most comprehensive subjective study of personalized image aesthetics and introduce a new personalized image Aesthetics database with Rich Attributes (PARA).
PARA features rich annotations, including 9 image-oriented objective attributes and 4 human-oriented subjective attributes.
We also propose a conditional PIAA model by utilizing subject information as conditional prior.
arXiv Detail & Related papers (2022-03-31T02:23:46Z)
- Composition and Style Attributes Guided Image Aesthetic Assessment [66.60253358722538]
We propose a method for the automatic prediction of the aesthetics of an image.
The proposed network includes: a pre-trained network for semantic feature extraction (the Backbone) and a Multi-Layer Perceptron (MLP) network that relies on the Backbone features to predict image attributes (the AttributeNet).
Given an image, the proposed multi-network is able to predict: style and composition attributes, and aesthetic score distribution.
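The Backbone-plus-AttributeNet design above can be sketched as a shared feature vector feeding two heads: attribute logits and a softmax distribution over discrete aesthetic score bins. This is a minimal pure-Python sketch; the layer sizes, weights, and function names are hypothetical, not taken from the paper.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def linear(features, weights, biases):
    """One linear layer; a hypothetical stand-in for an MLP head."""
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, biases)]

def predict(backbone_features, attr_w, attr_b, score_w, score_b):
    """Shared backbone features feed two heads: attribute logits
    (AttributeNet-style) and a distribution over aesthetic score bins."""
    attributes = linear(backbone_features, attr_w, attr_b)
    score_dist = softmax(linear(backbone_features, score_w, score_b))
    return attributes, score_dist

# Toy example: 3-d backbone features, 2 attribute outputs, 5 score bins.
feats = [0.5, -1.0, 2.0]
attr_w = [[0.1, 0.2, 0.3], [-0.2, 0.0, 0.1]]
attr_b = [0.0, 0.1]
score_w = [[0.05 * i, 0.0, 0.1 * i] for i in range(5)]
score_b = [0.0] * 5
attrs, dist = predict(feats, attr_w, attr_b, score_w, score_b)
```

The mean of the bin distribution would give a scalar aesthetic score, while the attribute head supplies the style and composition predictions.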
arXiv Detail & Related papers (2021-11-08T17:16:38Z)
- User-Guided Personalized Image Aesthetic Assessment based on Deep Reinforcement Learning [64.07820203919283]
We propose a novel user-guided personalized image aesthetic assessment framework.
It leverages user interactions to retouch and rank images for aesthetic assessment based on deep reinforcement learning (DRL).
It generates personalized aesthetic distribution that is more in line with the aesthetic preferences of different users.
arXiv Detail & Related papers (2021-06-14T15:19:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.