UMAAF: Unveiling Aesthetics via Multifarious Attributes of Images
- URL: http://arxiv.org/abs/2311.11306v2
- Date: Tue, 21 Nov 2023 13:59:31 GMT
- Title: UMAAF: Unveiling Aesthetics via Multifarious Attributes of Images
- Authors: Weijie Li, Yitian Wan, Xingjiao Wu, Junjie Xu, Cheng Jin, Liang He
- Abstract summary: We propose the Unified Multi-attribute Aesthetic Assessment Framework (UMAAF) to model both absolute and relative attributes of images.
UMAAF achieves state-of-the-art performance on TAD66K and AVA datasets.
- Score: 16.647573404422175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growing prevalence of smartphones and websites, Image Aesthetic
Assessment (IAA) has become increasingly important. Although the significance of
attributes in IAA is widely recognized, many attribute-based methods give little
consideration to how aesthetic attributes are selected and utilized. As a first step,
we acquire aesthetic attributes from both intra- and inter-image perspectives. From
the intra-image perspective, we extract the direct visual attributes of an image,
constituting the absolute attribute. From the inter-image perspective, we model the
relative score relationships between images within the same sequence, forming the
relative attribute. Then,
to better utilize image attributes in aesthetic assessment, we propose the
Unified Multi-attribute Aesthetic Assessment Framework (UMAAF) to model both
absolute and relative attributes of images. For absolute attributes, we
leverage multiple absolute-attribute perception modules and an
absolute-attribute interacting network. The absolute-attribute perception
modules are first pre-trained on several absolute-attribute learning tasks and
then used to extract corresponding absolute attribute features. The
absolute-attribute interacting network adaptively learns the weights of diverse
absolute-attribute features, effectively integrating them with generic
aesthetic features from various absolute-attribute perspectives and generating
the aesthetic prediction. To model the relative attribute of images, we consider the
relative ranking and relative distance relationships between images in a
Relative-Relation Loss function, which improves the robustness of UMAAF. Furthermore,
UMAAF achieves state-of-the-art performance on the TAD66K and AVA datasets, and
extensive experiments demonstrate the effectiveness of each module and the model's
alignment with human preference.
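The abstract describes the absolute-attribute interacting network and the Relative-Relation Loss only at a high level. Below is a minimal PyTorch sketch of how these two pieces might be realized. All names (AbsoluteAttributeFusion, RelativeRelationLoss), the margin value, and the loss weighting are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn


class AbsoluteAttributeFusion(nn.Module):
    """Sketch of an absolute-attribute interacting network: learn a scalar
    weight per attribute feature, fuse the weighted attribute features with
    the generic aesthetic feature, and regress an aesthetic score."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.gate = nn.Linear(feat_dim, 1)  # scores each attribute feature
        self.head = nn.Linear(feat_dim, 1)  # final aesthetic prediction

    def forward(self, generic_feat: torch.Tensor, attr_feats: torch.Tensor):
        # generic_feat: (B, D); attr_feats: (B, A, D) from the pre-trained
        # absolute-attribute perception modules.
        weights = torch.softmax(self.gate(attr_feats).squeeze(-1), dim=-1)  # (B, A)
        fused = generic_feat + torch.einsum("ba,bad->bd", weights, attr_feats)
        return self.head(fused)  # (B, 1) predicted aesthetic score


class RelativeRelationLoss(nn.Module):
    """Sketch of a relative-relation loss over image pairs from the same
    sequence: a margin ranking term keeps the predicted order correct, and
    a distance term keeps the predicted score gap close to the true gap."""

    def __init__(self, margin: float = 0.1, distance_weight: float = 1.0):
        super().__init__()
        self.ranking = nn.MarginRankingLoss(margin=margin)
        self.distance_weight = distance_weight

    def forward(self, pred_a, pred_b, gt_a, gt_b):
        # All inputs are shape (B,). target is +1 where image a should
        # outrank image b, else -1.
        target = torch.where(gt_a >= gt_b,
                             torch.ones_like(gt_a), -torch.ones_like(gt_a))
        rank_term = self.ranking(pred_a, pred_b, target)
        dist_term = torch.mean(torch.abs((pred_a - pred_b) - (gt_a - gt_b)))
        return rank_term + self.distance_weight * dist_term
```

In training, a loss like this would presumably be added to the usual regression objective on absolute scores; the relative terms only constrain pairs sampled from the same sequence.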
Related papers
- ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling [32.55352435358949]
We propose a sentence generation-based retrieval formulation for attribute recognition.
For each attribute to be recognized on an image, we measure the visual-conditioned probability of generating a short sentence.
We demonstrate through experiments that generative retrieval consistently outperforms contrastive retrieval on two visual reasoning datasets.
arXiv Detail & Related papers (2024-08-07T21:44:29Z) - Predicting Scores of Various Aesthetic Attribute Sets by Learning from Overall Score Labels [54.63611854474985]
In this paper, we propose to replace image attribute labels with feature extractors.
We use networks from different tasks to provide attribute features to our F2S model.
Our method makes it feasible to learn meaningful attribute scores for various aesthetic attribute sets across different types of images using only overall aesthetic scores.
arXiv Detail & Related papers (2023-12-06T01:41:49Z) - Learning Conditional Attributes for Compositional Zero-Shot Learning [78.24309446833398]
Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel compositional concepts.
One of the challenges is to model attributes as they interact with different objects, e.g., the attribute "wet" in "wet apple" and "wet cat" is different.
We argue that attributes are conditioned on the recognized object and input image and explore learning conditional attribute embeddings.
arXiv Detail & Related papers (2023-05-29T08:04:05Z) - Aesthetic Attribute Assessment of Images Numerically on Mixed Multi-attribute Datasets [16.120684660965978]
We construct an image attribute dataset called the Aesthetic Mixed Dataset with Attributes (AMD-A) and design external attribute features for fusion.
Our model can achieve aesthetic classification, overall scoring and attribute scoring.
Experimental results obtained using MindSpore show that our proposed method effectively improves the performance of both overall aesthetic and attribute assessment.
arXiv Detail & Related papers (2022-07-05T04:42:10Z) - Attribute Prototype Network for Any-Shot Learning [113.50220968583353]
We argue that an image representation with integrated attribute localization ability would be beneficial for any-shot (i.e., zero-shot and few-shot) image classification tasks.
We propose a novel representation learning framework that jointly learns global and local features using only class-level attributes.
arXiv Detail & Related papers (2022-04-04T02:25:40Z) - Composition and Style Attributes Guided Image Aesthetic Assessment [66.60253358722538]
We propose a method for the automatic prediction of the aesthetics of an image.
The proposed network includes a pre-trained network for semantic feature extraction (the Backbone) and a Multi-Layer Perceptron (MLP) that relies on the Backbone features to predict image attributes (the AttributeNet).
Given an image, the proposed multi-network predicts style and composition attributes as well as an aesthetic score distribution (a hedged sketch of this backbone-plus-heads pattern appears after this list).
arXiv Detail & Related papers (2021-11-08T17:16:38Z) - Learning to Infer Unseen Attribute-Object Compositions [55.58107964602103]
A graph-based model is proposed that can flexibly recognize both single- and multi-attribute-object compositions.
We build a large-scale Multi-Attribute dataset with 116,099 images and 8,030 composition categories.
arXiv Detail & Related papers (2020-10-27T14:57:35Z) - Attribute Prototype Network for Zero-Shot Learning [113.50220968583353]
We propose a novel zero-shot representation learning framework that jointly learns discriminative global and local features.
Our model points to the visual evidence of the attributes in an image, confirming the improved attribute localization ability of our image representation.
arXiv Detail & Related papers (2020-08-19T06:46:35Z) - MulGAN: Facial Attribute Editing by Exemplar [2.272764591035106]
Existing methods encode attribute-related information from images into a predefined region of the latent feature space by employing pairs of images with opposite attributes as training input.
They suffer from three limitations: (1) the model must be trained on pairs of images with opposite attributes; (2) limited capability to edit multiple attributes by exemplar; and (3) poor quality of the generated images.
arXiv Detail & Related papers (2019-12-28T04:02:15Z)
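The backbone-plus-two-heads design summarized above for Composition and Style Attributes Guided Image Aesthetic Assessment (a semantic Backbone feeding an AttributeNet and a score-distribution head) might be wired up as follows in PyTorch. The ResNet-50 backbone, head widths, and the attribute count of 20 are placeholder assumptions for illustration, not the paper's actual configuration; only the overall wiring follows the summary.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class AttributeGuidedIAA(nn.Module):
    """Hypothetical sketch: a pre-trained backbone supplies semantic
    features, an MLP head (the AttributeNet) predicts style/composition
    attributes, and a second head predicts an aesthetic score distribution."""

    def __init__(self, num_attributes: int = 20, num_score_bins: int = 10):
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Drop the final FC layer; keep conv stages + global average pooling.
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])
        feat_dim = 2048
        self.attribute_net = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, num_attributes), nn.Sigmoid(),
        )
        self.score_head = nn.Sequential(
            nn.Linear(feat_dim + num_attributes, 256), nn.ReLU(),
            nn.Linear(256, num_score_bins), nn.Softmax(dim=-1),
        )

    def forward(self, images: torch.Tensor):
        feats = self.backbone(images).flatten(1)           # (B, 2048)
        attrs = self.attribute_net(feats)                  # (B, num_attributes)
        dist = self.score_head(torch.cat([feats, attrs], dim=-1))
        return attrs, dist  # attribute predictions + score distribution
```

The 10-bin score output mirrors the 1-10 rating histogram commonly used on AVA; training would typically supervise the attribute head with attribute labels and the score head with a distribution loss such as EMD.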
This list is automatically generated from the titles and abstracts of the papers on this site.