Multi-task convolutional neural network for image aesthetic assessment
- URL: http://arxiv.org/abs/2305.09373v2
- Date: Mon, 15 Jan 2024 14:52:42 GMT
- Title: Multi-task convolutional neural network for image aesthetic assessment
- Authors: Derya Soydaner, Johan Wagemans
- Abstract summary: We present a multi-task convolutional neural network that takes into account aesthetic attributes.
The proposed neural network jointly learns the attributes along with the overall aesthetic scores of images.
We achieve near-human performance in terms of overall aesthetic scores when considering the Spearman's rank correlations.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: As people's aesthetic preferences for images are far from understood, image
aesthetic assessment is a challenging artificial intelligence task. The range
of factors underlying this task is almost unlimited, but we know that some
aesthetic attributes affect those preferences. In this study, we present a
multi-task convolutional neural network that takes into account these
attributes. The proposed neural network jointly learns the attributes along
with the overall aesthetic scores of images. This multi-task learning framework
allows for effective generalization through the utilization of shared
representations. Our experiments demonstrate that the proposed method
outperforms the state-of-the-art approaches in predicting overall aesthetic
scores for images in one benchmark of image aesthetics. We achieve near-human
performance in terms of overall aesthetic scores when considering the
Spearman's rank correlations. Moreover, our model pioneers the application of
multi-tasking in another benchmark, serving as a new baseline for future
research. Notably, our approach achieves this performance while using fewer
parameters than existing multi-task neural networks in the literature, which
makes our method more efficient in terms of computational complexity.
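The shared-representation idea behind this multi-task framework can be sketched as follows. This is a rough illustration only, not the authors' architecture: the layer sizes, attribute count, and random weights are all hypothetical, and a real model would use a convolutional backbone and learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical dimensions: a shared backbone maps each image feature
# vector into one common representation that feeds two task heads.
D_IN, D_SHARED, N_ATTRS = 128, 64, 8

W_backbone = rng.normal(scale=0.1, size=(D_IN, D_SHARED))
W_attrs = rng.normal(scale=0.1, size=(D_SHARED, N_ATTRS))  # aesthetic attribute head
W_score = rng.normal(scale=0.1, size=(D_SHARED, 1))        # overall-score head

def forward(x):
    """Jointly predict aesthetic attributes and an overall aesthetic
    score from a single shared representation (multi-task learning)."""
    h = relu(x @ W_backbone)   # shared representation, used by both heads
    attrs = h @ W_attrs        # one score per aesthetic attribute
    overall = h @ W_score      # single overall aesthetic score
    return attrs, overall

x = rng.normal(size=(4, D_IN))      # a batch of 4 image feature vectors
attrs, overall = forward(x)
print(attrs.shape, overall.shape)   # (4, 8) (4, 1)
```

Because both heads read the same representation, a training loss that sums the attribute and overall-score errors updates the shared weights from both tasks, which is the source of the generalization benefit described above.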
Related papers
- Improving Human-Object Interaction Detection via Virtual Image Learning [68.56682347374422]
Human-Object Interaction (HOI) detection aims to understand the interactions between humans and objects.
In this paper, we propose to alleviate the impact of such an unbalanced distribution via Virtual Image Learning (VIL).
A novel label-to-image approach, Multiple Steps Image Creation (MUSIC), is proposed to create a high-quality dataset that has a consistent distribution with real images.
arXiv Detail & Related papers (2023-08-04T10:28:48Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which impacts several human cognitive functions.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- Modeling, Quantifying, and Predicting Subjectivity of Image Aesthetics [21.46956783120668]
We propose a novel unified probabilistic framework that can model and quantify subjective aesthetic preference based on the subjective logic.
In this framework, the rating distribution is modeled as a beta distribution, from which the probabilities of being definitely pleasing, being definitely unpleasing, and being uncertain can be obtained.
We present a method to learn deep neural networks for prediction of image aesthetics, which is shown to be effective in improving the performance of subjectivity prediction via experiments.
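The beta-distribution idea above can be illustrated numerically. In this sketch, the thresholds (0.4 and 0.6) and the Beta parameters are illustrative assumptions, not values from the paper: the normalized rating scale is split into "definitely unpleasing", "uncertain", and "definitely pleasing" regions, and the beta density is integrated over each.

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x in (0, 1)."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def region_probs(a, b, low=0.4, high=0.6, n=10_000):
    """Probability mass of the rating distribution falling in the
    'definitely unpleasing' (< low), 'uncertain' (low..high), and
    'definitely pleasing' (> high) regions, via trapezoid integration.
    The thresholds low/high are illustrative choices."""
    def integrate(lo, hi):
        xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
        ys = [beta_pdf(x, a, b) for x in xs]
        return (hi - lo) / n * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    eps = 1e-9  # avoid endpoint singularities when a < 1 or b < 1
    p_unpleasing = integrate(eps, low)
    p_uncertain = integrate(low, high)
    p_pleasing = integrate(high, 1 - eps)
    return p_unpleasing, p_uncertain, p_pleasing

# A rating distribution skewed toward high ratings.
p_un, p_unc, p_pl = region_probs(a=6.0, b=2.0)
print(round(p_un, 3), round(p_unc, 3), round(p_pl, 3))
```

The three probabilities sum to (approximately) one, and their relative sizes quantify how decisively pleasing or unpleasing an image is, versus how subjective (uncertain) its ratings are.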
arXiv Detail & Related papers (2022-08-20T12:16:45Z)
- Aesthetic Attribute Assessment of Images Numerically on Mixed Multi-attribute Datasets [16.120684660965978]
We construct an image attribute dataset called the aesthetic mixed dataset with attributes (AMD-A) and design external attribute features for fusion.
Our model can achieve aesthetic classification, overall scoring and attribute scoring.
Experimental results, obtained using MindSpore, show that our proposed method can effectively improve the performance of overall aesthetic and attribute assessment.
arXiv Detail & Related papers (2022-07-05T04:42:10Z)
- Learning an Adaptation Function to Assess Image Visual Similarities [0.0]
We focus here on the specific task of learning visual image similarities when analogy matters.
We propose to compare different supervised, semi-supervised and self-supervised networks, pre-trained on distinct scales and contents datasets.
Our experiments conducted on the Totally Looks Like image dataset highlight the interest of our method, increasing the best model's @1 retrieval score by a factor of 2.25.
arXiv Detail & Related papers (2022-06-03T07:15:00Z)
- Composition and Style Attributes Guided Image Aesthetic Assessment [66.60253358722538]
We propose a method for the automatic prediction of the aesthetics of an image.
The proposed network includes: a pre-trained network for semantic feature extraction (the Backbone), and a Multi-Layer Perceptron (MLP) network that relies on the Backbone features for the prediction of image attributes (the AttributeNet).
Given an image, the proposed multi-network is able to predict: style and composition attributes, and aesthetic score distribution.
arXiv Detail & Related papers (2021-11-08T17:16:38Z)
- Neural Knitworks: Patched Neural Implicit Representation Networks [1.0470286407954037]
We propose Knitwork, an architecture for neural implicit representation learning of natural images that achieves image synthesis.
To the best of our knowledge, this is the first implementation of a coordinate-based patch tailored for synthesis tasks such as image inpainting, super-resolution, and denoising.
The results show that modeling natural images using patches, rather than pixels, produces results of higher fidelity.
arXiv Detail & Related papers (2021-09-29T13:10:46Z)
- Joint Learning of Neural Transfer and Architecture Adaptation for Image Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we show that dynamically adapting network architectures tailored to each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z)
- Learning from Few Samples: A Survey [1.4146420810689422]
We study the existing few-shot meta-learning techniques in the computer vision domain based on their method and evaluation metrics.
We provide a taxonomy for the techniques and categorize them as data-augmentation, embedding, optimization and semantics based learning for few-shot, one-shot and zero-shot settings.
arXiv Detail & Related papers (2020-07-30T14:28:57Z)
- Learning to Compose Hypercolumns for Visual Correspondence [57.93635236871264]
We introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match.
The proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network.
arXiv Detail & Related papers (2020-07-21T04:03:22Z)
- SideInfNet: A Deep Neural Network for Semi-Automatic Semantic Segmentation with Side Information [83.03179580646324]
This paper proposes a novel deep neural network architecture, namely SideInfNet.
It integrates features learnt from images with side information extracted from user annotations.
To evaluate our method, we applied the proposed network to three semantic segmentation tasks and conducted extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-02-07T06:10:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.