Related papers: Distilling Knowledge from Object Classification to Aesthetics Assessment

URL: http://arxiv.org/abs/2206.00809v1
Date: Thu, 2 Jun 2022 00:39:01 GMT
Title: Distilling Knowledge from Object Classification to Aesthetics Assessment
Authors: Jingwen Hou, Henghui Ding, Weisi Lin, Weide Liu, Yuming Fang
Abstract summary: The major dilemma of image aesthetics assessment (IAA) comes from the abstract nature of aesthetic labels. We propose to distill knowledge on semantic patterns for a vast variety of image contents to an IAA model. By supervising an end-to-end single-backbone IAA model with the distilled knowledge, the performance of the IAA model is significantly improved.
Score: 68.317720070755
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this work, we point out that the major dilemma of image aesthetics assessment (IAA) comes from the abstract nature of aesthetic labels. That is, a vast variety of distinct contents can correspond to the same aesthetic label. On the one hand, during inference, the IAA model is required to relate various distinct contents to the same aesthetic label. On the other hand, during training, it is hard for the IAA model to learn to distinguish different contents merely with the supervision from aesthetic labels, since aesthetic labels are not directly related to any specific content. To deal with this dilemma, we propose to distill knowledge on semantic patterns for a vast variety of image contents from multiple pre-trained object classification (POC) models to an IAA model. We expect the combination of multiple POC models to provide sufficient knowledge on various image contents, so that the IAA model can more easily learn to relate various distinct contents to a limited number of aesthetic labels. By supervising an end-to-end single-backbone IAA model with the distilled knowledge, the performance of the IAA model is significantly improved by 4.8% in SRCC compared to the version trained only with ground-truth aesthetic labels. On specific categories of images, the SRCC improvement brought by the proposed method reaches up to 7.2%. Peer comparison also shows that our method outperforms 10 previous IAA methods.
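
Note on the metric: the gains in the abstract are reported in SRCC, which is Spearman's rank correlation coefficient between predicted and ground-truth aesthetic scores. The following is a minimal sketch of that metric in plain Python, not the authors' evaluation code. The function names `average_ranks` and `srcc` and the sample scores are illustrative assumptions, and ties are handled by average ranking, which is a common convention the abstract does not specify.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the group while the next sorted value equals the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srcc(predicted, ground_truth):
    """Spearman's rank correlation: Pearson correlation of the two rank lists."""
    if len(predicted) != len(ground_truth) or len(predicted) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rp = average_ranks(predicted)
    rg = average_ranks(ground_truth)
    n = len(rp)
    mean_p = sum(rp) / n
    mean_g = sum(rg) / n
    cov = sum((a - mean_p) * (b - mean_g) for a, b in zip(rp, rg))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_g = sum((b - mean_g) ** 2 for b in rg)
    if var_p == 0 or var_g == 0:
        raise ValueError("SRCC is undefined when one list is constant")
    return cov / sqrt(var_p * var_g)


# Hypothetical scores for four images (illustrative values only).
model_scores = [5.1, 6.3, 4.2, 7.0]
human_scores = [5.0, 6.0, 4.5, 5.5]
print(srcc(model_scores, human_scores))
```

Because SRCC depends only on rank order, a model scores well when it orders images the way human raters do, even if its absolute scores are offset or rescaled.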

Related papers

Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning [14.405750888492735]
Image Aesthetic Assessment (IAA) is a vital and intricate task that entails analyzing and assessing an image's aesthetic values. Traditional methods of IAA often concentrate on a single aesthetic task and suffer from inadequate labeled datasets. We propose a comprehensive aesthetic MLLM capable of nuanced aesthetic insight.
arXiv Detail & Related papers (2024-12-16T16:35:35Z)
AID-AppEAL: Automatic Image Dataset and Algorithm for Content Appeal Enhancement and Assessment Labeling [11.996211235559866]
Image Content Appeal Assessment (ICAA) is a novel metric that quantifies the level of positive interest an image's content generates for viewers. ICAA is different from traditional Image-Aesthetics Assessment (IAA), which judges an image's artistic quality.
arXiv Detail & Related papers (2024-07-08T01:40:32Z)
Multi-modal Learnable Queries for Image Aesthetics Assessment [55.28571422062623]
We propose MMLQ, which utilizes multi-modal learnable queries to extract aesthetics-related features from multi-modal pre-trained features. MMLQ achieves new state-of-the-art performance on multi-modal IAA, beating previous methods by 7.7% and 8.3% in terms of SRCC and PLCC, respectively.
arXiv Detail & Related papers (2024-05-02T14:31:47Z)
Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly. Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness. Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings. This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
Image Aesthetics Assessment via Learnable Queries [59.313054821874864]
We propose the Image Aesthetics Assessment via Learnable Queries (IAA-LQ) approach. It adapts learnable queries to extract aesthetic features from pre-trained image features obtained from a frozen image encoder. Experiments on real-world data demonstrate the advantages of IAA-LQ, beating the best state-of-the-art method by 2.2% and 2.1% in terms of SRCC and PLCC, respectively.
arXiv Detail & Related papers (2023-09-06T09:42:16Z)
CLIP Brings Better Features to Visual Aesthetics Learners [14.351572852317558]
Image Aesthetics Assessment (IAA) is a challenging task due to its subjective nature and expensive manual annotations. Recent large-scale vision-language models, such as Contrastive Language-Image Pre-training (CLIP), have shown their promising representation capability for various downstream tasks. We propose a two-phase CLIP-based Semi-supervised Knowledge Distillation paradigm, aiming to learn a lightweight IAA model while leveraging CLIP's strong generalization capability.
arXiv Detail & Related papers (2023-07-28T16:00:21Z)
Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method [64.40494830113286]
We first introduce a large-scale AIAA dataset: Boldbrush Artistic Image dataset (BAID), which consists of 60,337 artistic images covering various art forms. We then propose a new method, SAAN, which can effectively extract and utilize style-specific and generic aesthetic information to evaluate artistic images. Experiments demonstrate that our proposed approach outperforms existing IAA methods on the proposed BAID dataset.
arXiv Detail & Related papers (2023-03-27T12:59:15Z)
VILA: Learning Image Aesthetics from User Comments with Vision-Language Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments, and exploring vision-language pretraining methods to learn multimodal aesthetic representations. Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels. Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
arXiv Detail & Related papers (2023-03-24T23:57:28Z)
Aesthetically Relevant Image Captioning [17.081262827258943]
We study image AQA and IAC together and present a new IAC method termed Aesthetically Relevant Image Captioning (ARIC). ARIC includes an ARS weighted IAC loss function and an ARS based diverse aesthetic caption selector (DACS). We show that texts with higher ARS values can predict the aesthetic ratings more accurately and that the new ARIC model can generate more accurate, aesthetically more relevant and more diverse image captions.
arXiv Detail & Related papers (2022-11-25T14:28:10Z)
