Explaining Automatic Image Assessment
- URL: http://arxiv.org/abs/2502.01873v1
- Date: Mon, 03 Feb 2025 22:55:14 GMT
- Title: Explaining Automatic Image Assessment
- Authors: Max Lisaius, Scott Wehrwein,
- Abstract summary: Our proposed approach attempts to explain aesthetic assessment models through visualizing dataset trends and automatic categorization of visual aesthetic features.
By evaluating the models adapted to each specific modality using existing and novel metrics, we can capture and visualize aesthetic features and trends.
- Score: 2.8084422332394428
- License:
- Abstract: Previous work in aesthetic categorization and explainability utilizes manual labeling and classification to explain aesthetic scores. These methods require a complex labeling process and are limited in size. Our proposed approach attempts to explain aesthetic assessment models through visualizing dataset trends and automatic categorization of visual aesthetic features through training neural networks on different versions of the same dataset. By evaluating the models adapted to each specific modality using existing and novel metrics, we can capture and visualize aesthetic features and trends.
Related papers
- Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning [14.405750888492735]
Image Aesthetic Assessment (IAA) is a vital and intricate task that entails analyzing and assessing an image's aesthetic values.
Traditional methods of IAA often concentrate on a single aesthetic task and suffer from inadequate labeled datasets.
We propose a comprehensive aesthetic MLLM capable of nuanced aesthetic insight.
arXiv Detail & Related papers (2024-12-16T16:35:35Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms [91.19304518033144]
We aim to align vision models with human aesthetic standards in a retrieval system.
We propose a preference-based reinforcement learning method that fine-tunes the vision models to better align the vision models with human aesthetics.
arXiv Detail & Related papers (2024-06-13T17:59:20Z) - VILA: Learning Image Aesthetics from User Comments with Vision-Language
Pretraining [53.470662123170555]
We propose learning image aesthetics from user comments, and exploring vision-language pretraining methods to learn multimodal aesthetic representations.
Specifically, we pretrain an image-text encoder-decoder model with image-comment pairs, using contrastive and generative objectives to learn rich and generic aesthetic semantics without human labels.
Our results show that our pretrained aesthetic vision-language model outperforms prior works on image aesthetic captioning over the AVA-Captions dataset.
arXiv Detail & Related papers (2023-03-24T23:57:28Z) - Exploring CNN-based models for image's aesthetic score prediction with
using ensemble [3.8073142980733]
We proposed a framework of constructing two types of the automatic image aesthetics assessment models with different CNN architectures.
The attention regions of the models to the images are extracted to analyze the consistency with the subjects in the images.
It is found that the AS classification models trained on XiheAA dataset seem to learn the latent photography principles, although it can't be said that they learn the aesthetic sense.
arXiv Detail & Related papers (2022-10-11T03:23:07Z) - Enhancing efficiency of object recognition in different categorization
levels by reinforcement learning in modular spiking neural networks [1.392250707100996]
We propose a computational model for object recognition in different categorization levels.
A spiking neural network equipped with the reinforcement learning rule is used as a module at each categorization level.
According to the required information at each categorization level, the relevant band-pass filtered images are utilized.
arXiv Detail & Related papers (2021-02-10T12:33:20Z) - A Diagnostic Study of Explainability Techniques for Text Classification [52.879658637466605]
We develop a list of diagnostic properties for evaluating existing explainability techniques.
We compare the saliency scores assigned by the explainability techniques with human annotations of salient input regions to find relations between a model's performance and the agreement of its rationales with human ones.
arXiv Detail & Related papers (2020-09-25T12:01:53Z) - Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method via optimizing and fast adapting the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z) - Hierarchical Image Classification using Entailment Cone Embeddings [68.82490011036263]
We first inject label-hierarchy knowledge into an arbitrary CNN-based classifier.
We empirically show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
arXiv Detail & Related papers (2020-04-02T10:22:02Z) - Learning Representations For Images With Hierarchical Labels [1.3579420996461438]
We present a set of methods to leverage information about the semantic hierarchy induced by class labels.
We show that availability of such external semantic information in conjunction with the visual semantics from images boosts overall performance.
Although, both the CNN-classifiers injected with hierarchical information, and the embedding-based models outperform a hierarchy-agnostic model on the newly presented, real-world ETH Entomological Collection image dataset.
arXiv Detail & Related papers (2020-04-02T09:56:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.