Visual Motif Identification: Elaboration of a Curated Comparative Dataset and Classification Methods
- URL: http://arxiv.org/abs/2410.15866v1
- Date: Mon, 21 Oct 2024 10:50:00 GMT
- Authors: Adam Phillips, Daniel Grandes Rodriguez, Miriam Sánchez-Manzano, Alan Salvadó, Manuel Garin, Gloria Haro, Coloma Ballester
- Abstract summary: In cinema, visual motifs are recurrent iconographic compositions that carry artistic or aesthetic significance.
Our goal is to recognise and classify these motifs by proposing a new machine learning model that uses a custom dataset to that end.
We show how features extracted from a CLIP model can be leveraged by using a shallow network and an appropriate loss to classify images into 20 different motifs, with surprisingly good results.
- Abstract: In cinema, visual motifs are recurrent iconographic compositions that carry artistic or aesthetic significance. Their use throughout the history of visual arts and media is interesting to researchers and filmmakers alike. Our goal in this work is to recognise and classify these motifs by proposing a new machine learning model that uses a custom dataset to that end. We show how features extracted from a CLIP model can be leveraged by using a shallow network and an appropriate loss to classify images into 20 different motifs, with surprisingly good results: an $F_1$-score of 0.91 on our test set. We also present several ablation studies justifying the input features, architecture and hyperparameters used.
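The pipeline in the abstract can be sketched in a few lines. The following is a minimal illustrative version, not the authors' code: it assumes precomputed 512-dimensional CLIP image embeddings (simulated here with random vectors, since running a CLIP encoder is out of scope) and reads "a shallow network and an appropriate loss" as a single-hidden-layer classifier trained with softmax cross-entropy, which is one plausible instantiation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for precomputed CLIP image embeddings. In a real setup these
# would come from a CLIP image encoder; here they are random placeholders.
n_samples, feat_dim, n_motifs = 200, 512, 20
X = rng.normal(size=(n_samples, feat_dim)).astype(np.float64)
y = rng.integers(0, n_motifs, size=n_samples)

# Shallow network: one ReLU hidden layer, then a 20-way softmax output.
hidden = 64
W1 = rng.normal(scale=0.05, size=(feat_dim, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.05, size=(hidden, n_motifs))
b2 = np.zeros(n_motifs)

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)             # ReLU hidden activations
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

lr = 0.1
for _ in range(50):                              # full-batch gradient descent
    h, p = forward(X)
    # Gradient of mean cross-entropy w.r.t. the logits: softmax - one-hot.
    grad_logits = p.copy()
    grad_logits[np.arange(n_samples), y] -= 1.0
    grad_logits /= n_samples
    grad_h = grad_logits @ W2.T                  # backprop before updating W2
    grad_h[h <= 0] = 0.0                         # ReLU backward
    W2 -= lr * (h.T @ grad_logits)
    b2 -= lr * grad_logits.sum(axis=0)
    W1 -= lr * (X.T @ grad_h)
    b1 -= lr * grad_h.sum(axis=0)

_, probs = forward(X)
preds = probs.argmax(axis=1)
print(probs.shape, float((preds == y).mean()))
```

On the random placeholder features this merely memorises noise; the point is the shape of the method: frozen CLIP features in, a small trainable head and a cross-entropy-style loss on top, 20 motif classes out.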
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Measuring Style Similarity in Diffusion Models [118.22433042873136]
We present a framework for understanding and extracting style descriptors from images.
Our framework comprises a new dataset curated using the insight that style is a subjective property of an image.
We also propose a method to extract style attribute descriptors that can be used to attribute the style of a generated image to the images used in the training dataset of a text-to-image model.
arXiv Detail & Related papers (2024-04-01T17:58:30Z) - Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z) - ARTxAI: Explainable Artificial Intelligence Curates Deep Representation
Learning for Artistic Images using Fuzzy Techniques [11.286457041998569]
We show how the features obtained from different tasks in artistic image classification are suitable to solve other ones of similar nature.
We propose an explainable artificial intelligence method to map known visual traits of an image with the features used by the deep learning model.
arXiv Detail & Related papers (2023-08-29T13:15:13Z) - Predicting beauty, liking, and aesthetic quality: A comparative analysis of image databases for visual aesthetics research [0.0]
We examine how consistently the ratings can be predicted by using either (A) a set of 20 previously studied statistical image properties, or (B) the layers of a convolutional neural network developed for object recognition.
Our findings reveal substantial variation in the predictability of aesthetic ratings across the different datasets.
To our surprise, statistical image properties and the convolutional neural network predict aesthetic ratings with similar accuracy, highlighting a significant overlap in the image information captured by the two methods.
arXiv Detail & Related papers (2023-07-03T13:03:17Z) - Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
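The zero-shot assessment idea in the entry above can be illustrated with a short sketch. This is an assumption-laden toy, not the paper's implementation: it supposes unit-normalised CLIP embeddings for an image and for a pair of antonym prompts (e.g. "Good photo." / "Bad photo."), and uses random placeholder vectors in place of real encoder outputs.

```python
import numpy as np

rng = np.random.default_rng(1)

def normalize(v):
    """L2-normalise along the last axis, as CLIP does before comparison."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Placeholder embeddings; in practice CLIP's image and text encoders
# would produce these for the image and the two antonym prompts.
image_emb = normalize(rng.normal(size=512))
prompt_embs = normalize(rng.normal(size=(2, 512)))  # ["Good photo.", "Bad photo."]

# Cosine similarity (dot product of unit vectors), scaled by CLIP's usual
# logit temperature of 100, then softmax over the two prompts.
logits = 100.0 * (prompt_embs @ image_emb)
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()
quality_score = float(probs[0])  # mass assigned to the positive prompt
print(round(quality_score, 4))
```

The resulting score in [0, 1] serves as a zero-shot quality estimate: no perceptual-rating labels are used, only the contrast between the two prompts.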
arXiv Detail & Related papers (2022-07-25T17:58:16Z) - Learning an Adaptation Function to Assess Image Visual Similarities [0.0]
We focus here on the specific task of learning visual image similarities when analogy matters.
We propose to compare different supervised, semi-supervised and self-supervised networks, pre-trained on datasets of distinct scales and contents.
Our experiments on the Totally Looks Like image dataset demonstrate the benefit of our method, increasing the @1 retrieval score of the best model by 2.25x.
arXiv Detail & Related papers (2022-06-03T07:15:00Z) - Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z) - Learning Portrait Style Representations [34.59633886057044]
We study style representations learned by neural network architectures incorporating higher level characteristics.
We find variation in learned style features from incorporating triplets annotated by art historians as supervision for style similarity.
We also present the first large-scale dataset of portraits prepared for computational analysis.
arXiv Detail & Related papers (2020-12-08T01:36:45Z) - A Data Set and a Convolutional Model for Iconography Classification in Paintings [3.4138918206057265]
Iconography in art is the discipline that studies the visual content of artworks to determine their motifs and themes.
Applying Computer Vision techniques to the analysis of art images at an unprecedented scale may support iconography research and education.
arXiv Detail & Related papers (2020-10-06T12:40:46Z) - Saliency-driven Class Impressions for Feature Visualization of Deep Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
arXiv Detail & Related papers (2020-07-31T06:11:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.