Graph Neural Networks for Knowledge Enhanced Visual Representation of
Paintings
- URL: http://arxiv.org/abs/2105.08190v1
- Date: Mon, 17 May 2021 23:05:36 GMT
- Title: Graph Neural Networks for Knowledge Enhanced Visual Representation of
Paintings
- Authors: Athanasios Efthymiou, Stevan Rudinac, Monika Kackovic, Marcel Worring,
Nachoem Wijnberg
- Abstract summary: ArtSAGENet is a novel architecture that integrates Graph Neural Networks (GNNs) and Convolutional Neural Networks (CNNs)
We show that our proposed ArtSAGENet captures and encodes valuable dependencies between the artists and the artworks.
Our findings underline the great potential of integrating visual content and semantics for fine art analysis and curation.
- Score: 14.89186519385364
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose ArtSAGENet, a novel multimodal architecture that integrates Graph
Neural Networks (GNNs) and Convolutional Neural Networks (CNNs), to jointly
learn visual and semantic-based artistic representations. First, we illustrate
the significant advantages of multi-task learning for fine art analysis and
argue that it is conceptually a much more appropriate setting in the fine art
domain than the single-task alternatives. We further demonstrate that several
GNN architectures can outperform strong CNN baselines in a range of fine art
analysis tasks, such as style classification, artist attribution, creation
period estimation, and tag prediction, while training them requires an order of
magnitude less computational time and only a small amount of labeled data.
Finally, through extensive experimentation we show that our proposed ArtSAGENet
captures and encodes valuable relational dependencies between the artists and
the artworks, surpassing the performance of traditional methods that rely
solely on the analysis of visual content. Our findings underline the great
potential of integrating visual content and semantics for fine art analysis and
curation.
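The abstract describes the architecture only at a high level. As a rough illustration of how a CNN image encoder can be fused with a GraphSAGE-style GNN over an artwork graph and shared multi-task heads, a minimal sketch follows; the ResNet-50 backbone, embedding sizes, edge construction, and the three task heads are illustrative assumptions rather than the authors' actual configuration.
```python
# Hedged sketch: a CNN encoder fused with a GraphSAGE-style GNN over an
# artist-artwork graph, with shared multi-task heads. Backbone choice,
# dimensions, and head definitions are illustrative assumptions, not the
# paper's actual configuration.
import torch
import torch.nn as nn
import torchvision.models as tv_models
from torch_geometric.nn import SAGEConv


class ArtGraphSketch(nn.Module):
    def __init__(self, gnn_dim=256, n_styles=27, n_artists=23, n_periods=10):
        super().__init__()
        # Visual branch: ImageNet-pretrained ResNet-50 with the classifier removed.
        backbone = tv_models.resnet50(weights=tv_models.ResNet50_Weights.DEFAULT)
        backbone.fc = nn.Identity()
        self.cnn = backbone                       # outputs 2048-d image embeddings
        # Semantic branch: two GraphSAGE layers over the artwork/artist graph.
        self.gnn1 = SAGEConv(2048, gnn_dim)
        self.gnn2 = SAGEConv(gnn_dim, gnn_dim)
        # Multi-task heads over the fused (visual + graph) representation.
        fused_dim = 2048 + gnn_dim
        self.style_head = nn.Linear(fused_dim, n_styles)
        self.artist_head = nn.Linear(fused_dim, n_artists)
        self.period_head = nn.Linear(fused_dim, n_periods)

    def forward(self, images, edge_index):
        # images: (N, 3, 224, 224), one node per artwork.
        # edge_index: (2, E) edges connecting artworks (e.g. via a shared artist).
        visual = self.cnn(images)                           # (N, 2048)
        h = torch.relu(self.gnn1(visual, edge_index))
        h = torch.relu(self.gnn2(h, edge_index))            # (N, gnn_dim)
        fused = torch.cat([visual, h], dim=-1)
        return self.style_head(fused), self.artist_head(fused), self.period_head(fused)
```
In the multi-task setting the abstract advocates, training would typically sum the cross-entropy losses of the three heads over the same fused representation.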
Related papers
- KALE: An Artwork Image Captioning System Augmented with Heterogeneous Graph [24.586916324061168]
We present KALE, a Knowledge-Augmented vision-Language model for artwork Elaborations.
KALE incorporates the metadata in two ways: firstly as direct textual input, and secondly through a multimodal heterogeneous knowledge graph.
Experimental results demonstrate that KALE achieves strong performance over existing state-of-the-art work across several artwork datasets.
arXiv Detail & Related papers (2024-09-17T06:39:18Z)
- GalleryGPT: Analyzing Paintings with Large Multimodal Models [64.98398357569765]
Artwork analysis is an important and fundamental skill for art appreciation, one that can enrich personal aesthetic sensibility and foster critical thinking.
Previous work on automatically analyzing artworks has mainly focused on classification, retrieval, and other comparatively simple tasks, which falls far short of genuine artwork understanding.
We introduce GalleryGPT, a large multimodal model for composing painting analyses, obtained by slightly modifying and fine-tuning the LLaVA architecture.
arXiv Detail & Related papers (2024-08-01T11:52:56Z)
- Deep Ensemble Art Style Recognition [2.3369294168789203]
The large-scale digitization of artworks over the last decades has created a need for the categorization, analysis, and management of vast amounts of data related to abstract concepts.
Recognition of various art features in artworks has gained attention in the deep learning community.
In this paper, we are concerned with the problem of art style recognition using deep networks.
arXiv Detail & Related papers (2024-05-19T21:26:11Z)
- Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality [50.48859793121308]
Contrastively trained vision-language models have achieved remarkable progress in vision and language representation learning.
Recent research has highlighted severe limitations in their ability to perform compositional reasoning over objects, attributes, and relations.
arXiv Detail & Related papers (2023-05-23T08:28:38Z)
- Synergy of Machine and Deep Learning Models for Multi-Painter Recognition [0.0]
We introduce a new large dataset for the painter recognition task, covering 62 artists, and achieve good results.
RegNet performs best at extracting features, while an SVM classifier best attributes images to painters, reaching an accuracy of up to 85%.
arXiv Detail & Related papers (2023-04-28T11:34:53Z)
- ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer [60.6863849241972]
We learn a representation of visual artistic style more strongly disentangled from the semantic content depicted in an image.
We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics.
arXiv Detail & Related papers (2023-04-12T10:33:18Z)
- Inching Towards Automated Understanding of the Meaning of Art: An Application to Computational Analysis of Mondrian's Artwork [0.0]
This paper attempts to identify capabilities that are related to semantic processing.
The proposed methodology identifies the missing capabilities by comparing the process of understanding Mondrian's paintings with the process of understanding electronic circuit designs.
To explain the usefulness of the methodology, the paper discusses a new, three-step computational method to distinguish Mondrian's paintings from other artwork.
arXiv Detail & Related papers (2022-12-29T23:34:19Z)
- A domain adaptive deep learning solution for scanpath prediction of paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z)
- Tensor Composition Net for Visual Relationship Prediction [115.14829858763399]
We present a novel Tensor Composition Network (TCN) to predict visual relationships in images.
The key idea of our TCN is to exploit the low rank property of the visual relationship tensor.
We show our TCN's image-level visual relationship prediction provides a simple and efficient mechanism for relation-based image retrieval.
arXiv Detail & Related papers (2020-12-10T06:27:20Z)
- Node Masking: Making Graph Neural Networks Generalize and Scale Better [71.51292866945471]
Graph Neural Networks (GNNs) have received a lot of interest in recent times.
In this paper, we utilize some theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs.
We introduce a simple concept, Node Masking, that allows them to generalize and scale better.
arXiv Detail & Related papers (2020-01-17T06:26:40Z)
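As a brief, hedged illustration of the node-masking idea from the last entry: during training, edges touching a randomly chosen subset of nodes can be dropped from message passing, so the GNN cannot rely on every neighbor always being present. The mask rate, the SAGEConv layers, and the feature sizes below are assumptions made for the sketch, not the paper's exact formulation.
```python
# Hedged sketch of node masking during GNN training: edges incident to a
# randomly chosen subset of nodes are dropped for that forward pass, so the
# model learns to predict without relying on every neighbor being present.
# Mask rate, layer sizes, and the SAGEConv choice are illustrative assumptions.
import torch
from torch_geometric.nn import SAGEConv


def mask_nodes(edge_index, num_nodes, mask_rate=0.2):
    """Remove all edges touching a random fraction of nodes."""
    masked = torch.rand(num_nodes) < mask_rate                # True = node is masked
    keep = ~(masked[edge_index[0]] | masked[edge_index[1]])   # keep edges with no masked endpoint
    return edge_index[:, keep]


class MaskedGNN(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, out_dim)

    def forward(self, x, edge_index, mask_rate=0.2):
        if self.training:                                      # mask only during training
            edge_index = mask_nodes(edge_index, x.size(0), mask_rate)
        h = torch.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)
```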
This list is automatically generated from the titles and abstracts of the papers in this site.