Semi-supervised Human Pose Estimation in Art-historical Images
- URL: http://arxiv.org/abs/2207.02976v1
- Date: Wed, 6 Jul 2022 21:20:58 GMT
- Title: Semi-supervised Human Pose Estimation in Art-historical Images
- Authors: Matthias Springstein, Stefanie Schneider, Christian Althaus, Ralph Ewerth
- Abstract summary: We propose a novel approach to estimate human poses in art-historical images.
Our approach achieves significantly better results than methods that use pre-trained models or style transfer.
- Score: 9.633949256082763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gesture as "language" of non-verbal communication has been
theoretically established since the 17th century. However, its relevance for
the visual arts has been expressed only sporadically. This may be primarily due
to the sheer overwhelming amount of data that traditionally had to be processed
by hand. With the steady progress of digitization, though, a growing number of
historical artifacts have been indexed and made available to the public,
creating a need for automatic retrieval of art-historical motifs with similar
body constellations or poses. Since the domain of art differs significantly
from existing real-world data sets for human pose estimation due to its style
variance, this presents new challenges. In this paper, we propose a novel
approach to estimate human poses in art-historical images. In contrast to
previous work that attempts to bridge the domain gap with pre-trained models or
through style transfer, we suggest semi-supervised learning for both object and
keypoint detection. Furthermore, we introduce a novel domain-specific art data
set that includes both bounding box and keypoint annotations of human figures.
Our approach achieves significantly better results than methods that use
pre-trained models or style transfer.
Related papers
- GRPose: Learning Graph Relations for Human Image Generation with Pose Priors [21.971188335727074]
We propose a framework delving into the graph relations of pose priors to provide control information for human image generation.
Our model achieves superior performance, with a 9.98% increase in pose average precision compared to the latest benchmark model.
arXiv Detail & Related papers (2024-08-29T13:58:34Z)
- Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation.
Our method is on par with or superior to the state of the art, yet requires nothing but a pre-trained denoising network to estimate MI.
arXiv Detail & Related papers (2024-05-31T12:20:02Z)
- UniHuman: A Unified Model for Editing Human Images in the Wild [49.896715833075106]
We propose UniHuman, a unified model that addresses multiple facets of human image editing in real-world settings.
To enhance the model's generation quality and generalization capacity, we leverage guidance from human visual encoders.
In user studies, UniHuman is preferred by the users in an average of 77% of cases.
arXiv Detail & Related papers (2023-12-22T05:00:30Z)
- There Is a Digital Art History [1.0878040851637998]
We revisit Johanna Drucker's question, "Is there a digital art history?"
We focus our analysis on two main aspects that seem to suggest a coming paradigm shift towards a "digital" art history.
arXiv Detail & Related papers (2023-08-14T21:21:03Z)
- Learning to Evaluate the Artness of AI-generated Images [64.48229009396186]
ArtScore is a metric designed to evaluate the degree to which an image resembles authentic artworks by artists.
We employ pre-trained models for photo and artwork generation, resulting in a series of mixed models.
This dataset is then employed to train a neural network that learns to estimate quantized artness levels of arbitrary images.
arXiv Detail & Related papers (2023-05-08T17:58:27Z)
- Poses of People in Art: A Data Set for Human Pose Estimation in Digital Art History [0.6345523830122167]
We introduce the first openly licensed data set for estimating human poses in art.
The Poses of People in Art data set consists of 2,454 images from 22 art-historical depiction styles.
A total of 10,749 human figures are precisely enclosed by rectangular bounding boxes, with a maximum of four per image labeled by up to 17 keypoints.
arXiv Detail & Related papers (2023-01-12T16:23:58Z)
- CtlGAN: Few-shot Artistic Portraits Generation with Contrastive Transfer Learning [77.27821665339492]
CtlGAN is a new few-shot artistic portraits generation model with a novel contrastive transfer learning strategy.
We adapt a pretrained StyleGAN in the source domain to a target artistic domain with no more than 10 artistic faces.
We propose a new encoder which embeds real faces into Z+ space, and a dual-path training strategy to better cope with the adapted decoder.
arXiv Detail & Related papers (2022-03-16T13:28:17Z)
- Learning Dynamics via Graph Neural Networks for Human Pose Estimation and Tracking [98.91894395941766]
We propose a novel online approach to learning the pose dynamics, which are independent of pose detections in the current frame.
Specifically, we derive this prediction of dynamics through a graph neural network (GNN) that explicitly accounts for both spatial-temporal and visual information.
Experiments on PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed method achieves results superior to the state of the art on both human pose estimation and tracking tasks.
arXiv Detail & Related papers (2021-06-07T16:36:50Z)
- Enhancing Human Pose Estimation in Ancient Vase Paintings via Perceptually-grounded Style Transfer Learning [15.888271913164969]
We show how to adapt a dataset of natural images of known person and pose annotations to the style of Greek vase paintings by means of image style-transfer.
We show that using style-transfer learning significantly improves the SOTA performance on unlabelled data by more than 6% mean average precision (mAP) and mean average recall (mAR).
In a thorough ablation study, we give a targeted analysis of the influence of style intensities, revealing that the model learns generic domain styles.
arXiv Detail & Related papers (2020-12-10T12:08:03Z)
- Towards Accurate Human Pose Estimation in Videos of Crowded Scenes [134.60638597115872]
We focus on improving human pose estimation in videos of crowded scenes from the perspectives of exploiting temporal context and collecting new data.
For one frame, we forward the historical poses from the previous frames and backward the future poses from the subsequent frames to the current frame, leading to stable and accurate human pose estimation in videos.
In this way, our model achieves the best performance on 7 out of 13 videos and 56.33 average w_AP on the test dataset of the HIE challenge.
arXiv Detail & Related papers (2020-10-16T13:19:11Z)
- Demographic Influences on Contemporary Art with Unsupervised Style Embeddings [25.107166631583212]
contempArt is a collection of paintings and drawings, a detailed graph network based on social connections on Instagram and additional socio-demographic information.
We evaluate three methods suited for generating unsupervised style embeddings of images and correlate them with the remaining data.
arXiv Detail & Related papers (2020-09-30T10:13:18Z)