Multi-Domain Multi-Definition Landmark Localization for Small Datasets
- URL: http://arxiv.org/abs/2203.10358v1
- Date: Sat, 19 Mar 2022 17:09:29 GMT
- Title: Multi-Domain Multi-Definition Landmark Localization for Small Datasets
- Authors: David Ferman and Gaurav Bharaj
- Abstract summary: We present a novel method for multi-image-domain and multi-landmark-definition learning for small-dataset facial landmark localization.
We propose a Vision Transformer encoder paired with a novel decoder that uses a definition-agnostic, shared landmark semantic-group structured prior.
We show state-of-the-art performance on several small datasets from varied image domains, covering animals, caricatures, and facial portrait paintings.
- Score: 1.2691047660244332
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present a novel method for multi-image-domain and
multi-landmark-definition learning for small-dataset facial landmark
localization. Training on a small dataset alongside a large(r) dataset helps
with robust learning for the former, and provides a universal mechanism for
facial landmark localization for new and/or smaller standard datasets. To this
end, we propose a Vision Transformer encoder paired with a novel decoder that
uses a definition-agnostic, shared landmark semantic-group structured prior,
which is learnt as we train on more than one dataset concurrently. Because of
this definition-agnostic group prior, the datasets may vary in both landmark
definitions and image domains. In the decoder we use cross- and self-attention,
whose output is fed into domain/definition-specific heads trained with a
Laplacian log-likelihood loss. We achieve state-of-the-art performance on
standard landmark localization datasets such as COFW and WFLW when trained
alongside a larger dataset. We also show state-of-the-art performance on
several small datasets from varied image domains, covering animals,
caricatures, and facial portrait paintings. Further, we contribute a small
dataset (150 images) of pareidolias to show the efficacy of our method.
Finally, we provide several analyses and ablation studies to justify our
claims.
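The abstract describes a shared encoder/decoder with a group-structured landmark query prior and per-definition heads trained with a Laplacian log-likelihood loss. A minimal PyTorch-style sketch of that idea follows; module names, dimensions, and the exact head design are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the pipeline described in the abstract: shared ViT image tokens,
# a decoder over learnable landmark-group queries (self- + cross-attention), and
# dataset/definition-specific heads trained with a Laplacian log-likelihood loss.
# All names and shapes are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GroupedLandmarkDecoder(nn.Module):
    """Self-attention over shared, definition-agnostic group queries,
    then cross-attention into the ViT encoder's image tokens."""

    def __init__(self, dim: int = 256, heads: int = 8, num_groups: int = 16):
        super().__init__()
        self.group_queries = nn.Parameter(torch.randn(num_groups, dim) * 0.02)  # shared prior
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, image_tokens: torch.Tensor) -> torch.Tensor:
        b = image_tokens.size(0)
        q = self.group_queries.unsqueeze(0).expand(b, -1, -1)                  # (B, G, D)
        q = self.norm1(q + self.self_attn(q, q, q)[0])                         # self-attention
        q = self.norm2(q + self.cross_attn(q, image_tokens, image_tokens)[0])  # cross-attention
        return q


class PerDefinitionHead(nn.Module):
    """Maps shared group features to per-landmark (x, y) means and Laplace scales
    for one dataset-specific landmark definition (e.g. 68 vs. 98 points)."""

    def __init__(self, dim: int, num_groups: int, num_landmarks: int):
        super().__init__()
        self.num_landmarks = num_landmarks
        self.proj = nn.Linear(dim * num_groups, num_landmarks * 4)

    def forward(self, group_feats: torch.Tensor):
        out = self.proj(group_feats.flatten(1)).view(-1, self.num_landmarks, 4)
        mu = out[..., :2]                         # predicted coordinates
        scale = F.softplus(out[..., 2:]) + 1e-4   # positive Laplace scale b
        return mu, scale


def laplacian_nll(mu, scale, target):
    """Negative Laplacian log-likelihood: |y - mu| / b + log(2b), averaged over landmarks."""
    return (torch.abs(target - mu) / scale + torch.log(2.0 * scale)).mean()
```

In training, each batch would pass through the shared encoder and decoder and only the head (and loss) matching that batch's dataset definition, which is what allows datasets with different landmark definitions to be trained concurrently.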
Related papers
- Is in-domain data beneficial in transfer learning for landmarks detection in x-ray images? [1.5348047288817481]
We study whether the use of small-scale in-domain x-ray image datasets provides any improvement for landmark detection over models pre-trained only on large natural image datasets.
Our results show that using in-domain source datasets brings marginal or no benefit relative to out-of-domain ImageNet pre-training.
Our findings can guide the development of robust landmark detection systems for medical images when no large annotated dataset is available.
arXiv Detail & Related papers (2024-03-03T10:35:00Z)
- Towards Multi-domain Face Landmark Detection with Synthetic Data from Diffusion model [27.307563102526192]
Deep learning-based facial landmark detection for in-the-wild faces has achieved significant improvement.
There are still challenges in face landmark detection in other domains (e.g., cartoon, caricature, etc.).
We design a two-stage training approach that effectively leverages limited datasets and the pre-trained diffusion model.
Our results demonstrate that our method outperforms existing methods on multi-domain face landmark detection.
arXiv Detail & Related papers (2024-01-24T02:35:32Z)
- infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-information [68.76707843019886]
infoVerse is a universal framework for dataset characterization.
infoVerse captures multidimensional characteristics of datasets by incorporating various model-driven meta-information.
In three real-world applications (data pruning, active learning, and data annotation), the samples chosen on infoVerse space consistently outperform strong baselines.
arXiv Detail & Related papers (2023-05-30T18:12:48Z)
- Unsupervised Domain Adaptation for Medical Image Segmentation via Feature-space Density Matching [0.0]
This paper presents an unsupervised domain adaptation approach for semantic segmentation.
We match the target data distribution to the source in the feature space, particularly when the number of target samples is limited.
We demonstrate the efficacy of our proposed approach on two datasets: multi-site prostate MRI and histopathology images.
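One generic way to realize the feature-space distribution matching described above is a kernel maximum mean discrepancy (MMD) penalty between source and target feature batches; the sketch below illustrates the idea only and is not this paper's specific density-matching objective.

```python
# Generic feature-distribution matching via an RBF-kernel MMD penalty; an illustration
# of the idea, not this paper's exact formulation.
import torch


def rbf_kernel(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Pairwise squared Euclidean distances mapped through a Gaussian (RBF) kernel.
    d2 = torch.cdist(x, y).pow(2)
    return torch.exp(-d2 / (2.0 * sigma ** 2))


def mmd_loss(source_feats: torch.Tensor, target_feats: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased MMD^2 estimate between two (N, D) batches of encoder features."""
    k_ss = rbf_kernel(source_feats, source_feats, sigma).mean()
    k_tt = rbf_kernel(target_feats, target_feats, sigma).mean()
    k_st = rbf_kernel(source_feats, target_feats, sigma).mean()
    return k_ss + k_tt - 2.0 * k_st


# Usage idea: total_loss = supervised_seg_loss + lambda_da * mmd_loss(f_src, f_tgt),
# where f_src / f_tgt are encoder features from labeled source and unlabeled target batches.
```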
arXiv Detail & Related papers (2023-05-09T22:24:46Z)
- Unsupervised Domain Adaptation with Histogram-gated Image Translation for Delayered IC Image Analysis [2.720699926154399]
Histogram-gated Image Translation (HGIT) is an unsupervised domain adaptation framework which transforms images from a given source dataset to the domain of a target dataset.
Our method achieves the best performance compared to the reported domain adaptation techniques, and is also reasonably close to the fully supervised benchmark.
arXiv Detail & Related papers (2022-09-27T15:53:22Z)
- Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only for the source dataset and are unavailable for the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
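The segment-permutation idea can be illustrated with a generic "predict the permutation" pretext task: chop a skeleton sequence into temporal segments, shuffle them with a randomly chosen permutation, and classify which permutation was applied. The sketch below is an illustration under these assumptions, not the paper's recipe.

```python
# Generic temporal "predict the permutation" pretext task for skeleton sequences;
# illustrative only, not the paper's implementation.
import itertools
import random
import torch

NUM_SEGMENTS = 3
PERMUTATIONS = list(itertools.permutations(range(NUM_SEGMENTS)))   # 3! = 6 permutation classes


def permute_segments(seq: torch.Tensor):
    """seq: (T, J, C) skeleton sequence (frames, joints, channels).
    Returns the shuffled sequence and the index of the permutation used as a label."""
    segments = list(torch.chunk(seq, NUM_SEGMENTS, dim=0))
    label = random.randrange(len(PERMUTATIONS))
    shuffled = torch.cat([segments[i] for i in PERMUTATIONS[label]], dim=0)
    return shuffled, label


# Training idea: feed `shuffled` through the shared skeleton encoder plus a small
# classification head and minimize cross-entropy against `label`; no action labels are
# required, so the task can run on both source and target datasets to reduce domain shift.
```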
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
- Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets.
This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets.
To better model the relationship among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing.
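A pixel-to-prototype contrastive loss can be written generically as an InfoNCE-style objective that pulls each sampled pixel embedding toward its class prototype and away from the other prototypes; the simplified sketch below is an illustration, not the paper's exact loss or its cross-dataset mixing.

```python
# Simplified pixel-to-prototype contrastive loss (InfoNCE-style); illustrative only.
import torch
import torch.nn.functional as F


def pixel_to_prototype_loss(pixel_embeds: torch.Tensor,
                            prototypes: torch.Tensor,
                            pixel_labels: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """pixel_embeds: (N, D) pixel features sampled across the pooled datasets.
    prototypes:   (C, D) one learnable (or EMA-updated) prototype per class.
    pixel_labels: (N,)  class index of each sampled pixel."""
    pixel_embeds = F.normalize(pixel_embeds, dim=1)
    prototypes = F.normalize(prototypes, dim=1)
    logits = pixel_embeds @ prototypes.t() / temperature    # similarity to every prototype
    return F.cross_entropy(logits, pixel_labels)            # positive = own-class prototype
```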
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Domain Adaptation on Semantic Segmentation for Aerial Images [3.946367634483361]
We propose a novel unsupervised domain adaptation framework to address domain shift in semantic image segmentation.
We also apply entropy minimization on the target domain to produce high-confidence predictions.
We show improvement over state-of-the-art methods in terms of various metrics.
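Entropy minimization on unlabeled target predictions is a standard ingredient; a minimal sketch of the per-pixel entropy term (generic, not this paper's exact weighting or schedule) is given below.

```python
# Per-pixel prediction-entropy loss on unlabeled target-domain images; generic sketch.
import torch
import torch.nn.functional as F


def entropy_loss(logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """logits: (B, C, H, W) segmentation outputs on target images.
    Returns the mean per-pixel Shannon entropy; minimizing it pushes predictions
    toward high confidence on the unlabeled target domain."""
    probs = F.softmax(logits, dim=1)
    ent = -(probs * torch.log(probs + eps)).sum(dim=1)      # (B, H, W)
    return ent.mean()


# Usage idea: total_loss = supervised_ce(source_logits, source_labels)
#             + lambda_ent * entropy_loss(target_logits)
```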
arXiv Detail & Related papers (2020-12-03T20:58:27Z)
- DoFE: Domain-oriented Feature Embedding for Generalizable Fundus Image Segmentation on Unseen Datasets [96.92018649136217]
We present a novel Domain-oriented Feature Embedding (DoFE) framework to improve the generalization ability of CNNs on unseen target domains.
Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains.
Our framework generates satisfying segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
arXiv Detail & Related papers (2020-10-13T07:28:39Z)
- Spatial Attention Pyramid Network for Unsupervised Domain Adaptation [66.75008386980869]
Unsupervised domain adaptation is critical in various computer vision tasks.
We design a new spatial attention pyramid network for unsupervised domain adaptation.
Our method performs favorably against the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-29T09:03:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.