Assembling Semantically-Disentangled Representations for
Predictive-Generative Models via Adaptation from Synthetic Domain
- URL: http://arxiv.org/abs/2002.09818v1
- Date: Sun, 23 Feb 2020 03:35:45 GMT
- Title: Assembling Semantically-Disentangled Representations for
Predictive-Generative Models via Adaptation from Synthetic Domain
- Authors: Burkay Donderici, Caleb New, Chenliang Xu
- Abstract summary: We show that semantically-aligned representations can be generated with the help of a physics based engine.
It is shown that the proposed (SYNTH-VAE-GAN) method can construct a conditional-generative model of human face attributes without relying on real data labels.
- Score: 32.42156485883356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks can form high-level hierarchical representations of
input data. Various researchers have demonstrated that these representations
can be used to enable a variety of useful applications. However, such
representations are typically based on the statistics within the data, and may
not conform with the semantic representation that may be necessitated by the
application. Conditional models are typically used to overcome this challenge,
but they require large annotated datasets which are difficult to come by and
costly to create. In this paper, we show that semantically-aligned
representations can be generated instead with the help of a physics based
engine. This is accomplished by creating a synthetic dataset with decoupled
attributes, learning an encoder for the synthetic dataset, and augmenting
prescribed attributes from the synthetic domain with attributes from the real
domain. It is shown that the proposed (SYNTH-VAE-GAN) method can construct a
conditional predictive-generative model of human face attributes without
relying on real data labels.
Related papers
- Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis [0.0]
This paper introduces a novel approach that leverages three generative models of varying complexity to synthesize Malicious Network Traffic.
Our approach transforms numerical data into text, re-framing data generation as a language modeling task.
Our method surpasses state-of-the-art generative models in producing high-fidelity synthetic data.
arXiv Detail & Related papers (2024-11-04T09:51:10Z) - Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition [0.2775636978045794]
We study the drift between the performance of models trained on real and synthetic datasets.
We conduct studies on the differences between real and synthetic datasets on the attribute set.
Interestingly enough, we have verified that while real samples suffice to explain the synthetic distribution, the opposite could not be further from being true.
arXiv Detail & Related papers (2024-04-23T17:10:49Z) - Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to only perform well on similar data, while underperforming on real world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z) - CasTGAN: Cascaded Generative Adversarial Network for Realistic Tabular
Data Synthesis [0.4999814847776097]
Generative adversarial networks (GANs) have drawn considerable attention in recent years for their proven capability in generating synthetic data.
The validity of the synthetic data and the underlying privacy concerns represent major challenges which are not sufficiently addressed.
arXiv Detail & Related papers (2023-07-01T16:52:18Z) - Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2)
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z) - Rule-adhering synthetic data -- the lingua franca of learning [0.0]
In this work we explore approaches of incorporating domain expertise into the data synthesis.
The resulting synthetic data generator can be probed for any number of new samples.
We demonstrate the concept for a publicly available data set, and evaluate its benefits via descriptive analysis as well as a downstream ML model.
arXiv Detail & Related papers (2022-09-12T20:01:13Z) - Label-Free Model Evaluation with Semi-Structured Dataset Representations [78.54590197704088]
Label-free model evaluation, or AutoEval, estimates model accuracy on unlabeled test sets.
In the absence of image labels, based on dataset representations, we estimate model performance for AutoEval with regression.
We propose a new semi-structured dataset representation that is manageable for regression learning while containing rich information for AutoEval.
arXiv Detail & Related papers (2021-12-01T18:15:58Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP)
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Partially Conditioned Generative Adversarial Networks [75.08725392017698]
Generative Adversarial Networks (GANs) let one synthesise artificial datasets by implicitly modelling the underlying probability distribution of a real-world training dataset.
With the introduction of Conditional GANs and their variants, these methods were extended to generating samples conditioned on ancillary information available for each sample within the dataset.
In this work, we argue that standard Conditional GANs are not suitable for such a task and propose a new Adversarial Network architecture and training strategy.
arXiv Detail & Related papers (2020-07-06T15:59:28Z) - Interpretable Entity Representations through Large-Scale Typing [61.4277527871572]
We present an approach to creating entity representations that are human readable and achieve high performance out of the box.
Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types.
We show that it is possible to reduce the size of our type set in a learning-based way for particular domains.
arXiv Detail & Related papers (2020-04-30T23:58:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.