Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings
- URL: http://arxiv.org/abs/2012.07412v3
- Date: Mon, 25 Jan 2021 11:18:37 GMT
- Title: Fork or Fail: Cycle-Consistent Training with Many-to-One Mappings
- Authors: Qipeng Guo, Zhijing Jin, Ziyu Wang, Xipeng Qiu, Weinan Zhang, Jun Zhu,
Zheng Zhang, David Wipf
- Abstract summary: Cycle-consistent training is widely used for learning a forward and inverse mapping between two domains of interest.
We develop a conditional variational autoencoder (CVAE) approach that can be viewed as converting surjective mappings to implicit bijections.
Our pipeline can capture such many-to-one mappings during cycle training while promoting graph-to-text diversity.
- Score: 67.11712279612583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cycle-consistent training is widely used for jointly learning a forward and
inverse mapping between two domains of interest without the cumbersome
requirement of collecting matched pairs within each domain. In this regard, the
implicit assumption is that there exists (at least approximately) a
ground-truth bijection such that a given input from either domain can be
accurately reconstructed from successive application of the respective
mappings. But in many applications no such bijection can be expected to exist
and large reconstruction errors can compromise the success of cycle-consistent
training. As one important instance of this limitation, we consider
practically-relevant situations where there exists a many-to-one or surjective
mapping between domains. To address this regime, we develop a conditional
variational autoencoder (CVAE) approach that can be viewed as converting
surjective mappings to implicit bijections whereby reconstruction errors in
both directions can be minimized, and as a natural byproduct, realistic output
diversity can be obtained in the one-to-many direction. As theoretical
motivation, we analyze a simplified scenario whereby minima of the proposed
CVAE-based energy function align with the recovery of ground-truth surjective
mappings. On the empirical side, we consider a synthetic image dataset with
known ground-truth, as well as a real-world application involving natural
language generation from knowledge graphs and vice versa, a prototypical
surjective case. For the latter, our CVAE pipeline can capture such many-to-one
mappings during cycle training while promoting textural diversity for
graph-to-text tasks. Our code is available at github.com/QipengGuo/CycleGT
*A condensed version of this paper has been accepted to AISTATS 2021. This
version contains additional content and updates.
Related papers
- RAGFormer: Learning Semantic Attributes and Topological Structure for Fraud Detection [8.050935113945428]
We present a novel framework called Relation-Aware GNN with transFormer(RAGFormer)
RAGFormer embeds both semantic and topological features into a target node.
The simple yet effective network consists of a semantic encoder, a topology encoder, and an attention fusion module.
arXiv Detail & Related papers (2024-02-27T12:53:15Z) - Auto-Linear Phenomenon in Subsurface Imaging [31.36376355719394]
Subsurface imaging involves solving full waveform inversion (FWI) to predict geophysical properties from measurements.
The usual approach is to train an encoder-decoder network using paired data from two domains: geophysical property and measurement.
This paper extends this direction by demonstrating that only linear mapping necessitates paired data, while both the encoder and decoder can be learned from their respective domains through self-supervised learning.
arXiv Detail & Related papers (2023-04-27T17:59:47Z) - Decoupled Multi-task Learning with Cyclical Self-Regulation for Face
Parsing [71.19528222206088]
We propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation for face parsing.
Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection.
Our method achieves the new state-of-the-art performance on the Helen, CelebA-HQ, and LapaMask datasets.
arXiv Detail & Related papers (2022-03-28T02:12:30Z) - Rethinking the Two-Stage Framework for Grounded Situation Recognition [61.93345308377144]
Grounded Situation Recognition is an essential step towards "human-like" event understanding.
Existing GSR methods resort to a two-stage framework: predicting the verb in the first stage and detecting the semantic roles in the second stage.
We propose a novel SituFormer for GSR which consists of a Coarse-to-Fine Verb Model (CFVM) and a Transformer-based Noun Model (TNM)
arXiv Detail & Related papers (2021-12-10T08:10:56Z) - A Domain Gap Aware Generative Adversarial Network for Multi-domain Image
Translation [22.47113158859034]
The paper proposes a unified model to translate images across multiple domains with significant domain gaps.
With a single unified generator, the model can maintain consistency over the global shapes as well as the local texture information across multiple domains.
arXiv Detail & Related papers (2021-10-21T00:33:06Z) - Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
Current convention is to approach this task with cycle-consistent GANs.
We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
arXiv Detail & Related papers (2020-06-23T19:52:23Z) - Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are aplenty, but annotating real data is laborious.
The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving.
The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z) - Grounded and Controllable Image Completion by Incorporating Lexical
Semantics [111.47374576372813]
Lexical Semantic Image Completion (LSIC) may have potential applications in art, design, and heritage conservation.
We advocate generating results faithful to both visual and lexical semantic context.
One major challenge for LSIC comes from modeling and aligning the structure of visual-semantic context.
arXiv Detail & Related papers (2020-02-29T16:54:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.