Synthetic-to-Real Unsupervised Domain Adaptation for Scene Text
Detection in the Wild
- URL: http://arxiv.org/abs/2009.01766v1
- Date: Thu, 3 Sep 2020 16:16:34 GMT
- Title: Synthetic-to-Real Unsupervised Domain Adaptation for Scene Text
Detection in the Wild
- Authors: Weijia Wu and Ning Lu and Enze Xie
- Abstract summary: We propose a synthetic-to-real domain adaptation method for scene text detection.
A text self-training (TST) method and adversarial text instance alignment (ATA) for domain adaptive scene text detection are introduced.
Results demonstrate the effectiveness of the proposed method with up to 10% improvement.
- Score: 11.045516338817132
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep learning-based scene text detection can achieve strong performance
when given sufficient labeled training data. However, manual labeling is
time-consuming and laborious. In the extreme case, the corresponding annotated data are
unavailable. Exploiting synthetic data is a very promising solution except for
domain distribution mismatches between synthetic datasets and real datasets. To
address the severe domain distribution mismatch, we propose a synthetic-to-real
domain adaptation method for scene text detection, which transfers knowledge
from synthetic data (source domain) to real data (target domain). In this
paper, a text self-training (TST) method and adversarial text instance
alignment (ATA) for domain adaptive scene text detection are introduced. ATA
helps the network learn domain-invariant features by training a domain
classifier in an adversarial manner. TST diminishes the adverse effects of
false positives (FPs) and false negatives (FNs) from inaccurate pseudo-labels.
Both components improve the performance of scene text
detectors when adapting from synthetic to real scenes. We evaluate the proposed
method by transferring from SynthText and VISD to ICDAR2015 and ICDAR2013. The
results demonstrate the effectiveness of the proposed method with up to 10%
improvement, a result of exploratory significance for domain adaptive
scene text detection. Code is available at
https://github.com/weijiawu/SyntoReal_STD
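As a rough illustration of the self-training idea mentioned in the abstract, the sketch below filters a detector's predictions on unlabeled target images by confidence to form pseudo-labels. It shows generic confidence-thresholded pseudo-labeling, not the paper's TST procedure. The `Detection` type, the `select_pseudo_labels` helper, and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """A predicted text box (x1, y1, x2, y2) with a confidence score in [0, 1]."""
    box: Tuple[float, float, float, float]
    score: float


def select_pseudo_labels(
    detections: List[Detection],
    keep_threshold: float = 0.9,    # assumed value, not from the paper
    ignore_threshold: float = 0.5,  # assumed value, not from the paper
) -> Tuple[List[Detection], List[Detection]]:
    """Split detections into confident pseudo-labels and an ignore set.

    Boxes scoring at or above keep_threshold become positive pseudo-labels.
    Boxes between the two thresholds are returned separately so a training
    loop can mask them out of the loss instead of treating them as
    background, which limits the damage from likely false negatives.
    Boxes below ignore_threshold are dropped as likely false positives.
    """
    if not 0.0 <= ignore_threshold <= keep_threshold <= 1.0:
        raise ValueError("expected 0 <= ignore_threshold <= keep_threshold <= 1")
    positives = [d for d in detections if d.score >= keep_threshold]
    ignored = [d for d in detections
               if ignore_threshold <= d.score < keep_threshold]
    return positives, ignored


if __name__ == "__main__":
    # Hypothetical detector outputs for one unlabeled target-domain image.
    preds = [
        Detection((10, 10, 60, 30), 0.97),
        Detection((80, 12, 140, 34), 0.62),
        Detection((5, 90, 40, 110), 0.21),
    ]
    positives, ignored = select_pseudo_labels(preds)
    print(len(positives), len(ignored))
```

Returning the uncertain boxes as a separate set, instead of discarding them, is one simple way to avoid training on regions where the pseudo-label is probably wrong in either direction.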
Related papers
- Toward Real Text Manipulation Detection: New Dataset and New Solution [58.557504531896704]
High costs associated with professional text manipulation limit the availability of real-world datasets.
We present the Real Text Manipulation dataset, encompassing 14,250 text images.
Our contributions aim to propel advancements in real-world text tampering detection.
arXiv Detail & Related papers (2023-12-12T02:10:16Z)
- Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors [54.80516786370663]
FreeReal is a real-domain-aligned pre-training paradigm that combines the complementary strengths of labeled synthetic data (LSD) and real data.
GlyphMix embeds synthetic images as graffiti-like units onto real images.
FreeReal consistently outperforms previous pre-training methods by a substantial margin across four public datasets.
arXiv Detail & Related papers (2023-12-08T15:10:55Z)
- ParGANDA: Making Synthetic Pedestrians A Reality For Object Detection [2.7648976108201815]
We propose to use a Generative Adversarial Network (GAN) to close the gap between the real and synthetic data.
Our approach not only produces visually plausible samples but also does not require any labels of the real domain.
arXiv Detail & Related papers (2023-07-21T05:26:32Z)
- Unsupervised Domain Adaptation for Sparse Retrieval by Filling
Vocabulary and Word Frequency Gaps [12.573927420408365]
IR models using a pretrained language model significantly outperform lexical approaches like BM25.
This paper proposes an unsupervised domain adaptation method by filling vocabulary and word-frequency gaps.
We show that our method outperforms the present state-of-the-art domain adaptation method.
arXiv Detail & Related papers (2022-11-08T03:58:26Z)
- UNITS: Unsupervised Intermediate Training Stage for Scene Text Detection [16.925048424113463]
We propose a new training paradigm for scene text detection, which introduces an UNsupervised Intermediate Training Stage (UNITS).
UNITS builds a buffer path to real-world data and can alleviate the gap between the pre-training stage and fine-tuning stage.
Three training strategies are further explored to perceive information from real-world data in an unsupervised way.
arXiv Detail & Related papers (2022-05-10T05:34:58Z)
- Unsupervised Domain Adaptive Salient Object Detection Through
Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally has higher pixel-labeling quality without the effort of manual annotations.
We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
- Content Disentanglement for Semantically Consistent
Synthetic-to-Real Domain Adaptation in Urban Traffic Scenes [39.38387505091648]
Synthetic data generation is an appealing approach to generate novel traffic scenarios in autonomous driving.
Deep learning techniques trained solely on synthetic data encounter dramatic performance drops when they are tested on real data.
We propose a new, unsupervised, end-to-end domain adaptation network architecture that enables semantically consistent domain adaptation between synthetic and real data.
arXiv Detail & Related papers (2021-05-18T17:42:26Z)
- Contrastive Learning and Self-Training for Unsupervised Domain
Adaptation in Semantic Segmentation [71.77083272602525]
UDA attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain.
We propose a contrastive learning approach that adapts category-wise centroids across domains.
We extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels.
arXiv Detail & Related papers (2021-05-05T11:55:53Z)
- A Free Lunch for Unsupervised Domain Adaptive Object Detection without
Source Data [69.091485888121]
Unsupervised domain adaptation assumes that source and target domain data are freely available and usually trained together to reduce the domain gap.
We propose a source data-free domain adaptive object detection (SFOD) framework via modeling it into a problem of learning with noisy labels.
arXiv Detail & Related papers (2020-12-10T01:42:35Z)
- Text Recognition -- Real World Data and Where to Find Them [36.10220484561196]
We present a method for exploiting weakly annotated images to improve text extraction pipelines.
The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions.
It produces nearly error-free, localised instances of scene text, which we treat as "pseudo ground truth" (PGT).
arXiv Detail & Related papers (2020-07-06T22:23:27Z)
- Text Recognition in Real Scenarios with a Few Labeled Samples [55.07859517380136]
Scene text recognition (STR) is still a hot research topic in the computer vision field.
This paper proposes a few-shot adversarial sequence domain adaptation (FASDA) approach to build sequence adaptation.
Our approach can maximize the character-level confusion between the source domain and the target domain.
arXiv Detail & Related papers (2020-06-22T13:03:01Z)