Related papers: Unsupervised Structure-Texture Separation Network for Oracle Character Recognition

URL: http://arxiv.org/abs/2205.06549v1
Date: Fri, 13 May 2022 10:27:02 GMT
Title: Unsupervised Structure-Texture Separation Network for Oracle Character Recognition
Authors: Mei Wang, Weihong Deng, Cheng-Lin Liu
Abstract summary: Oracle bone script is the earliest-known Chinese writing system of the Shang dynasty and is precious to archeology and philology. We propose a structure-texture separation network (STSN), which is an end-to-end learning framework for joint disentanglement, transformation, adaptation and recognition.
Score: 70.29024469395608
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Oracle bone script is the earliest-known Chinese writing system of the Shang dynasty and is precious to archeology and philology. However, real-world scanned oracle data are rare and few experts are available for annotation, which makes the automatic recognition of scanned oracle characters a challenging task. Therefore, we aim to explore unsupervised domain adaptation to transfer knowledge from handprinted oracle data, which are easy to acquire, to the scanned domain. We propose a structure-texture separation network (STSN), which is an end-to-end learning framework for joint disentanglement, transformation, adaptation and recognition. First, STSN disentangles features into structure (glyph) and texture (noise) components by generative models, and then aligns handprinted and scanned data in structure feature space such that the negative influence caused by severe noise can be avoided when adapting. Second, transformation is achieved via swapping the learned textures across domains, and a classifier for final classification is trained to predict the labels of the transformed scanned characters. This not only guarantees the absolute separation, but also enhances the discriminative ability of the learned features. Extensive experiments on the Oracle-241 dataset show that STSN outperforms other adaptation methods and successfully improves recognition performance on scanned data even when they are contaminated by long burial and careless excavation.
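
Illustrative sketch (not from the paper): the texture-swapping idea in the abstract can be pictured with a toy example in which each character is already encoded as a flat feature vector whose first entries are the "structure" code and whose remaining entries are the "texture" code. The function names, the fixed split point `structure_dim`, and the mean-distance gap below are all assumptions made for illustration; STSN itself learns the separation with generative models and trains the alignment end-to-end, and this sketch reproduces none of that.

```python
# Toy illustration only: codes are plain Python lists, and the split between
# structure and texture is a fixed index (an assumption, not the paper's method).

def split_code(code, structure_dim):
    """Split one feature vector into (structure, texture) parts."""
    return code[:structure_dim], code[structure_dim:]

def swap_textures(handprinted_code, scanned_code, structure_dim):
    """Pair each domain's structure code with the other domain's texture code."""
    hp_structure, hp_texture = split_code(handprinted_code, structure_dim)
    sc_structure, sc_texture = split_code(scanned_code, structure_dim)
    return hp_structure + sc_texture, sc_structure + hp_texture

def structure_gap(hp_codes, sc_codes, structure_dim):
    """Squared distance between the two domains' mean structure codes.

    A hypothetical stand-in for an alignment measure, not the paper's loss.
    """
    def mean_structure(codes):
        n = len(codes)
        return [sum(c[i] for c in codes) / n for i in range(structure_dim)]

    hp_mean = mean_structure(hp_codes)
    sc_mean = mean_structure(sc_codes)
    return sum((a - b) ** 2 for a, b in zip(hp_mean, sc_mean))

# Clean handprinted glyph (zero texture) and a noisy scanned glyph.
handprinted = [1.0, 2.0, 0.0, 0.0]
scanned = [1.5, 2.5, 0.9, 0.7]

hp_as_scanned, scanned_as_hp = swap_textures(handprinted, scanned, structure_dim=2)
print(hp_as_scanned)   # handprinted structure with scanned texture
print(scanned_as_hp)   # scanned structure with handprinted texture
print(structure_gap([handprinted], [scanned], structure_dim=2))
```

In this toy setting the swap leaves each structure code untouched, which is the property the abstract relies on: alignment and classification operate on glyph structure while texture (noise) is exchanged between domains.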

Related papers

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography [58.790901822971094]
Oracle Bone Script (OBS) encapsulates the cultural records and intellectual expressions of ancient civilizations. Despite the discovery of approximately 4,500 OBS characters, only about 1,600 have been deciphered. This paper proposes a novel two-stage semantic framework, named OracleFusion.
arXiv Detail & Related papers (2025-06-26T08:56:07Z)
A Transformer Based Handwriting Recognition System Jointly Using Online and Offline Features [8.419663258260671]
We introduce an end-to-end network that performs early fusion of offline images and online stroke data. Our approach achieves state-of-the-art accuracy, exceeding previous bests by up to 1%.
arXiv Detail & Related papers (2025-06-25T08:58:47Z)
OBIFormer: A Fast Attentive Denoising Framework for Oracle Bone Inscriptions [7.657419462547438]
Oracle bone inscriptions (OBIs) are the earliest known form of Chinese characters and serve as a valuable resource for research in anthropology and archaeology. Previous methods either focus on pixel-level information or utilize vanilla transformers for glyph-based OBI denoising. This paper proposes a fast attentive denoising framework for oracle bone inscriptions, i.e., OBIFormer.
arXiv Detail & Related papers (2025-04-18T07:24:35Z)
Dual-branch Graph Feature Learning for NLOS Imaging [51.31554007495926]
Non-line-of-sight (NLOS) imaging offers the capability to reveal occluded scenes that are not directly visible. xnet methodology integrates an albedo-focused reconstruction branch dedicated to albedo information recovery and a depth-focused reconstruction branch that extracts geometrical structure. Our method attains the highest level of performance among existing methods across synthetic and real data.
arXiv Detail & Related papers (2025-02-27T01:49:00Z)
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing [71.29488677105127]
Existing scene text recognition (STR) methods struggle to recognize challenging texts, especially for artistic and severely distorted characters. We propose a contrastive learning-based STR framework by leveraging synthetic and real unlabeled data without any human cost. Our method achieves SOTA performance (94.7% and 70.9% average accuracy on common benchmarks and Union14M-Benchmark).
arXiv Detail & Related papers (2024-11-23T15:24:47Z)
Unsupervised Attention Regularization Based Domain Adaptation for Oracle Character Recognition [59.05212866862219]
The study of oracle characters plays an important role in Chinese archaeology and philology. The difficulty of collecting and annotating real-world scanned oracle characters hinders the development of oracle character recognition. We develop a novel unsupervised domain adaptation (UDA) method to transfer recognition knowledge from labeled handprinted oracle characters to unlabeled scanned data.
arXiv Detail & Related papers (2024-09-24T09:07:05Z)
Oracle Character Recognition using Unsupervised Discriminative Consistency Network [65.64172835624206]
We propose a novel unsupervised domain adaptation method for oracle character recognition (OrCR). We leverage pseudo-labeling to incorporate the semantic information into adaptation and constrain augmentation consistency. Our approach achieves a state-of-the-art result on the Oracle-241 dataset and substantially outperforms the recently proposed structure-texture separation network by 15.1%.
arXiv Detail & Related papers (2023-12-11T02:52:27Z)
Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents [0.0]
A modified UNet structure using a Swin Transformer backbone is presented to remove typical artifacts in scanned documents. An improvement in text extraction quality with a reduced error rate of up to 53.9% on the synthetic data is achieved.
arXiv Detail & Related papers (2023-06-05T12:12:23Z)
UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection [52.91782218300844]
We propose a novel Unsupervised Inconsistency-Aware method based on Vision Transformer, called UIA-ViT. Due to the self-attention mechanism, the attention map among patch embeddings naturally represents the consistency relation, making the vision Transformer suitable for the consistency representation learning.
arXiv Detail & Related papers (2022-10-23T15:24:47Z)
DFCANet: Dense Feature Calibration-Attention Guided Network for Cross Domain Iris Presentation Attack Detection [2.95102708174421]
Iris presentation attack detection (IPAD) is essential for securing personal identity. Existing IPAD algorithms do not generalize well to unseen and cross-domain scenarios. This paper proposes DFCANet: Dense Feature and Attention Guided Network.
arXiv Detail & Related papers (2021-11-01T13:04:23Z)
SPIN: Structure-Preserving Inner Offset Network for Scene Text Recognition [48.676064155070556]
Arbitrary text appearance poses a great challenge in scene text recognition tasks. We introduce a new learnable geometric-unrelated module, the Structure-Preserving Inner Offset Network (SPIN). SPIN allows the color manipulation of source data within the network.
arXiv Detail & Related papers (2020-05-27T01:47:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.