OBIFormer: A Fast Attentive Denoising Framework for Oracle Bone Inscriptions
- URL: http://arxiv.org/abs/2504.13524v1
- Date: Fri, 18 Apr 2025 07:24:35 GMT
- Title: OBIFormer: A Fast Attentive Denoising Framework for Oracle Bone Inscriptions
- Authors: Jinhao Li, Zijian Chen, Tingzhu Chen, Zhiji Liu, Changbo Wang
- Abstract summary: Oracle bone inscriptions (OBIs) are the earliest known form of Chinese characters and serve as a valuable resource for research in anthropology and archaeology. Previous methods either focus on pixel-level information or utilize vanilla transformers for glyph-based OBI denoising. This paper proposes a fast attentive denoising framework for oracle bone inscriptions, i.e., OBIFormer.
- Score: 7.657419462547438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Oracle bone inscriptions (OBIs) are the earliest known form of Chinese characters and serve as a valuable resource for research in anthropology and archaeology. However, most excavated fragments are severely degraded due to thousands of years of natural weathering, corrosion, and man-made destruction, making automatic OBI recognition extremely challenging. Previous methods either focus on pixel-level information or utilize vanilla transformers for glyph-based OBI denoising, which leads to tremendous computational overhead. Therefore, this paper proposes a fast attentive denoising framework for oracle bone inscriptions, i.e., OBIFormer. It leverages channel-wise self-attention, glyph extraction, and selective kernel feature fusion to reconstruct denoised images precisely while being computationally efficient. Our OBIFormer achieves state-of-the-art denoising performance for PSNR and SSIM metrics on synthetic and original OBI datasets. Furthermore, comprehensive experiments on a real oracle dataset demonstrate the great potential of our OBIFormer in assisting automatic OBI recognition. The code will be made available at https://github.com/LJHolyGround/OBIFormer.
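The abstract reports denoising quality in terms of PSNR and SSIM. As a rough illustration of the first metric only, the sketch below computes the standard PSNR between a reference image and a restored one. It is not taken from the OBIFormer code: the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions made for this example.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities (an assumption of
    this sketch; real pipelines typically use array libraries instead).
    """
    if len(reference) != len(restored) or any(
        len(ref_row) != len(res_row) for ref_row, res_row in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    pixel_count = sum(len(row) for row in reference)
    if pixel_count == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixels.
    squared_error = sum(
        (a - b) ** 2
        for ref_row, res_row in zip(reference, restored)
        for a, b in zip(ref_row, res_row)
    )
    mse = squared_error / pixel_count

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference. SSIM, the second metric named in the abstract, compares local structure rather than per-pixel error and is not covered by this sketch.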
Related papers
- Mitigating Long-tail Distribution in Oracle Bone Inscriptions: Dataset, Model, and Benchmark [36.21507457913964]
Oracle bone inscription (OBI) recognition plays a significant role in understanding the history and culture of ancient China. The existing OBI datasets suffer from a long-tail distribution problem, leading to biased performance of OBI recognition models across majority and minority classes. We present Oracle-P15K, a structure-aligned OBI dataset for OBI generation and denoising, consisting of 14,542 images infused with domain knowledge from OBI experts.
arXiv Detail & Related papers (2025-04-13T13:03:25Z)
- OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones? [40.226986425846825]
We introduce OBI-Bench, a holistic benchmark crafted to evaluate large multi-modal models (LMMs) on whole-process oracle bone inscriptions.
OBI-Bench includes 5,523 meticulously collected diverse-sourced images, covering five key domain problems: recognition, rejoining, classification, retrieval, and deciphering.
Unlike existing benchmarks, OBI-Bench focuses on advanced visual perception and reasoning with OBI-specific knowledge, challenging LMMs to perform tasks akin to those faced by experts.
arXiv Detail & Related papers (2024-12-02T06:31:28Z)
- Unsupervised Attention Regularization Based Domain Adaptation for Oracle Character Recognition [59.05212866862219]
The study of oracle characters plays an important role in Chinese archaeology and philology.
The difficulty of collecting and annotating real-world scanned oracle characters hinders the development of oracle character recognition.
We develop a novel unsupervised domain adaptation (UDA) method to transfer recognition knowledge from labeled handprinted oracle characters to unlabeled scanned data.
arXiv Detail & Related papers (2024-09-24T09:07:05Z)
- Oracle Bone Inscriptions Multi-modal Dataset [58.20314888996118]
Oracle bone inscriptions (OBI) form the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography.
This paper proposes an Oracle Bone Inscriptions Multi-modal dataset, which includes annotation information for 10,077 pieces of oracle bones.
This dataset can be used for a variety of AI-related research tasks relevant to the field of OBI, such as OBI Character Detection and Recognition, Rubbing Denoising, Character Matching, Character Generation, Reading Sequence Prediction, and Missing Characters Completion.
arXiv Detail & Related papers (2024-07-04T12:47:32Z)
- Deciphering Oracle Bone Language with Diffusion Models [70.69739681961558]
Oracle Bone Script (OBS) originated from China's Shang Dynasty approximately 3,000 years ago. This paper introduces a novel approach by adopting image generation techniques, specifically through the development of Oracle Bone Script Decipher (OBSD). OBSD generates vital clues for decipherment, charting a new course for AI-assisted analysis of ancient languages.
arXiv Detail & Related papers (2024-06-02T09:42:23Z)
- Oracle Character Recognition using Unsupervised Discriminative Consistency Network [65.64172835624206]
We propose a novel unsupervised domain adaptation method for oracle character recognition (OrCR).
We leverage pseudo-labeling to incorporate the semantic information into adaptation and constrain augmentation consistency.
Our approach achieves state-of-the-art results on the Oracle-241 dataset and substantially outperforms the recently proposed structure-texture separation network by 15.1%.
arXiv Detail & Related papers (2023-12-11T02:52:27Z)
- Unsupervised Structure-Texture Separation Network for Oracle Character Recognition [70.29024469395608]
Oracle bone script is the earliest-known Chinese writing system of the Shang dynasty and is precious to archeology and philology.
We propose a structure-texture separation network (STSN), which is an end-to-end learning framework for joint disentanglement, transformation, adaptation and recognition.
arXiv Detail & Related papers (2022-05-13T10:27:02Z)
- DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations [90.27736364704108]
We present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery.
DrugOOD comes with an open-source Python package that fully automates benchmarking processes.
We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction.
arXiv Detail & Related papers (2022-01-24T12:32:48Z)
- Recognition of Oracle Bone Inscriptions by using Two Deep Learning Models [0.0]
Oracle bone inscriptions (OBIs) contain some of the oldest characters in the world and were used in China about 3000 years ago.
This paper aims to design an online OBI recognition system to help preserve and organize this cultural heritage.
arXiv Detail & Related papers (2021-05-03T12:31:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.