Related papers: A Cross-Font Image Retrieval Network for Recognizing Undeciphered Oracle Bone Inscriptions

URL: http://arxiv.org/abs/2409.06381v2
Date: Thu, 26 Dec 2024 02:32:19 GMT
Title: A Cross-Font Image Retrieval Network for Recognizing Undeciphered Oracle Bone Inscriptions
Authors: Zhicong Wu, Qifeng Su, Ke Gu, Xiaodong Shi
Abstract summary: Oracle Bone Inscription (OBI) is the earliest mature writing system in China. We propose a cross-font image retrieval network (CFIRN) to decipher OBI characters.
Score: 12.664292922995532
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Oracle Bone Inscription (OBI) is the earliest mature writing system in China, which represents a crucial stage in the development of hieroglyphs. Nevertheless, the substantial quantity of undeciphered OBI characters remains a significant challenge for scholars, while conventional methods of ancient script research are both time-consuming and labor-intensive. In this paper, we propose a cross-font image retrieval network (CFIRN) to decipher OBI characters by establishing associations between OBI characters and other script forms, simulating the interpretive behavior of paleography scholars. Concretely, our network employs a siamese framework to extract deep features from character images of various fonts, fully exploring structural clues at different resolutions with a multiscale feature integration (MFI) module and a multiscale refinement classifier (MRC). Extensive experiments on three challenging cross-font image retrieval datasets demonstrate that, given undeciphered OBI characters, our CFIRN can effectively achieve accurate matches with characters from other gallery fonts, thereby facilitating the deciphering.
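The retrieval setup described in the abstract can be illustrated with a toy sketch. This is not the CFIRN architecture: the learned siamese encoder, MFI module, and MRC are replaced here by a hand-written average-pooling encoder, and the names `pool`, `embed`, `cosine`, and `retrieve`, along with the 4x4 toy images, are assumptions made purely for illustration. The sketch only shows the general idea of embedding a query and every gallery image with one shared encoder, concatenating features from several resolutions, and ranking the gallery by similarity.

```python
import math


def pool(image, cells):
    """Average-pool a square image (list of rows) into a cells x cells grid."""
    step = len(image) // cells
    feats = []
    for r in range(cells):
        for c in range(cells):
            block = [image[y][x]
                     for y in range(r * step, (r + 1) * step)
                     for x in range(c * step, (c + 1) * step)]
            feats.append(sum(block) / len(block))
    return feats


def embed(image, scales=(1, 2, 4)):
    """Shared encoder (hypothetical stand-in for a learned network):
    concatenate pooled features computed at several resolutions."""
    vec = []
    for cells in scales:
        vec.extend(pool(image, cells))
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query, gallery, top_k=3):
    """Rank gallery entries by similarity to the query, best match first."""
    q = embed(query)
    scored = [(label, cosine(q, embed(img))) for label, img in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 4x4 "glyphs" standing in for gallery characters from another font.
gallery = {
    "vertical": [[0, 1, 0, 0]] * 4,
    "horizontal": [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
}
# A slightly distorted vertical stroke standing in for an undeciphered query.
query = [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0]]

for label, score in retrieve(query, gallery):
    print(label, round(score, 3))
```

In a real system the `embed` function would be a trained network whose weights are shared between the query branch and the gallery branch; the ranking step stays the same.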

Related papers

When Text-as-Vision Meets Semantic IDs in Generative Recommendation: An Empirical Study [48.67151986743594]
We revisit representation design for Semantic ID learning by treating text as a visual signal. We conduct a systematic empirical study of OCR-based text representations, obtained by rendering item descriptions into images. We find that OCR-text consistently matches or surpasses standard text embeddings for Semantic ID learning in both unimodal and multimodal settings.
arXiv Detail & Related papers (2026-01-21T06:18:57Z)
Interpretable Oracle Bone Script Decipherment through Radical and Pictographic Analysis with LVLMs [17.78374199471431]
We propose an interpretable Oracle Bone Script (OBS) decipherment method based on Large Vision-Language Models. We also propose the Pictographic Decipherment OBS dataset, which comprises 47,157 Chinese characters annotated with OBS images and pictographic analysis texts. Our approach achieves state-of-the-art Top-10 accuracy and superior zero-shot decipherment capabilities.
arXiv Detail & Related papers (2025-08-13T18:13:32Z)
OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography [58.790901822971094]
Oracle Bone Script (OBS) encapsulates the cultural records and intellectual expressions of ancient civilizations. Despite the discovery of approximately 4,500 OBS characters, only about 1,600 have been deciphered. This paper proposes a novel two-stage semantic framework, named OracleFusion.
arXiv Detail & Related papers (2025-06-26T08:56:07Z)
ImageScope: Unifying Language-Guided Image Retrieval via Large Multimodal Model Collective Reasoning [62.61187785810336]
ImageScope is a training-free, three-stage framework that unifies language-guided image retrieval tasks. In the first stage, we improve the robustness of the framework by synthesizing search intents across varying levels of semantic granularity. In the second and third stages, we reflect on retrieval results by verifying predicate propositions locally, and performing pairwise evaluations globally.
arXiv Detail & Related papers (2025-03-13T08:43:24Z)
Structured Analysis and Comparison of Alphabets in Historical Handwritten Ciphers [3.423211639513232]
We propose the CSI metric, a novel way of comparing pairs of ciphered documents. We assess their effectiveness in an unsupervised clustering scenario utilising visual features, including SIFT, pre-trained learnt embeddings, and OCR descriptors.
arXiv Detail & Related papers (2024-10-29T10:12:16Z)
Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval-enhancement can be extended to a broader spectrum of machine learning (ML). This work introduces a formal framework of this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature in various domains in ML with consistent notation, which is missing from the current literature. The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
Oracle Bone Inscriptions Multi-modal Dataset [58.20314888996118]
Oracle bone inscriptions (OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. This paper proposes an Oracle Bone Inscriptions Multi-modal dataset, which includes annotation information for 10,077 pieces of oracle bones. This dataset can be used for a variety of AI-related research tasks relevant to the field of OBI, such as OBI Character Detection and Recognition, Rubbing Denoising, Character Matching, Character Generation, Reading Sequence Prediction, Missing Characters Completion task and so on.
arXiv Detail & Related papers (2024-07-04T12:47:32Z)
Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions is one of the oldest existing forms of writing in the world. A large number of Oracle Bone Inscriptions (OBI) remain undeciphered, making it one of the global challenges in paleography today. This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
Deformation Robust Text Spotting with Geometric Prior [5.639053898266709]
We develop a robust text spotting method (DR TextSpotter) to solve the recognition problem of complex deformation of characters in different fonts. A graph convolution network is constructed to fuse the character features and landmark features, and then performs semantic reasoning to enhance the discrimination for different characters.
arXiv Detail & Related papers (2023-08-31T02:13:15Z)
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models [122.27878464009181]
We conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks. OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available.
arXiv Detail & Related papers (2023-05-13T11:28:37Z)
VGTS: Visually Guided Text Spotting for Novel Categories in Historical Manuscripts [26.09365732823049]
We propose a Visually Guided Text Spotting (VGTS) approach that accurately spots novel characters using just one annotated support sample. The DSA block aims to identify, focus on, and learn discriminative spatial regions in the support and query images, mimicking the human visual spotting process. To tackle the example imbalance problem in low-resource spotting tasks, we develop a novel torus loss function that enhances the discriminative power of the embedding space for distance metric learning.
arXiv Detail & Related papers (2023-04-03T06:40:52Z)
Unsupervised Clustering of Roman Potsherds via Variational Autoencoders [63.8376359764052]
We propose an artificial intelligence solution to support archaeologists in the classification task of Roman commonware potsherds. The partiality and handcrafted variance of the fragments make their matching a challenging problem. We propose to pair similar profiles via the unsupervised hierarchical clustering of non-linear features learned in the latent space of a deep convolutional Variational Autoencoder (VAE) network.
arXiv Detail & Related papers (2022-03-14T18:56:13Z)
Continuous Offline Handwriting Recognition using Deep Learning Models [0.0]
Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis. We have proposed a new recognition model based on integrating two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq). The new proposed model provides competitive results with those obtained with other well-established methodologies.
arXiv Detail & Related papers (2021-12-26T07:31:03Z)
HENet: Forcing a Network to Think More for Font Recognition [10.278412487287882]
This paper proposes a novel font recognizer with a pluggable module solving the font recognition task. The pluggable module hides the most discriminative accessible features and forces the network to consider other complicated features to solve the hard examples of similar fonts, called HE Block.
arXiv Detail & Related papers (2021-10-21T03:25:47Z)
Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction. Our approach enables us to massively scale up the number of character types we can effectively model. We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.