A Cross-Font Image Retrieval Network for Recognizing Undeciphered Oracle Bone Inscriptions
- URL: http://arxiv.org/abs/2409.06381v1
- Date: Tue, 10 Sep 2024 10:04:58 GMT
- Title: A Cross-Font Image Retrieval Network for Recognizing Undeciphered Oracle Bone Inscriptions
- Authors: Zhicong Wu, Qifeng Su, Ke Gu, Xiaodong Shi
- Abstract summary: Oracle Bone Inscription (OBI) is the earliest mature writing system known in China to date.
We propose a cross-font image retrieval network (CFIRN) to decipher OBI characters.
- Score: 12.664292922995532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Oracle Bone Inscription (OBI) is the earliest mature writing system known in China to date and represents a crucial stage in the development of hieroglyphs. Nevertheless, the large number of undeciphered OBI characters remains a persistent challenge for scholars, and conventional methods of ancient script research are both time-consuming and labor-intensive. In this paper, we propose a cross-font image retrieval network (CFIRN) that deciphers OBI characters by establishing associations between OBI characters and other script forms, simulating the interpretive behavior of paleography scholars. Concretely, our network employs a siamese framework to extract deep features from character images of various fonts, exploiting structural clues at different resolutions through a multiscale feature integration (MFI) module and a multiscale refinement classifier (MRC). Extensive experiments on three challenging cross-font image retrieval datasets demonstrate that, given undeciphered OBI characters, our CFIRN can accurately match them with characters from other gallery fonts.
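The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes that a shared-weight (siamese) encoder has already mapped each character image to a feature vector; the vectors below are hand-written toy values, the names `rank_gallery`, `glyph_a`, `glyph_b`, and `glyph_c` are hypothetical, and cosine similarity is an assumed choice of matching score that the abstract does not confirm. The MFI module and MRC are not modeled here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def rank_gallery(query_feature, gallery_features):
    """Return (label, score) pairs sorted from best to worst match.

    gallery_features maps a gallery character label to its feature vector.
    In a siamese setup the query and the gallery vectors would come from
    the same encoder weights; here they are supplied directly (assumption).
    """
    scored = [
        (label, cosine_similarity(query_feature, feature))
        for label, feature in gallery_features.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy features standing in for encoder outputs (hypothetical values).
query = [0.9, 0.1, 0.0]  # an undeciphered OBI character
gallery = {
    "glyph_a": [1.0, 0.0, 0.0],  # candidates from another font
    "glyph_b": [0.0, 1.0, 0.0],
    "glyph_c": [0.5, 0.5, 0.7],
}

ranking = rank_gallery(query, gallery)
for label, score in ranking:
    print(f"{label}: {score:.3f}")
```

The top-ranked gallery label is the proposed match for the query character; in practice one would inspect the first few candidates rather than only the best one.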
Related papers
- Retrieval-Enhanced Machine Learning: Synthesis and Opportunities [60.34182805429511]
Retrieval enhancement can be extended to a broader spectrum of machine learning (ML).
This work introduces a formal framework for this paradigm, Retrieval-Enhanced Machine Learning (REML), by synthesizing the literature across various ML domains with consistent notation, which the current literature lacks.
The goal of this work is to equip researchers across various disciplines with a comprehensive, formally structured framework of retrieval-enhanced models, thereby fostering interdisciplinary future research.
arXiv Detail & Related papers (2024-07-17T20:01:21Z)
- Oracle Bone Inscriptions Multi-modal Dataset [58.20314888996118]
Oracle bone inscriptions (OBI) are the earliest developed writing system in China, bearing invaluable written records of early Shang history and paleography.
This paper proposes an Oracle Bone Inscriptions Multi-modal dataset, which includes annotation information for 10,077 pieces of oracle bones.
This dataset can be used for a variety of AI-related research tasks in the field of OBI, such as OBI Character Detection and Recognition, Rubbing Denoising, Character Matching, Character Generation, Reading Sequence Prediction, and Missing Characters Completion.
arXiv Detail & Related papers (2024-07-04T12:47:32Z)
- Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions (OBI) are among the oldest existing forms of writing in the world.
A large number of OBI characters remain undeciphered, making their decipherment one of the global challenges in paleography today.
This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
- OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models [122.27878464009181]
We conducted a comprehensive evaluation of Large Multimodal Models, such as GPT4V and Gemini, in various text-related visual tasks.
OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available.
arXiv Detail & Related papers (2023-05-13T11:28:37Z)
- VGTS: Visually Guided Text Spotting for Novel Categories in Historical Manuscripts [26.09365732823049]
We propose a Visually Guided Text Spotting (VGTS) approach that accurately spots novel characters using just one annotated support sample.
The DSA block aims to identify, focus on, and learn discriminative spatial regions in the support and query images, mimicking the human visual spotting process.
To tackle the example imbalance problem in low-resource spotting tasks, we develop a novel torus loss function that enhances the discriminative power of the embedding space for distance metric learning.
arXiv Detail & Related papers (2023-04-03T06:40:52Z)
- Unsupervised Clustering of Roman Potsherds via Variational Autoencoders [63.8376359764052]
We propose an artificial intelligence solution to support archaeologists in the classification task of Roman commonware potsherds.
The partiality and handcrafted variance of the fragments make their matching a challenging problem.
We propose to pair similar profiles via the unsupervised hierarchical clustering of non-linear features learned in the latent space of a deep convolutional Variational Autoencoder (VAE) network.
arXiv Detail & Related papers (2022-03-14T18:56:13Z)
- Continuous Offline Handwriting Recognition using Deep Learning Models [0.0]
Handwritten text recognition is an open problem of great interest in the area of automatic document image analysis.
We propose a new recognition model that integrates two types of deep learning architectures: convolutional neural networks (CNN) and sequence-to-sequence (seq2seq) models.
The proposed model achieves results competitive with those obtained with other well-established methodologies.
arXiv Detail & Related papers (2021-12-26T07:31:03Z)
- HENet: Forcing a Network to Think More for Font Recognition [10.278412487287882]
This paper proposes a novel font recognizer with a pluggable module for the font recognition task.
The pluggable module, called the HE Block, hides the most discriminative accessible features and forces the network to consider other, more complicated features in order to solve hard examples of similar fonts.
arXiv Detail & Related papers (2021-10-21T03:25:47Z)
- Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction.
Our approach enables us to massively scale up the number of character types we can effectively model.
We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.