Restoring Ancient Ideograph: A Multimodal Multitask Neural Network
Approach
- URL: http://arxiv.org/abs/2403.06682v1
- Date: Mon, 11 Mar 2024 12:57:28 GMT
- Title: Restoring Ancient Ideograph: A Multimodal Multitask Neural Network
Approach
- Authors: Siyu Duan, Jun Wang, Qi Su
- Abstract summary: This paper proposes a novel Multimodal Multitask Restoring Model (MMRM) to restore ancient texts.
It combines context understanding with residual visual information from damaged ancient artefacts, enabling it to predict damaged characters and generate restored images simultaneously.
- Score: 11.263700269889654
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Cultural heritage serves as the enduring record of human thought and history.
Despite significant efforts dedicated to the preservation of cultural relics,
many ancient artefacts have been ravaged irreversibly by natural deterioration
and human actions. Deep learning technology has emerged as a valuable tool for
restoring various kinds of cultural heritage, including ancient text
restoration. Previous research has approached ancient text restoration from
either visual or textual perspectives, often overlooking the potential of
synergizing multimodal information. This paper proposes a novel Multimodal
Multitask Restoring Model (MMRM) to restore ancient texts, particularly
emphasising the ideograph. This model combines context understanding with
residual visual information from damaged ancient artefacts, enabling it to
predict damaged characters and generate restored images simultaneously. We
tested the MMRM model through experiments conducted on both simulated datasets
and authentic ancient inscriptions. The results show that the proposed method
gives insightful restoration suggestions in both simulation experiments and
real-world scenarios. To the best of our knowledge, this work represents the
pioneering application of multimodal deep learning in ancient text restoration,
which will contribute to the understanding of ancient society and culture in
digital humanities fields.
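The abstract describes MMRM only at a high level. As a rough, hypothetical illustration of the multimodal multitask idea (not the authors' implementation), the sketch below fuses a textual context encoder with an encoder for the damaged glyph image and trains two heads jointly: one predicting the missing character, one decoding a restored image. All class names, layer sizes, the fusion scheme, and the loss weighting are assumptions.

```python
# Minimal sketch (not the paper's code): a multimodal multitask restorer that
# combines context understanding with residual visual information, predicting
# the damaged character and generating a restored glyph image simultaneously.
import torch
import torch.nn as nn

class MultimodalRestorer(nn.Module):
    def __init__(self, vocab_size=8000, d_model=256, img_size=64):
        super().__init__()
        # Textual branch: embeds surrounding characters and encodes the context.
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.text_encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        # Visual branch: encodes the damaged character image (1-channel glyph).
        self.img_encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * (img_size // 4) ** 2, d_model),
        )
        # Task heads: character prediction and image restoration.
        self.char_head = nn.Linear(2 * d_model, vocab_size)
        self.img_decoder = nn.Sequential(
            nn.Linear(2 * d_model, 64 * (img_size // 4) ** 2),
            nn.Unflatten(1, (64, img_size // 4, img_size // 4)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, context_ids, damaged_img):
        # Mean-pool the encoded context as a simple sentence-level summary.
        text_feat = self.text_encoder(self.embed(context_ids)).mean(dim=1)
        img_feat = self.img_encoder(damaged_img)
        fused = torch.cat([text_feat, img_feat], dim=-1)
        return self.char_head(fused), self.img_decoder(fused)

# Joint multitask loss: cross-entropy on the missing character plus a pixel
# reconstruction term on the restored glyph (the 0.5 weight is a guess).
model = MultimodalRestorer()
ctx = torch.randint(0, 8000, (2, 20))      # ids of surrounding characters
damaged = torch.rand(2, 1, 64, 64)         # damaged glyph images
target_char = torch.randint(0, 8000, (2,))
target_img = torch.rand(2, 1, 64, 64)
logits, restored = model(ctx, damaged)
loss = nn.functional.cross_entropy(logits, target_char) \
       + 0.5 * nn.functional.mse_loss(restored, target_img)
loss.backward()
```

The point of the sketch is the shared fused representation feeding both heads, so that the textual context and the residual visual evidence constrain each other; the specific encoders could be swapped for any comparable text and image backbones.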
Related papers
- Cultural Heritage 3D Reconstruction with Diffusion Networks [0.6445605125467574]
The article explores the use of recent generative AI algorithms for repairing cultural heritage objects.
It presents a conditional diffusion model designed to reconstruct 3D point clouds effectively.
arXiv Detail & Related papers (2024-10-14T15:43:40Z)
- Safeguard Text-to-Image Diffusion Models with Human Feedback Inversion [51.931083971448885]
We propose a framework named Human Feedback Inversion (HFI), where human feedback on model-generated images is condensed into textual tokens guiding the mitigation or removal of problematic images.
Our experimental results demonstrate our framework significantly reduces objectionable content generation while preserving image quality, contributing to the ethical deployment of AI in the public sphere.
arXiv Detail & Related papers (2024-07-17T05:21:41Z)
- Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction [73.26364649572237]
Oracle Bone Inscriptions are among the oldest existing forms of writing in the world.
A large number of Oracle Bone Inscriptions (OBI) remain undeciphered, making it one of the global challenges in paleography today.
This paper introduces a novel approach, namely Puzzle Pieces Picker (P$^3$), to decipher these enigmatic characters through radical reconstruction.
arXiv Detail & Related papers (2024-06-05T07:34:39Z)
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild [57.06779516541574]
SUPIR (Scaling-UP Image Restoration) is a groundbreaking image restoration method that harnesses generative prior and the power of model scaling up.
We collect a dataset comprising 20 million high-resolution, high-quality images for model training, each enriched with descriptive text annotations.
arXiv Detail & Related papers (2024-01-24T17:58:07Z)
- Knowledge-Aware Artifact Image Synthesis with LLM-Enhanced Prompting and Multi-Source Supervision [5.517240672957627]
We propose a novel knowledge-aware artifact image synthesis approach that brings lost historical objects accurately into their visual forms.
Compared to existing approaches, our proposed model produces higher-quality artifact images that align better with the implicit details and historical knowledge contained within written documents.
arXiv Detail & Related papers (2023-12-13T11:03:07Z)
- (Re)framing Built Heritage through the Machinic Gaze [3.683202928838613]
We argue that the proliferation of machine learning and vision technologies creates new scopic regimes for heritage.
We introduce the term 'machinic gaze' to conceptualise the reconfiguration of heritage representation via AI models.
arXiv Detail & Related papers (2023-10-06T23:48:01Z)
- ScrollTimes: Tracing the Provenance of Paintings as a Window into History [35.605930297790465]
The study of cultural artifact provenance, tracing ownership and preservation, holds significant importance in archaeology and art history.
In collaboration with art historians, we examined the handscroll, a traditional Chinese painting form that provides a rich source of historical data.
We present a three-tiered methodology encompassing artifact, contextual, and provenance levels, designed to create a "Biography" for each handscroll.
arXiv Detail & Related papers (2023-06-15T03:38:09Z)
- Can Artificial Intelligence Reconstruct Ancient Mosaics? [71.93546109923456]
In recent years, Artificial Intelligence (AI) has made impressive progress in the generation of images from text descriptions and reference images.
In this paper, we explore whether this innovative technology can be used to reconstruct mosaics with missing parts.
Results are promising, showing that AI is able to interpret the key features of the mosaics and to produce reconstructions that capture the essence of the scene.
arXiv Detail & Related papers (2022-10-07T19:42:09Z)
- Where Does the Performance Improvement Come From? - A Reproducibility Concern about Image-Text Retrieval [85.03655458677295]
Image-text retrieval has gradually become a major research direction in the field of information retrieval.
We first examine the related concerns and why the focus is on image-text retrieval tasks.
We analyze various aspects of the reproduction of pretrained and non-pretrained retrieval models.
arXiv Detail & Related papers (2022-03-08T05:01:43Z)
- Compositional Scene Representation Learning via Reconstruction: A Survey [48.33349317481124]
Compositional scene representation learning is a task that enables machines to perceive visual scenes compositionally.
Deep neural networks have been proven to be advantageous in representation learning.
Learning via reconstruction is advantageous because it may utilize massive unlabeled data and avoid costly and laborious data annotation.
arXiv Detail & Related papers (2022-02-15T02:14:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.