Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized
Herculaneum Papyri
- URL: http://arxiv.org/abs/2308.05070v1
- Date: Wed, 9 Aug 2023 17:00:43 GMT
- Title: Volumetric Fast Fourier Convolution for Detecting Ink on the Carbonized
Herculaneum Papyri
- Authors: Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli, Rita Cucchiara
- Abstract summary: We propose a modification of the Fast Fourier Convolution operator for volumetric data and apply it in a segmentation architecture for ink detection on the Herculaneum papyri.
To encourage the research on this task and the application of the proposed operator to other tasks involving volumetric data, we will release our implementation.
- Score: 23.090618261864886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Digital Document Restoration (DDR) have led to
significant breakthroughs in analyzing highly damaged written artifacts. Among
those, there has been an increasing interest in applying Artificial
Intelligence techniques for virtually unwrapping and automatically detecting
ink on the Herculaneum papyri collection. This collection consists of
carbonized scrolls and fragments of documents, which have been digitized via
X-ray tomography to allow the development of ad-hoc deep learning-based DDR
solutions. In this work, we propose a modification of the Fast Fourier
Convolution operator for volumetric data and apply it in a segmentation
architecture for ink detection on the challenging Herculaneum papyri,
demonstrating its suitability via extensive experimental analysis. To
encourage research on this task and the application of the proposed operator
to other tasks involving volumetric data, we will release our implementation
(https://github.com/aimagelab/vffc).
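The core idea of a Fast Fourier Convolution is to apply a cheap point-wise (1x1) convolution in the frequency domain, which gives every output position a receptive field spanning the whole input. The sketch below illustrates the spectral path of such an operator extended to volumetric data, as a minimal NumPy illustration; the function name, the channel-mixing weight `w`, and the use of a plain matrix in place of a learned 1x1x1 convolution are assumptions for the example, not the authors' implementation.

```python
import numpy as np

def volumetric_spectral_block(x, w):
    """Illustrative spectral path of a volumetric Fast Fourier Convolution.

    x: (C, D, H, W) feature volume.
    w: (2*C, 2*C) channel-mixing weights, standing in for the learned
       1x1x1 convolution of a real FFC block (an assumption).
    """
    C, D, H, W = x.shape
    # 3D real FFT over the spatial dimensions -> complex (C, D, H, W//2+1)
    freq = np.fft.rfftn(x, axes=(1, 2, 3))
    # Stack real and imaginary parts as extra channels: (2C, D, H, W//2+1)
    stacked = np.concatenate([freq.real, freq.imag], axis=0)
    # Point-wise channel mixing in the frequency domain: every spatial
    # location is transformed identically, yielding a global receptive field
    mixed = np.einsum('oc,cdhw->odhw', w, stacked)
    # Re-assemble a complex tensor and transform back to the spatial domain
    real, imag = np.split(mixed, 2, axis=0)
    return np.fft.irfftn(real + 1j * imag, s=(D, H, W), axes=(1, 2, 3))
```

With `w` set to the identity matrix the block reduces to an FFT round-trip, which is a convenient sanity check; in a trainable version `w` would be learned and typically followed by normalization and a non-linearity.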
Related papers
- Transformer-Based UNet with Multi-Headed Cross-Attention Skip
Connections to Eliminate Artifacts in Scanned Documents [0.0]
A modified UNet structure using a Swin Transformer backbone is presented to remove typical artifacts in scanned documents.
An improvement in text extraction quality, with an error-rate reduction of up to 53.9% on synthetic data, is achieved.
arXiv Detail & Related papers (2023-06-05T12:12:23Z)
- How Does Generative Retrieval Scale to Millions of Passages? [68.98628807288972]
We conduct the first empirical study of generative retrieval techniques across various corpus scales.
We scale generative retrieval to millions of passages, using a corpus of 8.8M passages and evaluating model sizes up to 11B parameters.
While generative retrieval is competitive with state-of-the-art dual encoders on small corpora, scaling to millions of passages remains an important and unsolved challenge.
arXiv Detail & Related papers (2023-05-19T17:33:38Z)
- EduceLab-Scrolls: Verifiable Recovery of Text from Herculaneum Papyri using X-ray CT [0.0]
We present a complete software pipeline for revealing the hidden texts of the Herculaneum papyri using X-ray CT images.
We also present EduceLab-Scrolls, a comprehensive open dataset representing two decades of research effort on this problem.
arXiv Detail & Related papers (2023-04-04T19:28:51Z)
- Batch-based Model Registration for Fast 3D Sherd Reconstruction [74.55975819488404]
3D reconstruction techniques have widely been used for digital documentation of archaeological fragments.
We aim to develop a portable, high-throughput, and accurate reconstruction system for efficient digitization of fragments excavated at archaeological sites.
We develop a new batch-based matching algorithm that pairs the front and back sides of the fragments, and a new Bilateral Boundary ICP algorithm that can register partial scans sharing very narrow overlapping regions.
arXiv Detail & Related papers (2022-11-13T13:08:59Z)
- Unsupervised Clustering of Roman Potsherds via Variational Autoencoders [63.8376359764052]
We propose an artificial intelligence solution to support archaeologists in the classification task of Roman commonware potsherds.
The partiality and handcrafted variance of the fragments make their matching a challenging problem.
We propose to pair similar profiles via the unsupervised hierarchical clustering of non-linear features learned in the latent space of a deep convolutional Variational Autoencoder (VAE) network.
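The pipeline described above (encode profiles with a VAE, then cluster the latent codes hierarchically) can be sketched in a few lines. This is a self-contained toy illustration, not the paper's code: `encode` stands in for a trained VAE encoder (here a fixed random projection), and the clustering is a tiny single-linkage agglomerative routine.

```python
import numpy as np

rng = np.random.default_rng(0)
proj = rng.standard_normal((32, 8))  # stand-in for a trained VAE encoder

def encode(profiles):
    # Placeholder for the VAE encoder: maps (N, 32) profiles to (N, 8) latents.
    return profiles @ proj

def single_linkage(latents, n_clusters):
    """Minimal agglomerative (single-linkage) clustering on latent vectors."""
    clusters = [[i] for i in range(len(latents))]
    # Pairwise Euclidean distances between all latent codes
    d = np.linalg.norm(latents[:, None] - latents[None, :], axis=-1)
    while len(clusters) > n_clusters:
        best, pair = np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between closest members
                dist = min(d[i, j] for i in clusters[a] for j in clusters[b])
                if dist < best:
                    best, pair = dist, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)  # merge the two closest clusters
    return clusters
```

In the paper the latent space is learned so that visually similar profiles land close together; any hierarchical clustering on those codes then groups matching fragments without labels.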
arXiv Detail & Related papers (2022-03-14T18:56:13Z)
- One-shot Key Information Extraction from Document with Deep Partial Graph Matching [60.48651298832829]
Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios.
Existing supervised learning methods for the KIE task require a large number of labeled samples and learn separate models for different types of documents.
We propose a deep end-to-end trainable network for one-shot KIE using partial graph matching.
arXiv Detail & Related papers (2021-09-26T07:45:53Z)
- Robust Retrieval Augmented Generation for Zero-shot Slot Filling [11.30375489913602]
We present a novel approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models.
Our model reports large improvements on both T-REx and zsRE slot filling datasets, improving both passage retrieval and slot value generation, and ranking at the top-1 position in the KILT leaderboard.
arXiv Detail & Related papers (2021-08-31T15:51:27Z)
- Benchmarking Scientific Image Forgery Detectors [18.225190509954874]
This paper presents an extendable open-source library that reproduces the most common image forgery operations reported by the research integrity community.
We create a large scientific forgery image benchmark (39,423 images) with an enriched ground-truth.
In addition, concerned about the high number of retracted papers due to image duplication, this work evaluates the state-of-the-art copy-move detection methods in the proposed dataset.
arXiv Detail & Related papers (2021-05-26T22:58:20Z)
- TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval [103.85002875155551]
We propose a novel generalized distillation method, TeachText, for exploiting large-scale language pretraining.
We extend our method to video side modalities and show that we can effectively reduce the number of used modalities at test time.
Our approach advances the state of the art on several video retrieval benchmarks by a significant margin and adds no computational overhead at test time.
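One plausible reading of this kind of generalized distillation is that several teacher retrieval models each produce a text-video similarity matrix, and the student is supervised to match their aggregate. The sketch below illustrates that training signal only; the function names and the MSE-on-similarity-matrices loss are assumptions for the example, not the paper's exact formulation.

```python
import numpy as np

def cosine_sim(a, b):
    # Row-normalized cosine similarity matrix between two embedding sets
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def distillation_loss(student_text, student_video, teacher_sims):
    """Match the student's text-video similarity matrix to the teachers'.

    teacher_sims: (T, N, N) stack of similarity matrices from T teachers.
    """
    target = np.mean(teacher_sims, axis=0)   # aggregate the teachers
    pred = cosine_sim(student_text, student_video)
    return np.mean((pred - target) ** 2)     # illustrative MSE objective
```

Because the distillation target is just a matrix, it can be precomputed offline from teachers that use extra video modalities, so the student pays no additional cost at test time.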
arXiv Detail & Related papers (2021-04-16T17:55:28Z)
- Solving Missing-Annotation Object Detection with Background Recalibration Loss [49.42997894751021]
This paper focuses on a novel and challenging detection scenario: a majority of the true objects/instances in the dataset are unlabeled.
Prior art proposed soft sampling to re-weight the gradients of RoIs based on their overlap with positive instances, but those methods are mainly built on two-stage detectors.
In this paper, we introduce a superior solution called Background Recalibration Loss (BRL) that can automatically re-calibrate the loss signals according to the pre-defined IoU threshold and input image.
arXiv Detail & Related papers (2020-02-12T23:11:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.