Enhancing Object Detection in Ancient Documents with Synthetic Data
Generation and Transformer-Based Models
- URL: http://arxiv.org/abs/2307.16005v1
- Date: Sat, 29 Jul 2023 15:29:25 GMT
- Title: Enhancing Object Detection in Ancient Documents with Synthetic Data
Generation and Transformer-Based Models
- Authors: Zahra Ziran, Francesco Leotta, Massimo Mecella
- Abstract summary: This research aims to enhance object detection in ancient documents by reducing false positives and improving precision.
We propose a method that involves the creation of synthetic datasets through computational mediation.
Our approach includes associating objects with their component parts and introducing a visual feature map to enable the model to discern between different symbols and document elements.
- Score: 0.4125187280299248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The study of ancient documents provides a glimpse into our past. However, the
low image quality and intricate details commonly found in these documents
present significant challenges for accurate object detection. The objective of
this research is to enhance object detection in ancient documents by reducing
false positives and improving precision. To achieve this, we propose a method
that involves the creation of synthetic datasets through computational
mediation, along with the integration of visual feature extraction into the
object detection process. Our approach includes associating objects with their
component parts and introducing a visual feature map to enable the model to
discern between different symbols and document elements. Through our
experiments, we demonstrate that improved object detection has a profound
impact on the field of Paleography, enabling in-depth analysis and fostering a
greater understanding of these valuable historical artifacts.
Related papers
- Correlation of Object Detection Performance with Visual Saliency and Depth Estimation [0.09208007322096533]
This paper investigates the correlations between object detection accuracy and two fundamental visual tasks: depth prediction and visual saliency prediction.
Our analysis reveals significant variations in these correlations across object categories, with larger objects showing correlation values up to three times higher than smaller objects.
These findings suggest incorporating visual saliency features into object detection architectures could be more beneficial than depth information.
arXiv Detail & Related papers (2024-11-05T06:34:19Z) - Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection [14.22646492640906]
We propose a simple and highly efficient decoder-free architecture for open-vocabulary visual relationship detection.
Our model consists of a Transformer-based image encoder that represents objects as tokens and models their relationships implicitly.
Our approach achieves state-of-the-art relationship detection performance on Visual Genome and on the large-vocabulary GQA benchmark at real-time inference speeds.
arXiv Detail & Related papers (2024-03-21T10:15:57Z) - Oriented Object Detection in Optical Remote Sensing Images using Deep Learning: A Survey [10.665235711722076]
Oriented object detection is one of the most fundamental and challenging tasks in remote sensing.
Recent years have witnessed remarkable progress in oriented object detection using deep learning techniques.
arXiv Detail & Related papers (2023-02-21T06:31:53Z) - Spatial Reasoning for Few-Shot Object Detection [21.3564383157159]
We propose a spatial reasoning framework that detects novel objects with only a few training examples in a context.
We employ a graph convolutional network in which the RoIs and their relatedness are defined as nodes and edges, respectively.
We demonstrate that the proposed method significantly outperforms the state-of-the-art methods and verify its efficacy through extensive ablation studies.
arXiv Detail & Related papers (2022-11-02T12:38:08Z) - High-resolution Iterative Feedback Network for Camouflaged Object Detection [128.893782016078]
Spotting camouflaged objects that are visually assimilated into the background is tricky for object detection algorithms.
We aim to extract the high-resolution texture details to avoid the detail degradation that causes blurred vision in edges and boundaries.
We introduce a novel HitNet to refine the low-resolution representations by high-resolution features in an iterative feedback manner.
arXiv Detail & Related papers (2022-03-22T11:20:21Z) - Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z) - Inverting and Understanding Object Detectors [15.207501110589924]
We propose using inversion as a primary tool to understand modern object detectors and develop an optimization-based approach to layout inversion.
We reveal intriguing properties of detectors by applying our layout inversion technique to a variety of modern object detectors.
arXiv Detail & Related papers (2021-06-26T03:31:59Z) - Ensembling object detectors for image and video data analysis [98.26061123111647]
We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
arXiv Detail & Related papers (2021-02-09T12:38:16Z) - Deep Texture-Aware Features for Camouflaged Object Detection [69.84122372541506]
This paper formulates texture-aware refinement modules to learn the texture-aware features in a deep convolutional neural network.
We evaluate our network on the benchmark dataset for camouflaged object detection both qualitatively and quantitatively.
arXiv Detail & Related papers (2021-02-05T04:38:32Z) - Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of objects with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed, if solely evaluated on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z) - Learning Object Detection from Captions via Textual Scene Attributes [70.90708863394902]
We argue that captions contain much richer information about the image, including attributes of objects and their relations.
We present a method that uses the attributes in this "textual scene graph" to train object detectors.
We empirically demonstrate that the resulting model achieves state-of-the-art results on several challenging object detection datasets.
arXiv Detail & Related papers (2020-09-30T10:59:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.