Line Graphics Digitization: A Step Towards Full Automation
- URL: http://arxiv.org/abs/2307.02065v1
- Date: Wed, 5 Jul 2023 07:08:58 GMT
- Title: Line Graphics Digitization: A Step Towards Full Automation
- Authors: Omar Moured, Jiaming Zhang, Alina Roitberg, Thorsten Schwarz, Rainer Stiefelhagen
- Abstract summary: We present the Line Graphics (LG) dataset, which includes pixel-wise annotations of 5 coarse and 10 fine-grained categories.
Our dataset covers 520 images of mathematical graphics collected from 450 documents from different disciplines.
Our proposed dataset can support two different computer vision tasks, i.e., semantic segmentation and object detection.
- Score: 29.017383766914406
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The digitization of documents allows for wider accessibility and
reproducibility. While automatic digitization of document layout and text
content has been a long-standing focus of research, this problem in regard to
graphical elements, such as statistical plots, has been under-explored. In this
paper, we introduce the task of fine-grained visual understanding of
mathematical graphics and present the Line Graphics (LG) dataset, which
includes pixel-wise annotations of 5 coarse and 10 fine-grained categories. Our
dataset covers 520 images of mathematical graphics collected from 450 documents
from different disciplines. Our proposed dataset can support two different
computer vision tasks, i.e., semantic segmentation and object detection. To
benchmark our LG dataset, we explore 7 state-of-the-art models. To foster
further research on the digitization of statistical graphs, we will make the
dataset, code, and models publicly available to the community.
Related papers
- Unlocking Comics: The AI4VA Dataset for Visual Understanding [62.345344799258804]
This paper presents a novel dataset comprising Franco-Belgian comics from the 1950s annotated for tasks including depth estimation, semantic segmentation, saliency detection, and character identification.
It consists of two distinct and consistent styles and incorporates object concepts and labels taken from natural images.
By including such diverse information across styles, this dataset not only holds promise for computational creativity but also offers avenues for the digitization of art and storytelling innovation.
arXiv Detail & Related papers (2024-10-27T14:27:05Z)
- From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models [98.41645229835493]
Data visualization in the form of charts plays a pivotal role in data analysis, offering critical insights and aiding in informed decision-making.
Large foundation models, such as large language models, have revolutionized various natural language processing tasks.
This survey paper serves as a comprehensive resource for researchers and practitioners in the fields of natural language processing, computer vision, and data analysis.
arXiv Detail & Related papers (2024-03-18T17:57:09Z)
- MatGD: Materials Graph Digitizer [2.4857235004269165]
MatGD (Material Graph Digitizer) is a tool for digitizing a data line from scientific graphs.
From 62,534 papers in areas including batteries and MOFs, 501,045 figures were mined.
Our tool achieved over 99% accuracy in legend marker and text detection.
arXiv Detail & Related papers (2023-09-19T07:19:16Z)
- Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities [128.55790219377315]
Graph neural networks have emerged as a leading architecture for many graph-level tasks.
Graph pooling is indispensable for obtaining a holistic graph-level representation.
arXiv Detail & Related papers (2022-04-15T04:02:06Z)
- A Survey of Historical Document Image Datasets [2.8707038627097226]
This paper presents a systematic literature review of image datasets for document image analysis.
It focuses on historical documents, such as handwritten manuscripts and early prints.
Finding appropriate datasets for historical document analysis is a crucial prerequisite to facilitate research using different machine learning algorithms.
arXiv Detail & Related papers (2022-03-16T09:56:48Z)
- CHARTER: heatmap-based multi-type chart data extraction [7.838284602257369]
We present a method and a system for end-to-end conversion of document charts into machine readable data format.
Our approach extracts and analyses charts along with their graphical elements and supporting structures.
Our detection system is based on neural networks, trained solely on synthetic data.
arXiv Detail & Related papers (2021-11-28T11:01:21Z)
- InfographicVQA [31.084392784258032]
InfographicVQA is a new dataset that comprises a diverse collection of infographics along with natural language question-answer annotations.
We curate the dataset with emphasis on questions that require elementary reasoning and basic arithmetic skills.
The dataset, code and leaderboard will be made available at http://docvqa.org.
arXiv Detail & Related papers (2021-04-26T17:45:54Z)
- Visual Distant Supervision for Scene Graph Generation [66.10579690929623]
Scene graph models usually require supervised learning on large quantities of labeled data with intensive human annotation.
We propose visual distant supervision, a novel paradigm of visual relation learning, which can train scene graph models without any human-labeled data.
Comprehensive experimental results show that our distantly supervised model outperforms strong weakly supervised and semi-supervised baselines.
arXiv Detail & Related papers (2021-03-29T06:35:24Z)
- Structural Information Preserving for Graph-to-Text Generation [59.00642847499138]
The task of graph-to-text generation aims at producing sentences that preserve the meaning of input graphs.
We propose to tackle this problem by leveraging richer training signals that can guide our model for preserving input information.
Experiments on two benchmarks for graph-to-text generation show the effectiveness of our approach over a state-of-the-art baseline.
arXiv Detail & Related papers (2021-02-12T20:09:01Z)
- Graphical Object Detection in Document Images [30.48863304419383]
We present a novel end-to-end trainable deep learning framework, called Graphical Object Detection (GOD), to localize graphical objects in document images.
Our framework is data-driven and does not require any meta-data to locate graphical objects in the document images.
Our model yields promising results as compared to state-of-the-art techniques.
arXiv Detail & Related papers (2020-08-25T06:35:57Z)
- Graph Edit Distance Reward: Learning to Edit Scene Graph [69.39048809061714]
We propose a new method to edit a scene graph according to user instructions, a task that has not been explored before.
Specifically, to learn to edit scene graphs according to the semantics given by text, we propose a Graph Edit Distance Reward.
In the context of text-editing image retrieval, we validate the effectiveness of our method on the CSS and CRIR datasets.
arXiv Detail & Related papers (2020-08-15T04:52:16Z)