CHARTER: heatmap-based multi-type chart data extraction
- URL: http://arxiv.org/abs/2111.14103v1
- Date: Sun, 28 Nov 2021 11:01:21 GMT
- Title: CHARTER: heatmap-based multi-type chart data extraction
- Authors: Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf
Arbelle, Leonid Karlinsky
- Abstract summary: We present a method and a system for end-to-end conversion of document charts into machine readable data format.
Our approach extracts and analyses charts along with their graphical elements and supporting structures.
Our detection system is based on neural networks, trained solely on synthetic data.
- Score: 7.838284602257369
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The digital conversion of information stored in documents is a great source
of knowledge. In contrast to the documents' text, the conversion of the embedded
document graphics, such as charts and plots, has been much less explored. We
present a method and a system for end-to-end conversion of document charts into
machine readable tabular data format, which can be easily stored and analyzed
in the digital domain. Our approach extracts and analyses charts along with
their graphical elements and supporting structures such as legends, axes,
titles, and captions. Our detection system is based on neural networks, trained
solely on synthetic data, eliminating the limiting factor of data collection.
As opposed to previous methods, which detect graphical elements using
bounding-boxes, our networks feature auxiliary domain-specific heatmap
prediction, enabling the precise detection of pie charts, line and scatter plots
which do not fit the rectangular bounding-box presumption. Qualitative and
quantitative results show high robustness and precision, improving upon
previous works on popular benchmarks.
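The abstract does not specify how the predicted heatmaps are decoded into element locations. As a rough illustration only, and not the authors' implementation, a common way to turn a keypoint heatmap into point detections is thresholded local-maximum search. The function name `heatmap_peaks`, the 8-neighbour window, and the 0.5 threshold below are all assumptions made for this sketch.

```python
from typing import List, Tuple

Heatmap = List[List[float]]


def heatmap_peaks(heatmap: Heatmap, threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (row, col, score) for every cell that scores at least
    `threshold` and is strictly greater than all of its 8-connected
    neighbours. Hypothetical decoding step, not taken from the paper.
    Flat plateaus are dropped because the comparison is strict."""
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = heatmap[r][c]
            if value < threshold:
                continue
            # Collect the neighbours that fall inside the heatmap bounds.
            neighbours = [
                heatmap[nr][nc]
                for nr in range(max(0, r - 1), min(rows, r + 2))
                for nc in range(max(0, c - 1), min(cols, c + 2))
                if (nr, nc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


# Toy heatmap with two isolated responses, e.g. two scatter-plot markers.
toy = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.2, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.3, 0.0],
    [0.0, 0.0, 0.3, 0.8, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.0],
]
print(heatmap_peaks(toy))  # [(1, 1, 0.9), (3, 3, 0.8)]
```

Each peak is a single pixel location rather than a rectangle, which is why a heatmap output can suit scatter markers, line vertices, or pie-slice corners that an axis-aligned bounding box describes poorly.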
Related papers
- GraphKD: Exploring Knowledge Distillation Towards Document Object
Detection with Structured Graph Creation [14.511401955827875]
Object detection in documents is a key step to automate the structural elements identification process.
We present a graph-based knowledge distillation framework to correctly identify and localize the document objects in a document image.
arXiv Detail & Related papers (2024-02-17T23:08:32Z)
- Enhancing Visually-Rich Document Understanding via Layout Structure
Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
- Line Graphics Digitization: A Step Towards Full Automation [29.017383766914406]
We present the Line Graphics (LG) dataset, which includes pixel-wise annotations of 5 coarse and 10 fine-grained categories.
Our dataset covers 520 images of mathematical graphics collected from 450 documents from different disciplines.
Our proposed dataset can support two different computer vision tasks, i.e., semantic segmentation and object detection.
arXiv Detail & Related papers (2023-07-05T07:08:58Z)
- Augraphy: A Data Augmentation Library for Document Images [59.457999432618614]
Augraphy is a Python library for constructing data augmentation pipelines.
It provides strategies to produce augmented versions of clean document images that appear to have been altered by standard office operations.
arXiv Detail & Related papers (2022-08-30T22:36:19Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Difusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- DocScanner: Robust Document Image Rectification with Progressive
Learning [162.03694280524084]
This work presents DocScanner, a new deep network architecture for document image rectification.
DocScanner maintains a single estimate of the rectified image, which is progressively corrected with a recurrent architecture.
The iterative refinements make DocScanner converge to a robust and superior performance, and the lightweight recurrent architecture ensures the running efficiency.
arXiv Detail & Related papers (2021-10-28T09:15:02Z)
- Learning to Generate Scene Graph from Natural Language Supervision [52.18175340725455]
We propose one of the first methods that learn from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as a scene graph.
We leverage an off-the-shelf object detector to identify and localize object instances, match labels of detected regions to concepts parsed from captions, and thus create "pseudo" labels for learning scene graphs.
arXiv Detail & Related papers (2021-09-06T03:38:52Z)
- Plot2Spectra: an Automatic Spectra Extraction Tool [10.64947007982639]
This paper develops a plot digitizer, named Plot2Spectra, to extract data points from spectroscopy graph images in an automatic fashion.
In the first axis alignment stage, we adopt an anchor-free detector to detect the plot region and then refine the detected bounding boxes.
In the second plot data extraction stage, we first employ semantic segmentation to separate pixels belonging to plot lines from the background.
arXiv Detail & Related papers (2021-07-06T18:17:28Z)
- Towards an efficient framework for Data Extraction from Chart Images [27.114170963444074]
We adopt state-of-the-art computer vision techniques for the data extraction stage in a data mining system.
For building a robust point detector, a fully convolutional network with a feature fusion module is adopted.
For data conversion, we translate the detected element into data with semantic value.
arXiv Detail & Related papers (2021-05-05T13:18:53Z)
- Scene Graph Modification Based on Natural Language Commands [90.0662899539489]
Structured representations like graphs and parse trees play a crucial role in many Natural Language Processing systems.
In this paper, we explore the novel problem of graph modification, where the systems need to learn how to update an existing graph given a new user's command.
arXiv Detail & Related papers (2020-10-06T10:01:19Z)
- OCR Graph Features for Manipulation Detection in Documents [11.193867567895353]
We propose a model that leverages graph features using OCR (Optical Character Recognition).
Our model relies on a data-driven approach to detect alterations by training a random forest classifier on the graph-based OCR features.
We evaluate our algorithm's forgery detection performance on a dataset constructed from real business documents with slight forgery imperfections.
arXiv Detail & Related papers (2020-09-10T21:50:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.