ChartParser: Automatic Chart Parsing for Print-Impaired
- URL: http://arxiv.org/abs/2211.08863v1
- Date: Wed, 16 Nov 2022 12:19:10 GMT
- Title: ChartParser: Automatic Chart Parsing for Print-Impaired
- Authors: Anukriti Kumar, Tanuja Ganu, Saikat Guha
- Abstract summary: Infographics are often an integral component of scientific documents for reporting qualitative or quantitative findings.
Their interpretation continues to be a challenge for the blind, low-vision, and other print-impaired (BLV) individuals.
We propose a fully automated pipeline that leverages deep learning, OCR, and image processing techniques to extract all figures from a research paper, classify them into chart categories, and convert bar charts into screen-reader-friendly tables.
- Score: 2.1325744957975568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Infographics are often an integral component of scientific documents for
reporting qualitative or quantitative findings as they make it much simpler to
comprehend the underlying complex information. However, their interpretation
continues to be a challenge for the blind, low-vision, and other print-impaired
(BLV) individuals. In this paper, we propose ChartParser, a fully automated
pipeline that leverages deep learning, OCR, and image processing techniques to
extract all figures from a research paper, classify them into chart categories
(bar chart, line chart, etc.), and obtain the relevant information from them,
focusing on bar charts (horizontal, vertical, stacked horizontal, and stacked
vertical), which already pose several interesting challenges. Finally, we
present the retrieved content in a tabular format that is screen-reader
friendly and accessible to BLV users. We present a thorough evaluation of our
approach by applying the pipeline to a sample of real-world annotated bar
charts from research papers.
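To make the bar-chart step concrete, here is a minimal sketch of its final stage, assuming bars and axis ticks have already been detected (the paper uses deep-learning detectors and OCR for that): two calibrated ticks define a linear pixel-to-value mapping, and each bar becomes one table row. All names and numbers below are illustrative, not the paper's actual code.

```python
from dataclasses import dataclass

@dataclass
class AxisCalibration:
    """Linear pixel-to-value mapping derived from two OCR'd axis ticks."""
    pixel_a: float  # pixel y-coordinate of a known tick
    value_a: float  # data value printed at that tick
    pixel_b: float
    value_b: float

    def to_value(self, pixel: float) -> float:
        scale = (self.value_b - self.value_a) / (self.pixel_b - self.pixel_a)
        return self.value_a + (pixel - self.pixel_a) * scale

def bars_to_table(labels, bar_tops, baseline, cal):
    """Turn detected bar tops (pixel y-coords) into screen-reader-friendly rows."""
    return [(label, round(cal.to_value(top) - cal.to_value(baseline), 2))
            for label, top in zip(labels, bar_tops)]

# Hypothetical detections from a vertical bar chart (image y grows downward).
cal = AxisCalibration(pixel_a=400, value_a=0, pixel_b=100, value_b=30)
for label, value in bars_to_table(["2019", "2020", "2021"], [250, 180, 120], 400, cal):
    print(f"{label}\t{value}")  # 2019 15.0 / 2020 22.0 / 2021 28.0
```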
Related papers
- AskChart: Universal Chart Understanding through Textual Enhancement [20.075911012193494]
State-of-the-art approaches primarily focus on visual cues from chart images, failing to explicitly incorporate rich textual information embedded within the charts.
We introduce AskChart, a universal model that explicitly integrates both textual and visual cues from charts using a Mixture of Experts (MoE) architecture.
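For intuition, a minimal PyTorch sketch of the MoE routing idea follows; this is a generic mixture-of-experts layer, not AskChart's actual architecture, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class SimpleMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)  # per-token routing scores
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
             for _ in range(num_experts)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)             # (B, T, E)
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, T, D, E)
        return torch.einsum("bte,btde->btd", weights, outs)       # weighted mix

tokens = torch.randn(2, 16, 64)     # e.g. fused text+vision token embeddings
print(SimpleMoE(64)(tokens).shape)  # torch.Size([2, 16, 64])
```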
arXiv Detail & Related papers (2024-12-26T09:59:43Z)
- Rethinking Comprehensive Benchmark for Chart Understanding: A Perspective from Scientific Literature [33.69273440337546]
We introduce a new benchmark, Scientific Chart QA (SCI-CQA), which emphasizes flowcharts as a critical yet often overlooked category.
We curated a dataset of 202,760 image-text pairs from papers at 15 top-tier computer science conferences over the past decade.
SCI-CQA also introduces a novel evaluation framework inspired by human exams, encompassing 5,629 carefully curated questions.
arXiv Detail & Related papers (2024-12-11T05:29:54Z)
- On Pre-training of Multimodal Language Models Customized for Chart Understanding [83.99377088129282]
This paper explores the training processes necessary to improve MLLMs' comprehension of charts.
We introduce CHOPINLLM, an MLLM tailored for in-depth chart comprehension.
arXiv Detail & Related papers (2024-07-19T17:58:36Z)
- TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning [83.58521787193293]
We present TinyChart, an efficient MLLM for chart understanding with only 3B parameters.
TinyChart overcomes two key challenges in efficient chart understanding: (1) reducing the burden of learning numerical computations through a Program-of-Thoughts (PoT) learning strategy, and (2) shortening the lengthy vision feature sequences produced by the vision transformer for high-resolution images through a Vision Token Merging module.
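A rough sketch of the token-merging idea in (2) follows; this is a generic greedy variant, not TinyChart's actual implementation: repeatedly average the most similar pair of adjacent vision tokens to shorten the sequence.

```python
import torch
import torch.nn.functional as F

def merge_similar_tokens(tokens: torch.Tensor, num_merges: int) -> torch.Tensor:
    """tokens: (seq_len, dim). Greedily average the most similar adjacent pair."""
    for _ in range(num_merges):
        sim = F.cosine_similarity(tokens[:-1], tokens[1:], dim=-1)  # (seq_len-1,)
        i = int(sim.argmax())
        merged = (tokens[i] + tokens[i + 1]) / 2
        tokens = torch.cat([tokens[:i], merged.unsqueeze(0), tokens[i + 2:]])
    return tokens

patches = torch.randn(196, 64)                  # e.g. ViT patch embeddings
print(merge_similar_tokens(patches, 98).shape)  # torch.Size([98, 64])
```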
arXiv Detail & Related papers (2024-04-25T14:23:24Z)
- ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning [54.89249749894061]
ChartAssistant is a vision-language model for universal chart comprehension and reasoning.
It undergoes a two-stage training process, starting with pre-training on chart-to-table parsing to align chart and text.
Experimental results demonstrate significant performance gains over the state-of-the-art UniChart and ChartLlama methods.
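As a toy illustration of what a chart-to-table parsing target might look like (the serialization below is an assumption, not ChartAssistant's actual format): the model learns to emit the chart's underlying table as flat text, which aligns visual chart features with textual structure.

```python
# Serialize a chart's underlying table as a flat text target for
# chart-to-table pre-training. Delimiters are illustrative assumptions.
def linearize_table(headers, rows, col_sep=" | ", row_sep=" \\n "):
    lines = [col_sep.join(headers)] + [col_sep.join(r) for r in rows]
    return row_sep.join(lines)

# Hypothetical parse of a two-bar chart
print(linearize_table(["year", "sales"], [["2020", "1.8"], ["2021", "2.4"]]))
# year | sales \n 2020 | 1.8 \n 2021 | 2.4
```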
arXiv Detail & Related papers (2024-01-04T17:51:48Z)
- StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding [54.45681512355684]
Current chart-related tasks focus on either chart perception, which extracts information from visual charts, or chart reasoning over the extracted data.
We introduce StructChart, a novel framework that leverages Structured Triplet Representations (STR) to achieve a unified and label-efficient approach.
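A minimal illustration of a triplet-style representation for chart data follows; the field names here are assumptions, not StructChart's actual STR schema.

```python
from typing import NamedTuple

class Triplet(NamedTuple):
    entity: str     # e.g. a category or series name
    attribute: str  # the measure the value belongs to
    value: float

# Triplets perceived from a hypothetical bar chart; the same flat,
# order-invariant form can feed either a table renderer or a reasoner.
triplets = [Triplet("2020", "revenue", 1.8), Triplet("2021", "revenue", 2.4)]
for t in triplets:
    print(f"({t.entity}, {t.attribute}, {t.value})")
```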
arXiv Detail & Related papers (2023-09-20T12:51:13Z)
- Enhanced Chart Understanding in Vision and Language Task via Cross-modal Pre-training on Plot Table Pairs [71.55796212450055]
We introduce ChartT5, a V+L model that learns how to interpret table information from chart images via cross-modal pre-training on plot table pairs.
Specifically, we propose two novel pre-training objectives: Masked Header Prediction (MHP) and Masked Value Prediction (MVP).
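A toy sketch of the two objectives on the table side (illustrative only, not ChartT5's code): mask a header cell for MHP or a value cell for MVP, and treat the original string as the prediction target.

```python
import random

def mask_table(headers, rows, mask_token="<mask>"):
    """Return (masked table, target string) for one pre-training example."""
    headers, rows = list(headers), [list(r) for r in rows]
    if random.random() < 0.5:                 # Masked Header Prediction (MHP)
        j = random.randrange(len(headers))
        target, headers[j] = headers[j], mask_token
    else:                                     # Masked Value Prediction (MVP)
        i = random.randrange(len(rows))
        j = random.randrange(len(rows[i]))
        target, rows[i][j] = rows[i][j], mask_token
    return (headers, rows), target

table, answer = mask_table(["year", "sales"], [["2020", "1.8"], ["2021", "2.4"]])
print(table, "->", answer)
```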
arXiv Detail & Related papers (2023-05-29T22:29:03Z)
- ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules [89.75395046894809]
We present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks.
Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks.
Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model.
arXiv Detail & Related papers (2023-04-05T00:25:27Z)
- Chart-to-Text: A Large-Scale Benchmark for Chart Summarization [9.647079534077472]
We present Chart-to-Text, a large-scale benchmark with two datasets and a total of 44,096 charts.
We explain the dataset construction process and analyze the datasets.
arXiv Detail & Related papers (2022-03-12T17:01:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.