Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
- URL: http://arxiv.org/abs/2203.06486v1
- Date: Sat, 12 Mar 2022 17:01:38 GMT
- Title: Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
- Authors: Shankar Kantharaj, Rixie Tiffany Ko Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty
- Abstract summary: We present Chart-to-text, a large-scale benchmark with two datasets and a total of 44,096 charts.
We explain the dataset construction process and analyze the datasets.
- Score: 9.647079534077472
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Charts are commonly used for exploring data and communicating insights.
Generating natural language summaries from charts can be very helpful for
people in inferring key insights that would otherwise require a lot of
cognitive and perceptual effort. We present Chart-to-text, a large-scale
benchmark with two datasets and a total of 44,096 charts covering a wide range
of topics and chart types. We explain the dataset construction process and
analyze the datasets. We also introduce a number of state-of-the-art neural
models as baselines that utilize image captioning and data-to-text generation
techniques to tackle two problem variations: one assumes the underlying data
table of the chart is available while the other needs to extract data from
chart images. Our analysis with automatic and human evaluation shows that while
our best models usually generate fluent summaries and yield reasonable BLEU
scores, they also suffer from hallucinations and factual errors as well as
difficulties in correctly explaining complex patterns and trends in charts.
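To make the task setup concrete, the sketch below shows what a benchmark record might look like under the two problem variations and scores a generated summary with BLEU, the automatic metric cited in the abstract. The field names and example strings are hypothetical illustrations, not the benchmark's actual schema, and the scoring assumes the sacrebleu library.

```python
# Hypothetical sketch of the two Chart-to-text problem variations, plus a
# BLEU check with sacrebleu. Field names and strings are illustrative only.
import sacrebleu

chart_record = {
    # Variation 1: the underlying data table is available as model input.
    "data_table": [("2018", 252.3), ("2019", 261.6), ("2020", 270.7)],
    # Variation 2: models must first extract the data from the chart image.
    "image_path": "charts/0001.png",
    "reference_summary": (
        "Smartphone users in the U.S. grew steadily from 252.3 million "
        "in 2018 to 270.7 million in 2020."
    ),
}

# Placeholder output standing in for a baseline model's generated summary.
generated = ("The number of U.S. smartphone users rose from 252.3 million "
             "in 2018 to 270.7 million in 2020.")

# Corpus-level BLEU over a single (hypothesis, reference) pair.
bleu = sacrebleu.corpus_bleu([generated], [[chart_record["reference_summary"]]])
print(f"BLEU: {bleu.score:.2f}")
```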
Related papers
- On Pre-training of Multimodal Language Models Customized for Chart Understanding [83.99377088129282]
This paper explores the training processes necessary to improve MLLMs' comprehension of charts.
We introduce CHOPINLLM, an MLLM tailored for in-depth chart comprehension.
arXiv Detail & Related papers (2024-07-19T17:58:36Z)
- ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild [28.643565008567172]
We introduce ChartGemma, a novel chart understanding and reasoning model built on top of PaliGemma.
Rather than relying on underlying data tables, ChartGemma is trained on instruction-tuning data generated directly from chart images.
Our simple approach achieves state-of-the-art results across 5 benchmarks spanning chart summarization, question answering, and fact-checking.
arXiv Detail & Related papers (2024-07-04T22:16:40Z)
- TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning [83.58521787193293]
We present TinyChart, an efficient MLLM for chart understanding with only 3B parameters.
TinyChart overcomes two key challenges in efficient chart understanding: (1) it reduces the burden of learning numerical computations through a Program-of-Thoughts (PoT) learning strategy, and (2) it shortens the lengthy vision feature sequences produced by the vision transformer for high-resolution images through a Vision Token Merging module.
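As a rough illustration of the token-merging idea, the sketch below greedily averages the most similar neighbouring vision tokens; it is a simplified, hypothetical stand-in, not TinyChart's actual Vision Token Merging module.

```python
# Simplified, hypothetical sketch of similarity-based vision token merging;
# not TinyChart's actual module. Merges redundant neighbouring tokens so the
# decoder sees a shorter feature sequence for high-resolution chart images.
import torch
import torch.nn.functional as F

def merge_tokens(x: torch.Tensor, r: int) -> torch.Tensor:
    """x: (num_tokens, dim) vision features; merges up to r adjacent pairs."""
    x_unit = F.normalize(x, dim=-1)
    # Cosine similarity between each token and its right-hand neighbour.
    sim = (x_unit[:-1] * x_unit[1:]).sum(dim=-1)
    to_merge = set(sim.topk(min(r, sim.numel())).indices.tolist())

    merged, i = [], 0
    while i < x.shape[0]:
        if i in to_merge and i + 1 < x.shape[0]:
            merged.append((x[i] + x[i + 1]) / 2)  # average the redundant pair
            i += 2  # greedy: overlapping pairs may merge fewer than r tokens
        else:
            merged.append(x[i])
            i += 1
    return torch.stack(merged)

# e.g. 196 ViT patch tokens -> at most 32 fewer after one merging pass
reduced = merge_tokens(torch.randn(196, 768), r=32)
```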
arXiv Detail & Related papers (2024-04-25T14:23:24Z)
- ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning [54.89249749894061]
ChartAssistant is a vision-language model for universal chart comprehension and reasoning.
It undergoes a two-stage training process, starting with pre-training on chart-to-table parsing to align chart and text.
Experimental results demonstrate significant performance gains over the state-of-the-art UniChart and ChartLlama methods.
arXiv Detail & Related papers (2024-01-04T17:51:48Z)
- ChartLlama: A Multimodal LLM for Chart Understanding and Generation [70.1393163657813]
We first create a high-quality instruction-tuning dataset leveraging GPT-4.
We then introduce ChartLlama, a multimodal large language model trained on this dataset.
arXiv Detail & Related papers (2023-11-27T15:20:23Z)
- StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding [58.38480335579541]
Current chart-related tasks focus either on chart perception, which refers to extracting information from visual charts, or on reasoning over the extracted data.
In this paper, we aim to establish a unified and label-efficient learning paradigm for joint perception and reasoning tasks.
Experiments on various chart-related tasks demonstrate the effectiveness and promise of a unified chart perception-reasoning paradigm.
arXiv Detail & Related papers (2023-09-20T12:51:13Z)
- UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning [29.947053208614246]
We present UniChart, a pretrained model for chart comprehension and reasoning.
UniChart encodes the relevant text, data, and visual elements of charts and then uses a chart-grounded text decoder to generate the expected output in natural language.
We propose several chart-specific pretraining tasks that include: (i) low-level tasks to extract the visual elements (e.g., bars, lines) and data from charts, and (ii) high-level tasks to acquire chart understanding and reasoning skills.
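For a sense of how inference with such a chart-grounded encoder-decoder looks in practice, here is a hypothetical sketch using Hugging Face's Donut-style VisionEncoderDecoderModel interface; the checkpoint id and the task-prompt token are placeholders, not UniChart's published identifiers.

```python
# Hypothetical inference sketch for a chart-grounded vision encoder plus
# text decoder, via Hugging Face's VisionEncoderDecoderModel API.
# The checkpoint id and task-prompt token below are placeholders.
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

CHECKPOINT = "org/chart-encoder-decoder"  # placeholder repo id
processor = DonutProcessor.from_pretrained(CHECKPOINT)
model = VisionEncoderDecoderModel.from_pretrained(CHECKPOINT)

image = Image.open("chart.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Donut-style models condition generation on a task prompt token.
task_prompt = "<summarize_chart>"  # placeholder task token
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

output_ids = model.generate(
    pixel_values, decoder_input_ids=decoder_input_ids, max_length=128
)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```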
arXiv Detail & Related papers (2023-05-24T06:11:17Z)
- ChartSumm: A Comprehensive Benchmark for Automatic Chart Summarization of Long and Short Summaries [0.26097841018267615]
Automatic chart-to-text summarization is an effective tool for visually impaired people.
In this paper, we propose ChartSumm: a large-scale benchmark dataset consisting of a total of 84,363 charts.
arXiv Detail & Related papers (2023-04-26T15:25:24Z)
- ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules [89.75395046894809]
We present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks.
Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks.
Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model.
arXiv Detail & Related papers (2023-04-05T00:25:27Z)
- Chart-to-Text: Generating Natural Language Descriptions for Charts by Adapting the Transformer Model [6.320141734801679]
We introduce a new dataset and present a neural model for automatically generating natural language summaries for charts.
The generated summaries provide an interpretation of the chart and convey the key insights found within that chart.
arXiv Detail & Related papers (2020-10-18T23:57:33Z)