ChartReformer: Natural Language-Driven Chart Image Editing
- URL: http://arxiv.org/abs/2403.00209v2
- Date: Wed, 1 May 2024 06:14:44 GMT
- Title: ChartReformer: Natural Language-Driven Chart Image Editing
- Authors: Pengyu Yan, Mahesh Bhosale, Jay Lal, Bikhyat Adhikari, David Doermann
- Abstract summary: We propose ChartReformer, a natural language-driven chart image editing solution that directly edits the charts from the input images with the given instruction prompts.
To generalize ChartReformer, we define and standardize various types of chart editing, covering style, layout, format, and data-centric edits.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Chart visualizations are essential for data interpretation and communication; however, most charts are only accessible in image format and lack the corresponding data tables and supplementary information, making it difficult to alter their appearance for different application scenarios. To eliminate the need for the original underlying data and information when performing chart editing, we propose ChartReformer, a natural language-driven chart image editing solution that directly edits the charts from the input images with the given instruction prompts. The key to this method is that we allow the model to comprehend the chart and reason over the prompt to generate the corresponding underlying data table and visual attributes for new charts, enabling precise edits. Additionally, to generalize ChartReformer, we define and standardize various types of chart editing, covering style, layout, format, and data-centric edits. The experiments show promising results for natural language-driven chart image editing.
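The pipeline described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration: the model recovers the underlying data table and visual attributes from the chart image, an instruction is mapped to one of the standardized edit types (style, layout, format, or data-centric), and the edited chart is re-rendered from the updated table and attributes. The function names and attribute schema here are assumptions for illustration, not ChartReformer's actual interface.

```python
# Visual attributes and a data table a model might recover from a chart image.
attributes = {"chart_type": "bar", "color": "blue", "legend": False}
table = {"Q1": 10, "Q2": 14, "Q3": 9}

def apply_edit(table, attributes, edit_type, params):
    """Apply one standardized edit to the recovered table/attributes."""
    table, attributes = dict(table), dict(attributes)  # keep originals intact
    if edit_type == "style":        # e.g. recolor the series
        attributes["color"] = params["color"]
    elif ed_type := None:
        pass
    elif edit_type == "format":     # e.g. switch bar -> line
        attributes["chart_type"] = params["chart_type"]
    elif edit_type == "layout":     # e.g. toggle the legend
        attributes["legend"] = params["legend"]
    elif edit_type == "data":       # e.g. change one underlying data value
        table[params["key"]] = params["value"]
    return table, attributes

# "Make the bars red" would be parsed into a style edit:
table, attributes = apply_edit(table, attributes, "style", {"color": "red"})
print(attributes["color"])  # red
```

Rendering the new chart from the edited `(table, attributes)` pair is then a deterministic plotting step, which is what makes the edits precise without access to the original data.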
Related papers
- PlotEdit: Natural Language-Driven Accessible Chart Editing in PDFs via Multimodal LLM Agents [47.79080056618323]
PlotEdit is a novel multi-agent framework for natural language-driven end-to-end chart image editing.
PlotEdit orchestrates five LLM agents: Chart2Table for data table extraction, Chart2Vision for style identification, Chart2Code for retrieving rendering code, Instruction Decomposition Agent for parsing user requests into executable steps, and Multimodal Editing Agent for implementing nuanced chart component modifications.
PlotEdit outperforms existing baselines on the ChartCraft dataset across style, layout, format, and data-centric edits.
arXiv Detail & Related papers (2025-01-20T02:31:52Z) - On Pre-training of Multimodal Language Models Customized for Chart Understanding [83.99377088129282]
This paper explores the training processes necessary to improve MLLMs' comprehension of charts.
We introduce CHOPINLLM, an MLLM tailored for in-depth chart comprehension.
arXiv Detail & Related papers (2024-07-19T17:58:36Z) - ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild [28.643565008567172]
We introduce ChartGemma, a novel chart understanding and reasoning model developed over PaliGemma.
Rather than relying on underlying data tables, ChartGemma is trained on instruction-tuning data generated directly from chart images.
Our simple approach achieves state-of-the-art results across 5 benchmarks spanning chart summarization, question answering, and fact-checking.
arXiv Detail & Related papers (2024-07-04T22:16:40Z) - ChartAssistant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning [54.89249749894061]
ChartAssistant is a vision-language model for universal chart comprehension and reasoning.
It undergoes a two-stage training process, starting with pre-training on chart-to-table parsing to align chart and text.
Experimental results demonstrate significant performance gains over the state-of-the-art UniChart and ChartLlama methods.
arXiv Detail & Related papers (2024-01-04T17:51:48Z) - ChartCheck: Explainable Fact-Checking over Real-World Chart Images [11.172722085164281]
We introduce ChartCheck, a novel, large-scale dataset for explainable fact-checking against real-world charts.
We systematically evaluate ChartCheck using vision-language and chart-to-table models, and propose a baseline to the community.
arXiv Detail & Related papers (2023-11-13T16:35:29Z) - StructChart: On the Schema, Metric, and Augmentation for Visual Chart Understanding [54.45681512355684]
Current chart-related tasks focus either on chart perception, which extracts information from visual charts, or on chart reasoning over the extracted data.
We introduce StructChart, a novel framework that leverages Structured Triplet Representations (STR) to achieve a unified and label-efficient approach.
arXiv Detail & Related papers (2023-09-20T12:51:13Z) - ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules [89.75395046894809]
We present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks.
Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks.
Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model.
arXiv Detail & Related papers (2023-04-05T00:25:27Z) - Chart-to-Text: A Large-Scale Benchmark for Chart Summarization [9.647079534077472]
We present Chart-to-text, a large-scale benchmark with two datasets and a total of 44,096 charts.
We explain the dataset construction process and analyze the datasets.
arXiv Detail & Related papers (2022-03-12T17:01:38Z) - Graph Edit Distance Reward: Learning to Edit Scene Graph [69.39048809061714]
We propose a new method to edit scene graphs according to user instructions, a task that has not been explored before.
Specifically, to learn to edit scene graphs in line with the semantics given by the text, we propose a Graph Edit Distance Reward.
In the context of text-editing image retrieval, we validate the effectiveness of our method on the CSS and CRIR datasets.
arXiv Detail & Related papers (2020-08-15T04:52:16Z)
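A toy version of the idea behind the Graph Edit Distance Reward summarized above can be written in a few lines. Here scene graphs are simplified to sets of (subject, relation, object) triplets, and the edit distance is the number of triplet insertions and deletions needed to turn the edited graph into the target; the negated distance serves as the reward. This is a simplification for illustration, not the paper's exact formulation.

```python
def scene_graph_edit_distance(edited, target):
    """Count triplet insertions + deletions between two scene graphs."""
    edited, target = set(edited), set(target)
    return len(edited - target) + len(target - edited)

def reward(edited, target):
    """Lower edit distance yields a higher (less negative) reward."""
    return -scene_graph_edit_distance(edited, target)

target = {("man", "rides", "horse"), ("horse", "on", "field")}
edited = {("man", "rides", "horse"), ("man", "wears", "hat")}

print(reward(edited, target))  # -2: one spurious triplet, one missing triplet
```

A learned editor can then be trained to maximize this reward, pushing its edited graph toward the one described by the text.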
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.