ChartE$^{3}$: A Comprehensive Benchmark for End-to-End Chart Editing
- URL: http://arxiv.org/abs/2601.21694v1
- Date: Thu, 29 Jan 2026 13:29:27 GMT
- Authors: Shuo Li, Jiajun Sun, Zhekai Wang, Xiaoran Fan, Hui Li, Dingwen Yang, Zhiheng Xi, Yijun Wang, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
- Abstract summary: ChartE$^{3}$ is an End-to-End Chart Editing benchmark. It directly evaluates models without relying on intermediate natural language programs or code-level supervision. It contains over 1,200 high-quality samples constructed via a well-designed data pipeline with human curation.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Charts are a fundamental visualization format for structured data analysis. Enabling end-to-end chart editing according to user intent is of great practical value, yet remains challenging due to the need for both fine-grained control and global structural consistency. Most existing approaches adopt pipeline-based designs, where natural language or code serves as an intermediate representation, limiting their ability to faithfully execute complex edits. We introduce ChartE$^{3}$, an End-to-End Chart Editing benchmark that directly evaluates models without relying on intermediate natural language programs or code-level supervision. ChartE$^{3}$ focuses on two complementary editing dimensions: local editing, which involves fine-grained appearance changes such as font or color adjustments, and global editing, which requires holistic, data-centric transformations including data filtering and trend line addition. ChartE$^{3}$ contains over 1,200 high-quality samples constructed via a well-designed data pipeline with human curation. Each sample is provided as a triplet of a chart image, its underlying code, and a multimodal editing instruction, enabling evaluation from both objective and subjective perspectives. Extensive benchmarking of state-of-the-art multimodal large language models reveals substantial performance gaps, particularly on global editing tasks, highlighting critical limitations in current end-to-end chart editing capabilities.
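The abstract's two editing dimensions can be illustrated with a minimal matplotlib sketch. This is a hypothetical example (the data, function names, and instructions are invented, not drawn from the benchmark): a local edit changes only fine-grained appearance (bar color, font size), while a global edit performs a data-centric transformation (filtering the data and adding a trend line).

```python
# Hypothetical sketch of ChartE^3's local vs. global editing dimensions,
# using matplotlib; data and edit instructions are invented for illustration.
import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt
import numpy as np

years = np.array([2019, 2020, 2021, 2022, 2023])
sales = np.array([120, 95, 130, 160, 150])

def original_chart(ax):
    ax.bar(years, sales, color="steelblue")
    ax.set_title("Annual Sales")

def local_edit(ax):
    # Local edit: fine-grained appearance change only
    # ("make the bars orange and enlarge the title font")
    ax.bar(years, sales, color="darkorange")
    ax.set_title("Annual Sales", fontsize=16)

def global_edit(ax):
    # Global edit: data-centric transformation
    # ("keep only years >= 2021 and add a linear trend line")
    mask = years >= 2021
    x, y = years[mask], sales[mask]
    ax.bar(x, y, color="steelblue")
    slope, intercept = np.polyfit(x, y, 1)
    ax.plot(x, slope * x + intercept, color="red", linestyle="--")
    ax.set_title("Annual Sales (2021+) with Trend")

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, render in zip(axes, [original_chart, local_edit, global_edit]):
    render(ax)
fig.savefig("chart_edits.png")
```

In the benchmark's framing, an end-to-end model would receive the original chart image plus the instruction and produce the edited chart directly, rather than editing the code as done here; the sketch only makes the local/global distinction concrete.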
Related papers
- ChartEditBench: Evaluating Grounded Multi-Turn Chart Editing in Multimodal Language Models
We introduce ChartEditBench, a benchmark for incremental, visually grounded chart editing via code. Unlike prior one-shot benchmarks, ChartEditBench evaluates sustained, context-aware editing. Experiments with state-of-the-art MLLMs reveal substantial degradation in multi-turn settings due to error accumulation and breakdowns in shared context.
arXiv Detail & Related papers (2026-02-17T17:45:34Z)
- ChartAnchor: Chart Grounding with Structural-Semantic Fidelity
Chart grounding refers to the bidirectional alignment between a chart's visual appearance and its structured semantics. ChartAnchor is a benchmark of 8k+ chart-table-code triples spanning 30 chart types drawn from diverse real-world and augmented sources. A multi-level evaluation framework integrates semantic validation, stylistic analysis, and perceptual metrics to assess both structural and content-level correctness.
arXiv Detail & Related papers (2025-11-30T18:28:09Z)
- Charts Are Not Images: On the Challenges of Scientific Chart Editing
FigEdit is a benchmark for scientific figure editing comprising over 30,000 samples. Our benchmark demonstrates the profound limitations of pixel-level manipulation. By releasing FigEdit, we aim to enable systematic progress in structure-aware figure editing.
arXiv Detail & Related papers (2025-11-30T06:13:48Z)
- ChartEditor: A Reinforcement Learning Framework for Robust Chart Editing
We present ChartEditVista, a comprehensive benchmark consisting of 7,964 samples spanning 31 chart categories. The inputs in ChartEditVista include only the original chart image and natural language editing instructions, without the original chart codes. We also present ChartEditor, a model trained using a reinforcement learning framework that incorporates a novel rendering reward to simultaneously enforce code executability and visual fidelity.
arXiv Detail & Related papers (2025-11-19T09:27:37Z)
- BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning
We propose BigCharts, a dataset creation pipeline that generates visually diverse chart images. Unlike purely synthetic datasets, BigCharts incorporates real-world data, ensuring authenticity and visual diversity. By introducing novel reward signals specifically designed for chart reasoning, our approach enhances model robustness and generalization.
arXiv Detail & Related papers (2025-08-13T13:39:17Z)
- ChartM$^3$: Benchmarking Chart Editing with Multimodal Instructions
We introduce a novel paradigm for multimodal chart editing, where user intent is expressed through a combination of natural language and visual indicators. We present ChartM$^3$, a new benchmark for Multimodal chart editing with Multi-level complexity and Multi-perspective evaluation.
arXiv Detail & Related papers (2025-07-25T13:30:14Z)
- On Pre-training of Multimodal Language Models Customized for Chart Understanding
This paper explores the training processes necessary to improve MLLMs' comprehension of charts. We introduce CHOPINLLM, an MLLM tailored for in-depth chart comprehension.
arXiv Detail & Related papers (2024-07-19T17:58:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.