Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data
- URL: http://arxiv.org/abs/2503.16260v1
- Date: Thu, 20 Mar 2025 15:56:04 GMT
- Title: Chain of Functions: A Programmatic Pipeline for Fine-Grained Chart Reasoning Data
- Authors: Zijian Li, Jingjing Fu, Lei Song, Jiang Bian, Jun Zhang, Rui Wang
- Abstract summary: Existing methods leveraged (M)LLMs for data generation, but direct prompting often yields limited precision and diversity. We propose Chain of Functions (CoF), a novel programmatic reasoning data generation pipeline that utilizes freely-explored reasoning paths as supervision. CoF provides multiple benefits: 1) Precision: function-governed generation reduces hallucinations compared to freeform generation; 2) Diversity: enumerating function chains enables varied question taxonomies; 3) Explainability: function chains serve as built-in rationales, allowing fine-grained evaluation beyond overall accuracy; 4) Practicality: eliminating reliance on extremely large models.
- Score: 17.17642905547469
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Visual reasoning is crucial for multimodal large language models (MLLMs) to address complex chart queries, yet high-quality rationale data remains scarce. Existing methods leveraged (M)LLMs for data generation, but direct prompting often yields limited precision and diversity. In this paper, we propose Chain of Functions (CoF), a novel programmatic reasoning data generation pipeline that utilizes freely-explored reasoning paths as supervision to ensure data precision and diversity. Specifically, it starts with human-free exploration among the atomic functions (e.g., maximum data and arithmetic operations) to generate diverse function chains, which are then translated into linguistic rationales and questions with only a moderate open-sourced LLM. CoF provides multiple benefits: 1) Precision: function-governed generation reduces hallucinations compared to freeform generation; 2) Diversity: enumerating function chains enables varied question taxonomies; 3) Explainability: function chains serve as built-in rationales, allowing fine-grained evaluation beyond overall accuracy; 4) Practicality: eliminating reliance on extremely large models. Employing CoF, we construct the ChartCoF dataset, with 1.4k complex reasoning Q&A for fine-grained analysis and 50k Q&A for reasoning enhancement. The fine-grained evaluation on ChartCoF reveals varying performance across question taxonomies for each MLLM, and the experiments also show that finetuning with ChartCoF achieves state-of-the-art performance among same-scale MLLMs on widely used benchmarks. Furthermore, the novel paradigm of function-governed rationale generation in CoF could inspire broader applications beyond charts.
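To make the pipeline concrete, here is a minimal sketch of function-chain enumeration under stated assumptions: the toy chart data, the atomic functions (f_max, f_min, f_diff), and the question/rationale templates are all illustrative inventions, not the authors' implementation, and the real pipeline verbalizes chains with an open-sourced LLM rather than fixed templates.

```python
# Minimal sketch of function-chain enumeration for chart reasoning data.
# NOT the authors' code: atomic functions, chain length, and templates
# are illustrative assumptions.
from itertools import permutations

# Toy chart: category -> value
chart = {"Q1": 120, "Q2": 95, "Q3": 140, "Q4": 110}

# Atomic functions: each returns (value, rationale step)
def f_max(data):
    k = max(data, key=data.get)
    return data[k], f"the maximum value is {data[k]} (at {k})"

def f_min(data):
    k = min(data, key=data.get)
    return data[k], f"the minimum value is {data[k]} (at {k})"

def f_diff(a, b):
    return a - b, f"their difference is {a} - {b} = {a - b}"

def enumerate_chains(data):
    """Enumerate a simple two-step chain: pick two extrema, then combine."""
    selectors = {"max": f_max, "min": f_min}
    for (n1, s1), (n2, s2) in permutations(selectors.items(), 2):
        v1, r1 = s1(data)
        v2, r2 = s2(data)
        ans, r3 = f_diff(v1, v2)
        question = (f"What is the {n1}imum value minus the {n2}imum value "
                    f"in the chart?")
        rationale = f"First, {r1}. Next, {r2}. Finally, {r3}."
        yield {"chain": [n1, n2, "diff"], "question": question,
               "answer": ans, "rationale": rationale}

for sample in enumerate_chains(chart):
    print(sample["question"], "->", sample["answer"])
```

Because each chain is executed against the chart data, the answer is grounded and the rationale is correct by construction; this is the precision benefit the abstract describes.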
Related papers
- Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization [69.29207684569695]
Chain-of-thought (CoT) reasoning greatly improves the interpretability and problem-solving abilities of multimodal large language models (MLLMs).
Existing approaches focus on textual CoT, limiting their ability to leverage visual cues.
In this paper, we introduce Unsupervised Visual CoT (UV-CoT), a novel framework for image-level CoT reasoning via preference optimization.
arXiv Detail & Related papers (2025-04-25T14:48:18Z)
- Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts [5.309459419455988]
We propose a novel framework for generating causal graphs from narrative texts.
We introduce an "Expert Index," comprising seven linguistically informed features, integrated into a Situation-Task-Action-Consequence (STAC) classification model.
Experiments on 100 narrative chapters and short stories demonstrate that our approach consistently outperforms GPT-4o and Claude 3.5 in causal graph quality.
arXiv Detail & Related papers (2025-04-10T05:09:07Z)
- A Benchmark for End-to-End Zero-Shot Biomedical Relation Extraction with LLMs: Experiments with OpenAI Models [7.923208324118286]
We study patterns in the performance of OpenAI LLMs across a diverse sampling of biomedical relation extraction tasks.
We found the zero-shot performances to be close to those of fine-tuned methods.
arXiv Detail & Related papers (2025-04-05T07:08:54Z)
- Distill Visual Chart Reasoning Ability from LLMs to MLLMs [38.62832112530892]
Solving complex chart Q&A tasks requires advanced visual reasoning abilities in multimodal large language models (MLLMs).
We propose Code-as-Intermediary Translation (CIT), a cost-effective, efficient and easily scalable data synthesis method for distilling visual reasoning abilities from LLMs to MLLMs.
We employ text-based synthesis techniques to construct chart-plotting code and produce ReachQA, a dataset containing 3k reasoning-intensive charts and 20k Q&A pairs.
arXiv Detail & Related papers (2024-10-24T14:50:42Z)
- GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization [13.61818620609812]
We propose a lightweight yet effective unsupervised approach called GLIMMER: a Graph and LexIcal features based unsupervised Multi-docuMEnt summaRization approach.
It first constructs a sentence graph from the source documents, then automatically identifies semantic clusters by mining low-level features from raw texts.
Experiments conducted on Multi-News, Multi-XScience and DUC-2004 demonstrate that our approach outperforms existing unsupervised approaches.
arXiv Detail & Related papers (2024-08-19T16:01:48Z)
- GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models [58.08177466768262]
Long-context capabilities are essential for large language models (LLMs) to tackle complex and long-input tasks.
We introduce GraphReader, a graph-based agent system designed to handle long texts by structuring them into a graph and employing an agent to explore this graph autonomously.
Experimental results on the LV-Eval dataset reveal that GraphReader, using a 4k context window, consistently outperforms GPT-4-128k across context lengths from 16k to 256k by a large margin.
arXiv Detail & Related papers (2024-06-20T17:57:51Z)
- Optimizing Language Model's Reasoning Abilities with Weak Supervision [48.60598455782159]
We present PuzzleBen, a weakly supervised benchmark that comprises 25,147 complex questions, answers, and human-generated rationales.
A unique aspect of our dataset is the inclusion of 10,000 unannotated questions, enabling us to explore using less supervised data to boost LLMs' inference capabilities.
arXiv Detail & Related papers (2024-05-07T07:39:15Z)
- Integrating Graphs with Large Language Models: Methods and Prospects [68.37584693537555]
Large language models (LLMs) have emerged as frontrunners, showcasing unparalleled prowess in diverse applications.
Merging the capabilities of LLMs with graph-structured data has been a topic of keen interest.
This paper bifurcates such integrations into two predominant categories.
arXiv Detail & Related papers (2023-10-09T07:59:34Z)
- Multimodal Chain-of-Thought Reasoning in Language Models [94.70184390935661]
We propose Multimodal-CoT, which incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation from answer inference (a minimal sketch of this scheme appears after this list).
Experimental results on the ScienceQA and A-OKVQA benchmark datasets show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2023-02-02T07:51:19Z)
- MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text Generation [102.20036684996248]
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks, WebNLG and LogicNLG.
arXiv Detail & Related papers (2022-12-16T17:36:23Z)
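Following up the pointer in the Multimodal-CoT entry above, its two-stage scheme can be summarized in a few lines. This is a hedged sketch, not the paper's code: generate is a hypothetical stand-in for any multimodal sequence-to-sequence model call.

```python
# Hedged sketch of two-stage Multimodal-CoT inference; `generate` is a
# hypothetical multimodal model call, not the paper's actual API.
from typing import Any, Callable

def multimodal_cot(question: str,
                   image: Any,
                   generate: Callable[[str, Any], str]) -> str:
    # Stage 1: rationale generation conditioned on text + vision.
    rationale = generate(
        f"Question: {question}\nGenerate a step-by-step rationale.", image)
    # Stage 2: answer inference conditioned on text + vision + rationale.
    return generate(
        f"Question: {question}\nRationale: {rationale}\nAnswer:", image)
```

Keeping the stages separate allows the answer to condition on a fully formed multimodal rationale rather than producing rationale and answer in a single pass.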
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.