A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges
- URL: http://arxiv.org/abs/2412.11936v2
- Date: Tue, 18 Feb 2025 02:37:21 GMT
- Title: A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges
- Authors: Yibo Yan, Jiamin Su, Jianxiang He, Fangteng Fu, Xu Zheng, Yuanhuiyi Lyu, Kun Wang, Shen Wang, Qingsong Wen, Xuming Hu
- Abstract summary: This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs). We review over 200 studies published since 2021, and examine the state-of-the-art developments in Math-LLMs. In particular, we explore the multimodal mathematical reasoning pipeline, as well as the role of (M)LLMs and the associated methodologies.
- Score: 25.82535441866882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mathematical reasoning, a core aspect of human cognition, is vital across many domains, from educational problem-solving to scientific advancements. As artificial general intelligence (AGI) progresses, integrating large language models (LLMs) with mathematical reasoning tasks is becoming increasingly significant. This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs). We review over 200 studies published since 2021, and examine the state-of-the-art developments in Math-LLMs, with a focus on multimodal settings. We categorize the field into three dimensions: benchmarks, methodologies, and challenges. In particular, we explore the multimodal mathematical reasoning pipeline, as well as the role of (M)LLMs and the associated methodologies. Finally, we identify five major challenges hindering the realization of AGI in this domain, offering insights into future directions for enhancing multimodal reasoning capabilities. This survey serves as a critical resource for the research community in advancing the capabilities of LLMs to tackle complex multimodal reasoning tasks.
Related papers
- Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1) [66.51642638034822]
Reasoning is central to human intelligence, enabling structured problem-solving across diverse tasks.
Recent advances in large language models (LLMs) have greatly enhanced their reasoning abilities in arithmetic, commonsense, and symbolic domains.
This paper offers a concise yet insightful overview of reasoning techniques in both textual and multimodal LLMs.
arXiv Detail & Related papers (2025-04-04T04:04:56Z)
- Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models [86.45058529521258]
OlymMATH is a novel Olympiad-level mathematical benchmark designed to rigorously test the complex reasoning capabilities of LLMs.
OlymMATH features 200 meticulously curated problems, each manually verified and available in parallel English and Chinese versions.
arXiv Detail & Related papers (2025-03-27T11:20:17Z)
- Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey [124.23247710880008]
Multimodal CoT (MCoT) reasoning has recently garnered significant research attention.
Existing MCoT studies design various methodologies to address the challenges of image, video, speech, audio, 3D, and structured data.
We present the first systematic survey of MCoT reasoning, elucidating the relevant foundational concepts and definitions.
arXiv Detail & Related papers (2025-03-16T18:39:13Z)
- When Continue Learning Meets Multimodal Large Language Model: A Survey [7.250878248686215]
Fine-tuning MLLMs for specific tasks often causes performance degradation in the model's prior knowledge domain.
This review paper presents an overview and analysis of 440 research papers in this area.
arXiv Detail & Related papers (2025-02-27T03:39:10Z)
- Position: Multimodal Large Language Models Can Significantly Advance Scientific Reasoning [51.11965014462375]
Multimodal Large Language Models (MLLMs) integrate text, images, and other modalities.
This paper argues that MLLMs can significantly advance scientific reasoning across disciplines such as mathematics, physics, chemistry, and biology.
arXiv Detail & Related papers (2025-02-05T04:05:27Z)
- LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning [7.512199306943756]
We present a novel method to enhance Large Language Models' capabilities in mathematical reasoning tasks.
Motivated by the need to bridge this gap, our approach incorporates a question paraphrase strategy.
Specialized training objectives are employed to guide the model's learning process.
arXiv Detail & Related papers (2024-12-28T17:48:33Z)
- ErrorRadar: Benchmarking Complex Mathematical Reasoning of Multimodal Large Language Models Via Error Detection [60.297079601066784]
We introduce ErrorRadar, the first benchmark designed to assess MLLMs' capabilities in error detection.
ErrorRadar evaluates two sub-tasks: error step identification and error categorization.
It consists of 2,500 high-quality multimodal K-12 mathematical problems, collected from real-world student interactions.
Results indicate that significant challenges remain: GPT-4o, the best-performing model, is still around 10% behind human evaluation.
arXiv Detail & Related papers (2024-10-06T14:59:09Z)
- A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence.
This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z)
- Large Multimodal Agents: A Survey [78.81459893884737]
Large language models (LLMs) have achieved superior performance in powering text-based AI agents.
There is an emerging research trend focused on extending these LLM-powered AI agents into the multimodal domain.
This review aims to provide valuable insights and guidelines for future research in this rapidly evolving field.
arXiv Detail & Related papers (2024-02-23T06:04:23Z)
- Large Language Models for Mathematical Reasoning: Progresses and Challenges [15.925641169201747]
Large Language Models (LLMs) are geared towards the automated resolution of mathematical problems.
This survey endeavors to address four pivotal dimensions.
It provides a holistic perspective on the current state, accomplishments, and future challenges in this rapidly evolving field.
arXiv Detail & Related papers (2024-01-31T20:26:32Z)
- Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions [47.83142414018448]
We focus on two popular reasoning tasks: arithmetic reasoning and code generation.
We introduce (i) a general ontology of perturbations for math and coding questions, (ii) a semi-automatic method to apply these perturbations, and (iii) two datasets.
We show a significant performance drop across all the models against perturbed questions.
arXiv Detail & Related papers (2024-01-17T18:13:07Z)
- A Survey of Reasoning with Foundation Models [235.7288855108172]
Reasoning plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation.
We introduce seminal foundation models proposed or adaptable for reasoning.
We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models.
arXiv Detail & Related papers (2023-12-17T15:16:13Z)
- Mathematical Language Models: A Survey [29.419915295762692]
This paper conducts a comprehensive survey of mathematical LMs, systematically categorizing pivotal research endeavors from two distinct perspectives: tasks and methodologies.
The survey entails the compilation of over 60 mathematical datasets, including training datasets, benchmark datasets, and augmented datasets.
arXiv Detail & Related papers (2023-12-12T01:39:16Z)
- A Survey on Multimodal Large Language Models [71.63375558033364]
Multimodal Large Language Model (MLLM) represented by GPT-4V has been a new rising research hotspot. This paper aims to trace and summarize the recent progress of MLLMs.
arXiv Detail & Related papers (2023-06-23T15:21:52Z)