Related papers: Mathematical Language Models: A Survey

Mathematical Language Models: A Survey

URL: http://arxiv.org/abs/2312.07622v3
Date: Fri, 23 Feb 2024 14:00:04 GMT
Title: Mathematical Language Models: A Survey
Authors: Wentao Liu, Hanglei Hu, Jie Zhou, Yuyang Ding, Junsong Li, Jiayi Zeng, Mengliang He, Qin Chen, Bo Jiang, Aimin Zhou and Liang He
Abstract summary: This paper conducts a comprehensive survey of mathematical Language Models (LMs) The survey systematically categorizing pivotal research endeavors from two distinct perspectives: tasks and methodologies. The survey entails the compilation of over 60 mathematical datasets, including training datasets, benchmark datasets, and augmented datasets.
Score: 30.295544831040754
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: In recent years, there has been remarkable progress in leveraging Language Models (LMs), encompassing Pre-trained Language Models (PLMs) and Large-scale Language Models (LLMs), within the domain of mathematics. This paper conducts a comprehensive survey of mathematical LMs, systematically categorizing pivotal research endeavors from two distinct perspectives: tasks and methodologies. The landscape reveals a large number of proposed mathematical LLMs, which are further delineated into instruction learning, tool-based methods, fundamental CoT techniques, and advanced CoT methodologies. In addition, our survey entails the compilation of over 60 mathematical datasets, including training datasets, benchmark datasets, and augmented datasets. Addressing the primary challenges and delineating future trajectories within the field of mathematical LMs, this survey is positioned as a valuable resource, poised to facilitate and inspire future innovation among researchers invested in advancing this domain.

Related papers

Discrete Diffusion in Large Language and Multimodal Models: A Survey [56.31088116526825]
We provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs)<n>Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel decoding paradigm.<n>We trace the historical development of dLLMs and dMLLMs, formalize the underlying mathematical frameworks, and categorize representative models.
arXiv Detail & Related papers (2025-06-16T17:59:08Z)
A Survey on Large Language Models for Mathematical Reasoning [13.627895103752783]
This survey examines the development of mathematical reasoning abilities in large language models (LLMs)<n>We review methods for enhancing mathematical reasoning, ranging from training-free prompting to fine-tuning approaches such as supervised fine-tuning and reinforcement learning.<n>Despite notable progress, fundamental challenges remain in terms of capacity, efficiency, and generalization.
arXiv Detail & Related papers (2025-06-10T04:44:28Z)
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task [49.355810887265925]
We introduce MathFimer, a novel framework for mathematical reasoning step expansion. We develop a specialized model, MathFimer-7B, on our carefully curated NuminaMath-FIM dataset. We then apply these models to enhance existing mathematical reasoning datasets by inserting detailed intermediate steps into their solution chains.
arXiv Detail & Related papers (2025-02-17T11:22:24Z)
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework [81.29965270493238]
We develop a specialized dataset aimed at enhancing the evaluation and fine-tuning of large language models (LLMs) for wireless communication applications. The dataset includes a diverse set of multi-hop questions, including true/false and multiple-choice types, spanning varying difficulty levels from easy to hard. We introduce a Pointwise V-Information (PVI) based fine-tuning method, providing a detailed theoretical analysis and justification for its use in quantifying the information content of training data.
arXiv Detail & Related papers (2025-01-16T16:19:53Z)
A Survey of Mathematical Reasoning in the Era of Multimodal Large Language Model: Benchmark, Method & Challenges [25.82535441866882]
This survey provides the first comprehensive analysis of mathematical reasoning in the era of multimodal large language models (MLLMs) We review over 200 studies published since 2021, and examine the state-of-the-art developments in Math-LLMs. In particular, we explore multimodal mathematical reasoning pipeline, as well as the role of (M)LLMs and the associated methodologies.
arXiv Detail & Related papers (2024-12-16T16:21:41Z)
A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources. We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence. This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z)
SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation. Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
Benchmarking Large Language Models for Math Reasoning Tasks [12.91916443702145]
We compare seven state-of-the-art in-context learning algorithms for mathematical problem solving across five widely used mathematical datasets on four powerful foundation models. Our results indicate that larger foundation models like GPT-4o and LLaMA 3-70B can solve mathematical reasoning independently from the concrete prompting strategy. We open-source our benchmark code to support the integration of additional models in future research.
arXiv Detail & Related papers (2024-08-20T13:34:17Z)
Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient empowerment technique in the machine learning community. There is a significant gap in the literature regarding a systematic and thorough review of these techniques.
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models [43.37740735934396]
Text summarization research has undergone several significant transformations with the advent of deep neural networks, pre-trained language models (PLMs), and recent large language models (LLMs) This survey provides a comprehensive review of the research progress and evolution in text summarization through the lens of these paradigm shifts.
arXiv Detail & Related papers (2024-06-17T07:52:32Z)
A Survey of Multimodal Large Language Model from A Data-centric Perspective [46.57232264950785]
Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities. Data plays a pivotal role in the development and refinement of these models.
arXiv Detail & Related papers (2024-05-26T17:31:21Z)
Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques [37.60727548905253]
The integration of Large Language Models (LLMs) with Graph Representation Learning (GRL) marks a significant evolution in analyzing complex data structures. This collaboration harnesses the sophisticated linguistic capabilities of LLMs to improve the contextual understanding and adaptability of graph models. Despite a growing body of research dedicated to integrating LLMs into the graph domain, a comprehensive review that deeply analyzes the core components and operations is notably lacking.
arXiv Detail & Related papers (2024-02-04T05:51:14Z)
Large Language Models for Mathematical Reasoning: Progresses and Challenges [15.925641169201747]
Large Language Models (LLMs) are geared towards the automated resolution of mathematical problems. This survey endeavors to address four pivotal dimensions. It provides a holistic perspective on the current state, accomplishments, and future challenges in this rapidly evolving field.
arXiv Detail & Related papers (2024-01-31T20:26:32Z)
The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains. This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
Exploring the Potential of Large Language Models in Computational Argumentation [54.85665903448207]
Large language models (LLMs) have demonstrated impressive capabilities in understanding context and generating natural language. This work aims to embark on an assessment of LLMs, such as ChatGPT, Flan models, and LLaMA2 models, in both zero-shot and few-shot settings.
arXiv Detail & Related papers (2023-11-15T15:12:15Z)
MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs) We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles. Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.