Mathematical Language Models: A Survey
- URL: http://arxiv.org/abs/2312.07622v3
- Date: Fri, 23 Feb 2024 14:00:04 GMT
- Title: Mathematical Language Models: A Survey
- Authors: Wentao Liu, Hanglei Hu, Jie Zhou, Yuyang Ding, Junsong Li, Jiayi Zeng,
Mengliang He, Qin Chen, Bo Jiang, Aimin Zhou and Liang He
- Abstract summary: This paper conducts a comprehensive survey of mathematical Language Models (LMs).
The survey systematically categorizes pivotal research endeavors from two distinct perspectives: tasks and methodologies.
It also compiles over 60 mathematical datasets, including training datasets, benchmark datasets, and augmented datasets.
- Score: 30.295544831040754
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, there has been remarkable progress in leveraging Language
Models (LMs), encompassing Pre-trained Language Models (PLMs) and Large-scale
Language Models (LLMs), within the domain of mathematics. This paper conducts a
comprehensive survey of mathematical LMs, systematically categorizing pivotal
research endeavors from two distinct perspectives: tasks and methodologies. The
landscape reveals a large number of proposed mathematical LLMs, which are
further delineated into instruction learning, tool-based methods, fundamental
CoT techniques, and advanced CoT methodologies. In addition, our survey entails
the compilation of over 60 mathematical datasets, including training datasets,
benchmark datasets, and augmented datasets. Addressing the primary challenges
and delineating future trajectories within the field of mathematical LMs, this
survey is positioned as a valuable resource, poised to facilitate and inspire
future innovation among researchers invested in advancing this domain.
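The methodological split the abstract names (instruction learning, tool-based methods, fundamental and advanced CoT) is easiest to see in code. Below is a minimal sketch of fundamental chain-of-thought prompting, the simplest of these families; `complete` is a hypothetical stand-in for any LLM completion API, and the worked demonstration is illustrative rather than taken from the survey.

```python
# A minimal sketch of fundamental chain-of-thought (CoT) prompting for a math
# word problem. `complete` is a hypothetical stand-in for any text-completion
# API; it is not part of the surveyed systems.

COT_PROMPT = """\
Q: A library has 120 books and buys 4 boxes of 15 books each. How many books does it have now?
A: Let's think step by step. 4 boxes of 15 books is 4 * 15 = 60 books. 120 + 60 = 180. The answer is 180.

Q: {question}
A: Let's think step by step."""

def solve_with_cot(question: str, complete) -> str:
    """Elicit intermediate reasoning steps before the final answer."""
    return complete(COT_PROMPT.format(question=question))
```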
Related papers
- A Survey of Small Language Models [104.80308007044634]
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources.
We present a comprehensive survey on SLMs, focusing on their architectures, training techniques, and model compression techniques.
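As an illustration of the compression techniques such surveys cover, here is a minimal sketch of post-training int8 weight quantization; the symmetric per-tensor scheme is an illustrative choice, not a method prescribed by the paper.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~ scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    scale = scale if scale > 0 else 1.0  # guard against all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print(np.abs(w - dequantize_int8(q, s)).max())  # small reconstruction error
```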
arXiv Detail & Related papers (2024-10-25T23:52:28Z)
- A Survey on Multimodal Benchmarks: In the Era of Large AI Models [13.299775710527962]
Multimodal Large Language Models (MLLMs) have brought substantial advancements in artificial intelligence.
This survey systematically reviews 211 benchmarks that assess MLLMs across four core domains: understanding, reasoning, generation, and application.
arXiv Detail & Related papers (2024-09-21T15:22:26Z)
- SIaM: Self-Improving Code-Assisted Mathematical Reasoning of Large Language Models [54.78329741186446]
We propose a novel paradigm that uses a code-based critic model to guide steps including question-code data construction, quality control, and complementary evaluation.
Experiments across both in-domain and out-of-domain benchmarks in English and Chinese demonstrate the effectiveness of the proposed paradigm.
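The code-assisted paradigm is straightforward to sketch: the model writes a program for the question, the program is executed, and a critic filters the resulting question-code pairs. `complete` and `critic_score` below are hypothetical stand-ins, not SIaM's actual models.

```python
# A minimal sketch of code-assisted mathematical reasoning with critic-based
# quality control, in the spirit of SIaM. All callables are hypothetical.

def solve_with_code(question: str, complete) -> str:
    """Ask the model for a program, run it, and read off the answer."""
    prompt = (
        "Write Python that computes the answer to the question below and "
        f"stores it in a variable named `answer`.\nQuestion: {question}\n"
    )
    code = complete(prompt)
    scope: dict = {}
    exec(code, scope)  # run the generated program (sandbox this in practice)
    return str(scope.get("answer"))

def keep_example(question: str, code: str, answer: str,
                 critic_score, threshold: float = 0.5) -> bool:
    """Quality control: retain question-code pairs the critic rates highly."""
    return critic_score(question, code, answer) >= threshold
```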
arXiv Detail & Related papers (2024-08-28T06:33:03Z)
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities [89.40778301238642]
Model merging is an efficient technique that combines the parameters of multiple trained models without expensive retraining.
However, there is a significant gap in the literature regarding a systematic and thorough review of these techniques.
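For concreteness, the simplest instance of model merging is plain parameter averaging of checkpoints that share an architecture; the sketch below assumes PyTorch state dicts and is illustrative, not the paper's taxonomy.

```python
import torch

def average_merge(state_dicts, weights=None):
    """Merge checkpoints that share an architecture by (weighted)
    parameter averaging -- the simple "model soup" recipe."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        name: sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }
```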
arXiv Detail & Related papers (2024-08-14T16:58:48Z)
- A Systematic Survey of Text Summarization: From Statistical Methods to Large Language Models [43.37740735934396]
Text summarization research has undergone several significant transformations with the advent of deep neural networks, pre-trained language models (PLMs), and recent large language models (LLMs).
This survey provides a comprehensive review of the research progress and evolution in text summarization through the lens of these paradigm shifts.
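At the statistical end of that spectrum, extractive methods score sentences with simple corpus statistics; the frequency-based scorer below is a minimal illustrative sketch, not a system from the survey.

```python
import re
from collections import Counter

def extractive_summary(text: str, k: int = 2) -> str:
    """Score sentences by summed word frequency and keep the top k."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(s: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", s.lower()))

    top = set(sorted(sentences, key=score, reverse=True)[:k])
    return " ".join(s for s in sentences if s in top)  # keep original order
```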
arXiv Detail & Related papers (2024-06-17T07:52:32Z)
- A Survey of Multimodal Large Language Model from A Data-centric Perspective [46.57232264950785]
Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities.
Data plays a pivotal role in the development and refinement of these models.
arXiv Detail & Related papers (2024-05-26T17:31:21Z)
- Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques [37.60727548905253]
The integration of Large Language Models (LLMs) with Graph Representation Learning (GRL) marks a significant evolution in analyzing complex data structures.
This collaboration harnesses the sophisticated linguistic capabilities of LLMs to improve the contextual understanding and adaptability of graph models.
Despite a growing body of research dedicated to integrating LLMs into the graph domain, a comprehensive review that deeply analyzes the core components and operations is notably lacking.
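One common integration pattern is to flatten a graph into text so the LLM can reason over it; the edge-list encoding below is an illustrative choice, not a method prescribed by the survey.

```python
def graph_to_prompt(edges, question: str) -> str:
    """Encode a graph as natural-language edge statements for an LLM."""
    facts = [f"Node {u} is connected to node {v}." for u, v in edges]
    return "\n".join(facts) + f"\nQuestion: {question}\nAnswer:"

prompt = graph_to_prompt([(0, 1), (1, 2), (2, 0)],
                         "Is there a path from node 0 to node 2?")
print(prompt)
```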
arXiv Detail & Related papers (2024-02-04T05:51:14Z)
- The Efficiency Spectrum of Large Language Models: An Algorithmic Survey [54.19942426544731]
The rapid growth of Large Language Models (LLMs) has been a driving force in transforming various domains.
This paper examines the multi-faceted dimensions of efficiency essential for the end-to-end algorithmic development of LLMs.
arXiv Detail & Related papers (2023-12-01T16:00:25Z)
- Exploring the Potential of Large Language Models in Computational Argumentation [54.85665903448207]
Large language models (LLMs) have demonstrated impressive capabilities in understanding context and generating natural language.
This work assesses LLMs, such as ChatGPT, Flan models, and LLaMA2 models, in both zero-shot and few-shot settings.
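Zero-shot and few-shot evaluation differ only in whether in-context demonstrations are prepended to the query; the stance-classification prompt below is an illustrative sketch with made-up labels and examples, not the paper's actual protocol.

```python
def build_prompt(claim: str, demos=()) -> str:
    """Zero-shot when `demos` is empty; few-shot otherwise."""
    parts = ["Classify the stance of each claim as SUPPORT or ATTACK."]
    for text, label in demos:  # in-context demonstrations
        parts.append(f"Claim: {text}\nStance: {label}")
    parts.append(f"Claim: {claim}\nStance:")
    return "\n\n".join(parts)

zero_shot = build_prompt("Remote work increases productivity.")
few_shot = build_prompt(
    "Remote work increases productivity.",
    demos=[("Exercise improves health.", "SUPPORT")],
)
```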
arXiv Detail & Related papers (2023-11-15T15:12:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.