MinT: Boosting Generalization in Mathematical Reasoning via Multi-View
Fine-Tuning
- URL: http://arxiv.org/abs/2307.07951v1
- Date: Sun, 16 Jul 2023 05:41:53 GMT
- Title: MinT: Boosting Generalization in Mathematical Reasoning via Multi-View
Fine-Tuning
- Authors: Zhenwen Liang, Dian Yu, Xiaoman Pan, Wenlin Yao, Qingkai Zeng,
Xiangliang Zhang, Dong Yu
- Abstract summary: Reasoning in mathematical domains remains a significant challenge for relatively small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
- Score: 53.90744622542961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning in mathematical domains remains a significant challenge for
relatively small language models (LMs). Many current methods focus on
specializing LMs in mathematical reasoning and rely heavily on knowledge
distillation from powerful but inefficient large LMs (LLMs). In this work, we
explore a new direction that avoids over-reliance on LLM teachers, introducing
a multi-view fine-tuning method that efficiently exploits existing mathematical
problem datasets with diverse annotation styles. Our approach uniquely
considers the various annotation formats as different "views" and leverages
them in training the model. By appending distinct instructions to input
questions, models can learn to generate solutions in diverse formats in a
flexible manner. Experimental results show that our strategy enables a LLaMA-7B
model to outperform prior approaches that utilize knowledge distillation, as
well as carefully established baselines. Additionally, the proposed method
grants the models promising generalization ability across various views and
datasets, and the capability to learn from inaccurate or incomplete noisy data.
We hope our multi-view training paradigm could inspire future studies in other
machine reasoning domains.
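The abstract describes treating each annotation style as a "view" and appending a view-specific instruction to the question so one model learns to produce solutions in multiple formats. A minimal sketch of that data-construction step, assuming illustrative instruction strings and a simple input/target dataset layout (not the paper's exact prompts or format):

```python
# Sketch of multi-view training-data construction: each annotation style
# ("view") gets its own instruction appended to the question, so a single
# model can learn to emit a solution in whichever format the instruction
# requests. The instruction wording below is hypothetical.

VIEW_INSTRUCTIONS = {
    "equation": "Solve by writing a math equation.",
    "program": "Solve by writing a Python program.",
    "rationale": "Solve by reasoning step by step in natural language.",
}

def make_examples(question, solutions_by_view):
    """Turn one problem annotated in several styles into several
    instruction-tagged (input, target) training pairs."""
    examples = []
    for view, solution in solutions_by_view.items():
        prompt = f"{question} {VIEW_INSTRUCTIONS[view]}"
        examples.append({"input": prompt, "target": solution})
    return examples

examples = make_examples(
    "Tom has 3 apples and buys 4 more. How many apples does he have?",
    {"equation": "3 + 4 = 7",
     "rationale": "Tom starts with 3 apples, buys 4 more, so he has 7."},
)
```

One problem thus yields one training pair per available view, which is how a single dataset pass exposes the model to all annotation formats.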
Related papers
- Enhancing Fairness and Performance in Machine Learning Models: A Multi-Task Learning Approach with Monte-Carlo Dropout and Pareto Optimality [1.5498930424110338]
We propose a bias mitigation method based on multi-task learning.
We show how it can deliver the most desired trade-off between model fairness and performance.
arXiv Detail & Related papers (2024-04-12T04:17:50Z) - Supervised Fine-Tuning as Inverse Reinforcement Learning [8.044033685073003]
The prevailing approach to aligning Large Language Models (LLMs) typically relies on human or AI feedback.
In our work, we question the efficacy of such datasets and explore various scenarios where alignment with expert demonstrations proves more realistic.
arXiv Detail & Related papers (2024-03-18T17:52:57Z) - Model Composition for Multimodal Large Language Models [73.70317850267149]
We propose a new paradigm through the model composition of existing MLLMs to create a new model that retains the modal understanding capabilities of each original model.
Our basic implementation, NaiveMC, demonstrates the effectiveness of this paradigm by reusing modality encoders and merging LLM parameters.
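The NaiveMC blurb mentions merging LLM parameters across models. A toy sketch of the simplest such merge, an element-wise weighted average over matching parameter names; plain Python lists stand in for real tensors, and this is an illustration, not the authors' implementation:

```python
# Illustrative naive parameter merging: the shared-backbone weights of two
# fine-tuned models are averaged element-wise, while each model would keep
# its own modality encoder. Plain lists of floats stand in for tensors.

def merge_params(state_a, state_b, weight=0.5):
    """Element-wise weighted average of two parameter dictionaries that
    share the same keys and shapes."""
    assert state_a.keys() == state_b.keys()
    return {
        name: [weight * a + (1 - weight) * b
               for a, b in zip(state_a[name], state_b[name])]
        for name in state_a
    }

llm_a = {"layer0.weight": [0.2, 0.4], "layer0.bias": [0.0, 1.0]}
llm_b = {"layer0.weight": [0.6, 0.0], "layer0.bias": [0.2, 1.0]}
merged = merge_params(llm_a, llm_b)
```

With `weight=0.5` this is plain averaging; other weights interpolate between the two checkpoints.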
arXiv Detail & Related papers (2024-02-20T06:38:10Z) - The Curious Case of Nonverbal Abstract Reasoning with Multi-Modal Large
Language Models [20.177263185773153]
Multi-modal large language models (MLLMs) integrate verbal and visual information.
Despite the revolutionizing prospect of MLLMs, our understanding of their reasoning abilities is limited.
arXiv Detail & Related papers (2024-01-22T16:57:05Z) - Multi-View Class Incremental Learning [57.14644913531313]
Multi-view learning (MVL) has gained great success in integrating information from multiple perspectives of a dataset to improve downstream task performance.
This paper investigates a novel paradigm called multi-view class incremental learning (MVCIL), where a single model incrementally classifies new classes from a continual stream of views.
arXiv Detail & Related papers (2023-06-16T08:13:41Z) - Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-off between compute and performance when scaling vision-language models.
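The mechanism behind the MoE result above is sparse top-k routing: a router scores every expert per input, but only the k highest-scoring experts run, so compute stays roughly constant as expert count grows. A scalar toy sketch with made-up expert functions and router scores:

```python
# Minimal sketch of sparse top-k expert routing, the core of an MoE layer.
# Experts and router scores here are toy stand-ins for learned networks.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_scores, k=2):
    """Route input x to the k highest-scoring experts and combine their
    outputs, weighted by renormalized router probabilities."""
    top = sorted(range(len(experts)),
                 key=lambda i: router_scores[i], reverse=True)[:k]
    probs = softmax([router_scores[i] for i in top])
    return sum(p * experts[i](x) for p, i in zip(probs, top))

experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3]
out = moe_forward(4.0, experts, router_scores=[0.1, 2.0, 0.5], k=2)
```

Only the two selected experts execute; the third contributes no compute, which is why an MoE model can match a dense model of equivalent per-token cost while holding many more parameters.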
arXiv Detail & Related papers (2023-03-13T16:00:31Z) - Model-Based Deep Learning: On the Intersection of Deep Learning and
Optimization [101.32332941117271]
Decision making algorithms are used in a multitude of different applications.
Deep learning approaches that use highly parametric architectures tuned from data without relying on mathematical models are becoming increasingly popular.
Model-based optimization and data-centric deep learning are often considered to be distinct disciplines.
arXiv Detail & Related papers (2022-05-05T13:40:08Z) - Beyond Explaining: Opportunities and Challenges of XAI-Based Model
Improvement [75.00655434905417]
Explainable Artificial Intelligence (XAI) is an emerging research field bringing transparency to highly complex machine learning (ML) models.
This paper offers a comprehensive overview over techniques that apply XAI practically for improving various properties of ML models.
We show empirically through experiments on toy and realistic settings how explanations can help improve properties such as model generalization ability or reasoning.
arXiv Detail & Related papers (2022-03-15T15:44:28Z) - MeLIME: Meaningful Local Explanation for Machine Learning Models [2.819725769698229]
We show that our approach, MeLIME, produces more meaningful explanations compared to other techniques over different ML models.
MeLIME generalizes the LIME method, allowing more flexible perturbation sampling and the use of different local interpretable models.
arXiv Detail & Related papers (2020-09-12T16:06:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.