Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
- URL: http://arxiv.org/abs/2402.10654v1
- Date: Fri, 16 Feb 2024 13:02:11 GMT
- Title: Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
- Authors: Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che
- Abstract summary: We introduce Enhancing NumeriCal reasOning with Reliable procEsses (Encore), which derives a reliable reasoning process by decomposing the answer formula.
We present a series of pre-training tasks that help models learn reasoning process generation from synthesized data.
Experiments show that Encore yields an average improvement of 1.8% across all five experimental datasets.
- Score: 55.2326738851157
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Numerical reasoning is an essential ability for NLP systems to handle numeric
information. Recent research indicates that fine-tuning a small-scale model to
learn to generate reasoning processes alongside answers can significantly
enhance performance. However, current methods are limited in that most of them
generate reasoning processes with large language models (LLMs), which are
"unreliable" because such processes can contain information unrelated to the
answer. To address this limitation, we introduce Enhancing NumeriCal reasOning
with Reliable procEsses (Encore), which derives a reliable reasoning process by
decomposing the answer formula, ensuring that the process fully supports the
answer. Nevertheless, models may lack enough data to learn reasoning process
generation adequately, since our method produces only a single reasoning
process per formula. To overcome this difficulty, we present a series of
pre-training tasks that help models learn reasoning process generation from
synthesized data. Experiments show that Encore yields an average improvement
of 1.8% across all five experimental datasets, demonstrating the effectiveness
of our method.
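The abstract does not spell out the decomposition rules, but the core idea of deriving a reasoning process directly from the answer formula can be illustrated mechanically. The Python sketch below is a hypothetical illustration, not the authors' implementation: it splits an arithmetic answer formula into single-operation steps, so that every step is derived from, and therefore supports, the final answer.

```python
# Minimal sketch (assumed, not the Encore code): decompose an arithmetic
# answer formula into an ordered list of single-operation steps, so that
# every step comes from the formula itself and supports the answer.
import ast
import operator

_OPS = {ast.Add: ("+", operator.add), ast.Sub: ("-", operator.sub),
        ast.Mult: ("*", operator.mul), ast.Div: ("/", operator.truediv)}

def decompose(formula: str):
    """Return (steps, answer), where steps is a list of textual sub-steps."""
    steps = []

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            sym, fn = _OPS[type(node.op)]
            left, right = walk(node.left), walk(node.right)
            value = fn(left, right)
            steps.append(f"step {len(steps) + 1}: {left} {sym} {right} = {value}")
            return value
        raise ValueError(f"unsupported formula node: {ast.dump(node)}")

    answer = walk(ast.parse(formula, mode="eval"))
    return steps, answer

# Example: the formula behind "percentage increase from 800 to 1200".
steps, answer = decompose("(1200 - 800) / 800 * 100")
print("\n".join(steps))   # step 1: 1200 - 800 = 400, ...
print("answer:", answer)  # answer: 50.0
```

Because each step is read off the formula rather than free-generated by an LLM, the resulting process contains no information unrelated to the answer, which is the sense of "reliable" used in the abstract.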
Related papers
- Patience Is The Key to Large Language Model Reasoning [0.0]
We propose a simple method that encourages models to adopt a more patient reasoning style.
We generate detailed reasoning processes as positive examples and simple answers as negative examples, thereby training the model to favor thoroughness in its responses.
Our results demonstrate a performance increase of up to 6.7% on GSM8k after training on only a lightweight dataset.
arXiv Detail & Related papers (2024-11-20T07:20:48Z) - SRA-MCTS: Self-driven Reasoning Augmentation with Monte Carlo Tree Search for Code Generation [14.786100203787194]
Large language models demonstrate exceptional performance in simple code generation tasks but face challenges in tackling complex problems.
We propose a reasoning-augmented data generation process, SRA-MCTS, which guides the model to autonomously generate high-quality intermediate reasoning paths.
Our method operates entirely through the model itself without requiring additional supervision.
arXiv Detail & Related papers (2024-11-17T12:31:04Z) - The Surprising Effectiveness of Test-Time Training for Abstract Reasoning [64.36534512742736]
We investigate the effectiveness of test-time training (TTT) as a mechanism for improving models' reasoning capabilities.
TTT significantly improves performance on ARC tasks, achieving up to 6x improvement in accuracy compared to base fine-tuned models.
Our findings suggest that explicit symbolic search is not the only path to improved abstract reasoning in neural language models.
arXiv Detail & Related papers (2024-11-11T18:59:45Z) - A Comparative Study on Reasoning Patterns of OpenAI's o1 Model [69.08287909042421]
We show that OpenAI's o1 model has achieved the best performance on most datasets.
We also provide a detailed analysis of several reasoning benchmarks.
arXiv Detail & Related papers (2024-10-17T15:09:03Z) - General Purpose Verification for Chain of Thought Prompting [16.381123651223763]
We explore ways to improve the reasoning capabilities of Large Language Models (LLMs).
We propose three general principles that a model should adhere to while reasoning.
We apply these constraints to the reasoning steps generated by the LLM to improve the accuracy of the final generation.
arXiv Detail & Related papers (2024-04-30T21:15:17Z) - R-Tuning: Instructing Large Language Models to Say `I Don't Know' [66.11375475253007]
Large language models (LLMs) have revolutionized numerous domains with their impressive performance but still face challenges.
Previous instruction tuning methods force the model to complete a sentence regardless of whether it possesses the relevant knowledge.
We present a new approach called Refusal-Aware Instruction Tuning (R-Tuning).
Experimental results demonstrate R-Tuning effectively improves a model's ability to answer known questions and refrain from answering unknown questions.
arXiv Detail & Related papers (2023-11-16T08:45:44Z) - Sci-CoT: Leveraging Large Language Models for Enhanced Knowledge Distillation in Small Models for Scientific QA [5.117094291273979]
Large Language Models (LLMs) have shown outstanding performance across a wide range of downstream tasks.
We propose Sci-CoT, a two-stage framework that separates the processes of generating rationales and inferring answers.
Our 80-million-parameter model is able to exceed the performance of BLOOM-176B on the ARC-Easy dataset under the few-shot setting.
arXiv Detail & Related papers (2023-08-09T03:18:07Z) - BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z) - Logic-Guided Data Augmentation and Regularization for Consistent Question Answering [55.05667583529711]
This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions.
Our method leverages logical and linguistic knowledge to augment labeled training data and then uses a consistency-based regularizer to train the model.
arXiv Detail & Related papers (2020-04-21T17:03:08Z) - An Hybrid Method for the Estimation of the Breast Mechanical Parameters [0.9176056742068814]
An accurate numerical breast model can assist surgeons by providing visual information about the breast resulting from a surgical simulation.
The process of finding the model parameters requires numeric inputs, based either on medical imaging techniques or on other measurements.
Inverse elasticity solvers are highly robust and provide solutions within the required degree of accuracy.
Deep-learning methods, such as neural networks, can provide accurate results in the majority of cases.
arXiv Detail & Related papers (2020-03-09T11:21:37Z)