APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning
- URL: http://arxiv.org/abs/2212.07249v3
- Date: Tue, 12 Mar 2024 13:30:16 GMT
- Title: APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning
- Authors: Jiashuo Sun, Hang Zhang, Chen Lin, Xiangdong Su, Yeyun Gong, Jian Guo
- Abstract summary: We propose APOLLO to improve the long-form numerical reasoning framework.
For the retriever, we adopt a number-aware negative sampling strategy to enable the retriever to be more discriminative on key numerical facts.
For the generator, we design consistency-based reinforcement learning and target program augmentation strategy.
- Score: 31.252979262232124
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-form numerical reasoning in financial analysis aims to generate a
reasoning program to calculate the correct answer for a given question.
Previous work followed a retriever-generator framework, where the retriever
selects key facts from a long-form document, and the generator generates a
reasoning program based on retrieved facts. However, they treated all facts
equally without considering the different contributions of facts with and
without numbers. Meanwhile, the program consistency were ignored under
supervised training, resulting in lower training accuracy and diversity. To
solve these problems, we proposed APOLLO to improve the long-form numerical
reasoning framework. For the retriever, we adopt a number-aware negative
sampling strategy to enable the retriever to be more discriminative on key
numerical facts. For the generator, we design consistency-based reinforcement
learning and target program augmentation strategy based on the consistency of
program execution results. Experimental results on the FinQA and ConvFinQA
leaderboard verify the effectiveness of our proposed method, achieving the new
state-of-the-art.
Related papers
- Case-Based Reasoning Approach for Solving Financial Question Answering [5.10832476049103]
FinQA introduced a numerical reasoning dataset for financial documents.
We propose a novel approach to tackle numerical reasoning problems using case based reasoning (CBR)
Our model retrieves relevant cases to address a given question, and then generates an answer based on the retrieved cases and contextual information.
arXiv Detail & Related papers (2024-05-18T10:06:55Z) - Learning Planning-based Reasoning by Trajectories Collection and Process Reward Synthesizing [61.98556945939045]
We propose a framework to learn planning-based reasoning through Direct Preference Optimization (DPO) on collected trajectories.
Our results on challenging logical reasoning benchmarks demonstrate the effectiveness of our learning framework.
arXiv Detail & Related papers (2024-02-01T15:18:33Z) - Zero-Shot Question Answering over Financial Documents using Large
Language Models [0.18749305679160366]
We introduce a large language model (LLM) based approach to answer complex questions requiring multi-hop numerical reasoning over financial reports.
We use novel zero-shot prompts that guide the LLM to encode the required reasoning into a Python program or a domain specific language.
arXiv Detail & Related papers (2023-11-19T16:23:34Z) - Let's Reinforce Step by Step [10.65244642965387]
We use Reinforcement Learning from Human Feedback to shape model reasoning processes.
Our results show that the fine-grained reward provided by PRM-based methods enhances accuracy on simple mathematical reasoning.
We also show the critical role reward aggregation functions play in model performance.
arXiv Detail & Related papers (2023-11-10T01:35:51Z) - Let's reward step by step: Step-Level reward model as the Navigators for
Reasoning [64.27898739929734]
Process-Supervised Reward Model (PRM) furnishes LLMs with step-by-step feedback during the training phase.
We propose a greedy search algorithm that employs the step-level feedback from PRM to optimize the reasoning pathways explored by LLMs.
To explore the versatility of our approach, we develop a novel method to automatically generate step-level reward dataset for coding tasks and observed similar improved performance in the code generation tasks.
arXiv Detail & Related papers (2023-10-16T05:21:50Z) - Comprehensive Solution Program Centric Pretraining for Table-and-Text
Hybrid Numerical Reasoning [21.708394374594082]
Numerical reasoning over table-and-text hybrid passages, such as financial reports, poses significant challenges.
coarse-grained supervision of the whole solution program has impeded the model's ability to learn the underlying numerical reasoning process.
We propose three pretraining tasks that operate at both the whole program and sub-program level.
arXiv Detail & Related papers (2023-05-12T13:44:40Z) - Hierarchical Programmatic Reinforcement Learning via Learning to Compose
Programs [58.94569213396991]
We propose a hierarchical programmatic reinforcement learning framework to produce program policies.
By learning to compose programs, our proposed framework can produce program policies that describe out-of-distributionally complex behaviors.
The experimental results in the Karel domain show that our proposed framework outperforms baselines.
arXiv Detail & Related papers (2023-01-30T14:50:46Z) - NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual
Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z) - CodeRL: Mastering Code Generation through Pretrained Models and Deep
Reinforcement Learning [92.36705236706678]
"CodeRL" is a new framework for program synthesis tasks through pretrained LMs and deep reinforcement learning.
During inference, we introduce a new generation procedure with a critical sampling strategy.
For the model backbones, we extended the encoder-decoder architecture of CodeT5 with enhanced learning objectives.
arXiv Detail & Related papers (2022-07-05T02:42:15Z) - A Robustly Optimized Long Text to Math Models for Numerical Reasoning On
FinQA [2.93888900363581]
FinQA challenge has been organized to strengthen the study on numerical reasoning.
Our approach achieves the 1st place in FinQA challenge, with 71.93% execution accuracy and 67.03% program accuracy.
arXiv Detail & Related papers (2022-06-29T12:10:18Z) - Enforcing Consistency in Weakly Supervised Semantic Parsing [68.2211621631765]
We explore the use of consistency between the output programs for related inputs to reduce the impact of spurious programs.
We find that a more consistent formalism leads to improved model performance even without consistency-based training.
arXiv Detail & Related papers (2021-07-13T03:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.