Related papers: One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning

URL: http://arxiv.org/abs/2510.01526v1
Date: Wed, 01 Oct 2025 23:45:45 GMT
Title: One More Question is Enough, Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning
Authors: Mengyu Wang, Sotirios Sabanis, Miguel de Carvalho, Shay B. Cohen, Tiejun Ma
Abstract summary: Expert Question Decomposition (EQD) is designed to balance the use of domain knowledge with computational efficiency. It requires only a few thousand training examples and a single A100 GPU for fine-tuning, with inference time comparable to zero-shot prompting. We evaluate EQD in the financial domain, characterized by specialized knowledge and complex quantitative reasoning, across four benchmark datasets.
Score: 24.581052880693765
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Domain-specific quantitative reasoning remains a major challenge for large language models (LLMs), especially in fields requiring expert knowledge and complex question answering (QA). In this work, we propose Expert Question Decomposition (EQD), an approach designed to balance the use of domain knowledge with computational efficiency. EQD is built on a two-step fine-tuning framework and guided by a reward function that measures the effectiveness of generated sub-questions in improving QA outcomes. It requires only a few thousand training examples and a single A100 GPU for fine-tuning, with inference time comparable to zero-shot prompting. Beyond its efficiency, EQD outperforms state-of-the-art domain-tuned models and advanced prompting strategies. We evaluate EQD in the financial domain, characterized by specialized knowledge and complex quantitative reasoning, across four benchmark datasets. Our method consistently improves QA performance by 0.6% to 10.5% across different LLMs. Our analysis reveals an important insight: in domain-specific QA, a single supporting question often provides greater benefit than detailed guidance steps.
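The abstract says the reward function measures how much a generated sub-question improves QA outcomes, but it does not give the formula. The following is a minimal sketch under an assumed definition: the reward is the change in answer correctness when the sub-question is supplied. All names here (`Answerer`, `is_correct`, `sub_question_reward`, `toy_answerer`) and the tolerance value are hypothetical and are not taken from the paper.

```python
from typing import Callable, Optional

# Hypothetical interface: maps (question, optional supporting sub-question)
# to an answer string. In practice this would wrap an LLM call.
Answerer = Callable[[str, Optional[str]], str]


def is_correct(predicted: str, gold: str, tol: float = 1e-2) -> bool:
    """Compare numerically when both values parse as floats, else as text."""
    try:
        p, g = float(predicted), float(gold)
    except ValueError:
        return predicted.strip().lower() == gold.strip().lower()
    # Relative tolerance, with a floor of 1.0 so small gold values still work.
    return abs(p - g) <= tol * max(1.0, abs(g))


def sub_question_reward(
    question: str, sub_question: str, gold: str, answer: Answerer
) -> float:
    """Assumed reward: correctness with the sub-question minus without it.

    Returns 1.0 if the sub-question fixes a wrong answer, -1.0 if it breaks
    a correct one, and 0.0 if it makes no difference.
    """
    baseline = is_correct(answer(question, None), gold)
    assisted = is_correct(answer(question, sub_question), gold)
    return float(assisted) - float(baseline)


def toy_answerer(question: str, sub_question: Optional[str]) -> str:
    # Stand-in for a model: answers "0.25" only when a sub-question is given.
    return "0.25" if sub_question else "0.40"


reward = sub_question_reward(
    "What is the gross margin?",
    "What are revenue and cost of goods sold?",
    "0.25",
    toy_answerer,
)
print(reward)
```

With the toy answerer, the sub-question turns a wrong answer into a correct one, so the sketch assigns it a positive reward. A real setup would replace `toy_answerer` with model calls and would likely average the reward over several sampled answers.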

Related papers

Benchmarking Uncertainty Calibration in Large Language Model Long-Form Question Answering [7.1559850008795385]
Large Language Models (LLMs) are commonly used in Question Answering (QA) settings. Existing UQ approaches remain weakly validated in scientific QA. We introduce the first large-scale benchmark for evaluating UQ metrics in reasoning-demanding QA.
arXiv Detail & Related papers (2026-01-30T20:02:34Z)
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study [59.44848132298657]
Post-training quantization (PTQ) usually comes with the cost of large accuracy drops, especially for reasoning tasks under low-bit settings. In this study, we present a systematic empirical study of quantization-aware training (QAT) for reasoning models.
arXiv Detail & Related papers (2026-01-21T11:22:29Z)
Building Domain-Specific Small Language Models via Guided Data Generation [5.404790079646315]
Large Language Models (LLMs) have shown remarkable success in supporting a wide range of knowledge-intensive tasks. In specialized domains, there is growing interest in leveraging LLMs to assist subject matter experts with domain-specific challenges. Many open-source models demand significant computational resources for effective domain adaptation and deployment. We present a cost-efficient and scalable training pipeline that combines guided synthetic data generation from a small seed corpus with bottom-up domain data.
arXiv Detail & Related papers (2025-11-23T07:19:31Z)
General-Reasoner: Advancing LLM Reasoning Across All Domains [64.70599911897595]
Reinforcement learning (RL) has recently demonstrated strong potential in enhancing the reasoning capabilities of large language models (LLMs). We propose General-Reasoner, a novel training paradigm designed to enhance LLM reasoning capabilities across diverse domains. We train a series of models and evaluate them on a wide range of datasets covering domains such as physics, chemistry, finance, and electronics.
arXiv Detail & Related papers (2025-05-20T17:41:33Z)
ExpertGenQA: Open-ended QA generation in Specialized Domains [9.412082058055823]
ExpertGenQA is a protocol that combines few-shot learning with structured topic and style categorization to generate comprehensive domain-specific QA pairs. We show that ExpertGenQA achieves twice the efficiency of baseline few-shot approaches while maintaining 94.4% topic coverage.
arXiv Detail & Related papers (2025-03-04T19:09:48Z)
Analyzing the Effectiveness of Quantum Annealing with Meta-Learning [7.251305766151019]
We propose a new methodology to study the effectiveness of Quantum Annealing (QA) based on meta-learning models. We build a dataset composed of more than five thousand instances of ten different optimization problems. We define a set of more than a hundred features to describe their characteristics, and solve them with both QA and three classical solvers.
arXiv Detail & Related papers (2024-08-01T14:03:11Z)
KaPQA: Knowledge-Augmented Product Question-Answering [59.096607961704656]
We introduce two product question-answering (QA) datasets focused on Adobe Acrobat and Photoshop products. We also propose a novel knowledge-driven RAG-QA framework to enhance the performance of the models in the product QA task.
arXiv Detail & Related papers (2024-07-22T22:14:56Z)
Performance Prediction for Multi-hop Questions [7.388002745070808]
We propose multHP, a novel pre-retrieval method for predicting the performance of open-domain multi-hop questions. Our evaluation shows that the proposed model is a strong predictor of the performance, outperforming traditional single-hop QPP models.
arXiv Detail & Related papers (2023-08-12T01:34:41Z)
A Survey for Efficient Open Domain Question Answering [51.67110249787223]
Open domain question answering (ODQA) is a longstanding task in natural language processing (NLP) that aims to answer factual questions from a large knowledge corpus without any explicit evidence.
arXiv Detail & Related papers (2022-11-15T04:18:53Z)
ProQA: Structural Prompt-based Pre-training for Unified Question Answering [84.59636806421204]
ProQA is a unified QA paradigm that solves various tasks through a single model. It concurrently models the knowledge generalization for all QA tasks while keeping the knowledge customization for every specific QA task. ProQA consistently boosts performance in full-data fine-tuning, few-shot learning, and zero-shot testing scenarios.
arXiv Detail & Related papers (2022-05-09T04:59:26Z)
Synthetic Question Value Estimation for Domain Adaptation of Question Answering [31.003053719921628]
We introduce a novel idea of training a question value estimator (QVE) that directly estimates the usefulness of synthetic questions for improving the target-domain QA performance. By using such questions and only around 15% of the human annotations on the target domain, we can achieve comparable performance to the fully-supervised baselines.
arXiv Detail & Related papers (2022-03-16T20:22:31Z)
Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation [11.733609600774306]
Question answering systems are typically evaluated against manually annotated finite sets of one or more answers. This leads to a coverage limitation that results in underestimating the true performance of systems. We present the first systematic conceptual and data-driven analysis to examine the shortcomings of token-level equivalence measures.
arXiv Detail & Related papers (2022-02-15T18:53:58Z)
Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA). First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA). Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.