ExpertGenQA: Open-ended QA generation in Specialized Domains
- URL: http://arxiv.org/abs/2503.02948v1
- Date: Tue, 04 Mar 2025 19:09:48 GMT
- Title: ExpertGenQA: Open-ended QA generation in Specialized Domains
- Authors: Haz Sameen Shahgir, Chansong Lim, Jia Chen, Evangelos E. Papalexakis, Yue Dong
- Abstract summary: ExpertGenQA is a protocol that combines few-shot learning with structured topic and style categorization to generate comprehensive domain-specific QA pairs. We show that ExpertGenQA achieves twice the efficiency of baseline few-shot approaches while maintaining $94.4\%$ topic coverage.
- Score: 9.412082058055823
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating high-quality question-answer pairs for specialized technical domains remains challenging, with existing approaches facing a tradeoff between leveraging expert examples and achieving topical diversity. We present ExpertGenQA, a protocol that combines few-shot learning with structured topic and style categorization to generate comprehensive domain-specific QA pairs. Using U.S. Federal Railroad Administration documents as a test bed, we demonstrate that ExpertGenQA achieves twice the efficiency of baseline few-shot approaches while maintaining $94.4\%$ topic coverage. Through systematic evaluation, we show that current LLM-based judges and reward models exhibit strong bias toward superficial writing styles rather than content quality. Our analysis using Bloom's Taxonomy reveals that ExpertGenQA better preserves the cognitive complexity distribution of expert-written questions compared to template-based approaches. When used to train retrieval models, our generated queries improve top-1 accuracy by $13.02\%$ over baseline performance, demonstrating their effectiveness for downstream applications in technical domains.
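As a concrete illustration of the protocol the abstract describes, here is a minimal sketch of a few-shot generation loop crossed with a structured topic and style grid. The taxonomy entries, prompt wording, and `call_llm` helper are hypothetical placeholders, not the authors' implementation.

```python
# Sketch of an ExpertGenQA-style loop: expert-written few-shot exemplars are
# combined with every (topic, style) cell so topic coverage is enforced by
# construction. All names below are illustrative assumptions.
from itertools import product

TOPICS = ["track inspection", "signal maintenance"]   # hypothetical topic taxonomy
STYLES = ["procedural", "scenario-based"]             # hypothetical style categories

FEW_SHOT = (
    "Q: How often must Class 4 track be inspected?\n"
    "A: <expert-written answer>\n"
)  # a handful of expert QA exemplars would go here

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def generate_qa_pairs(corpus_chunks: list[str]):
    pairs = []
    for chunk in corpus_chunks:
        for topic, style in product(TOPICS, STYLES):
            prompt = (
                f"{FEW_SHOT}\n"
                f"Using the passage below, write one {style} question about {topic}, "
                f"followed by its answer.\n\nPassage:\n{chunk}\nQ:"
            )
            pairs.append((topic, style, call_llm(prompt)))
    return pairs
```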
Related papers
- Optimal Query Allocation in Extractive QA with LLMs: A Learning-to-Defer Framework with Theoretical Guarantees [3.4289478404209826]
Large Language Models excel in generative tasks but exhibit inefficiencies in structured text selection.
We propose a Learning-to-Defer framework that allocates queries to specialized experts, ensuring high-confidence predictions.
arXiv Detail & Related papers (2024-10-21T08:21:00Z)
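A hedged sketch of the allocation rule such a framework implies: route each query to the most confident specialized expert and fall back to a base model otherwise. The `Expert` container and the 0.8 threshold are assumptions, not the paper's formulation.

```python
# Toy learning-to-defer allocation: defer to the most confident expert,
# otherwise keep the query with the base model. Threshold is an assumption.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Expert:
    answer: Callable[[str], str]
    confidence: Callable[[str], float]  # estimated probability of being correct

def allocate(query: str, base: Expert, experts: List[Expert], tau: float = 0.8) -> str:
    best = max(experts, key=lambda e: e.confidence(query))
    return best.answer(query) if best.confidence(query) >= tau else base.answer(query)
```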
- A RAG Approach for Generating Competency Questions in Ontology Engineering [1.0044270899550196]
With the emergence of Large Language Models (LLMs), it becomes possible to automate and enhance this process. We present a retrieval-augmented generation (RAG) approach that uses LLMs for the automatic generation of CQs. We conduct experiments using GPT-4 on two domain engineering tasks and compare results against ground-truth CQs constructed by domain experts.
arXiv Detail & Related papers (2024-09-13T13:34:32Z)
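The RAG pattern this summary describes can be sketched as retrieve-then-prompt; `retrieve` and `complete` below stand in for a real vector store and the GPT-4 client used in the paper, and the prompt wording is an assumption.

```python
# Sketch of retrieval-augmented competency question (CQ) generation: fetch
# domain passages, then ask an LLM to draft CQs grounded in them.
from typing import Callable, List

def generate_cqs(domain: str,
                 retrieve: Callable[[str, int], List[str]],
                 complete: Callable[[str], str],
                 k: int = 5) -> str:
    context = "\n".join(retrieve(domain, k))
    prompt = (
        "You are assisting with ontology engineering.\n"
        f"Domain: {domain}\n"
        f"Relevant source passages:\n{context}\n"
        "Write ten competency questions the ontology should be able to answer."
    )
    return complete(prompt)
```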
- Multi-Modal Prompt Learning on Blind Image Quality Assessment [65.0676908930946]
Image Quality Assessment (IQA) models benefit significantly from semantic information, which allows them to treat different types of objects distinctly.
Traditional methods, hindered by a lack of sufficiently annotated data, have employed the CLIP image-text pretraining model as their backbone to gain semantic awareness.
Recent approaches have attempted to address this mismatch using prompt technology, but these solutions have shortcomings.
This paper introduces an innovative multi-modal prompt-based methodology for IQA.
arXiv Detail & Related papers (2024-04-23T11:45:32Z)
- PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models [72.57329554067195]
ProxyQA is an innovative framework dedicated to assessing long-form text generation.
It comprises in-depth human-curated meta-questions spanning various domains, each accompanied by specific proxy-questions with pre-annotated answers.
It assesses the generated content's quality through the evaluator's accuracy in addressing the proxy-questions.
arXiv Detail & Related papers (2024-01-26T18:12:25Z)
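The scoring idea reduces to a simple loop: answer each proxy-question from the generated text alone and count matches against the pre-annotated answers. Exact-match scoring and the `answer_from` callable are simplifying assumptions.

```python
# Sketch of ProxyQA-style scoring: long-form output is judged by how many
# proxy-questions an evaluator can answer correctly from it.
from typing import Callable, List, Tuple

def proxy_score(generated: str,
                proxies: List[Tuple[str, str]],              # (question, gold answer)
                answer_from: Callable[[str, str], str]) -> float:
    hits = sum(
        answer_from(generated, q).strip().lower() == gold.strip().lower()
        for q, gold in proxies
    )
    return hits / len(proxies)
```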
- Fusing Models with Complementary Expertise [42.099743709292866]
We consider the Fusion of Experts (FoE) problem of fusing outputs of expert models with complementary knowledge of the data distribution.
Our method is applicable to both discriminative and generative tasks.
We extend our method to the "frugal" setting where it is desired to reduce the number of expert model evaluations at test time.
arXiv Detail & Related papers (2023-10-02T18:31:35Z)
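For the discriminative case, fusing complementary experts can be sketched as a small network over the concatenated expert probability vectors; the architecture and dimensions below are illustrative assumptions, not the paper's model.

```python
# Toy Fusion-of-Experts head: map stacked expert class probabilities to a
# fused prediction. Hidden size and depth are arbitrary choices.
import torch
import torch.nn as nn

class Fuser(nn.Module):
    def __init__(self, n_experts: int, n_classes: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_experts * n_classes, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, expert_probs: torch.Tensor) -> torch.Tensor:
        # expert_probs: (batch, n_experts, n_classes)
        return self.net(expert_probs.flatten(start_dim=1))
```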
- Performance Prediction for Multi-hop Questions [7.388002745070808]
We propose multHP, a novel pre-retrieval method for predicting the performance of open-domain multi-hop questions.
Our evaluation shows that the proposed model is a strong predictor of performance, outperforming traditional single-hop QPP models.
arXiv Detail & Related papers (2023-08-12T01:34:41Z)
- Few-shot Unified Question Answering: Tuning Models or Prompts? [23.885286975673644]
The paper explores the potential of two tuning paradigms, model tuning and prompt tuning, for unified QA under a low-resource setting.
The research offers insights into the advantages and limitations of prompt tuning for unified QA in a few-shot setting.
arXiv Detail & Related papers (2023-05-23T23:14:38Z)
- Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification [59.698811329287174]
We leverage GPT-2 for generating artificial training instances in order to improve classification performance.
Our results show that fine-tuning GPT-2 on a handful of labeled instances leads to consistent classification improvements.
arXiv Detail & Related papers (2021-11-17T12:10:03Z)
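The recipe can be sketched with Hugging Face transformers: fine-tune GPT-2 on "label: text" strings (training loop omitted here), then sample new instances conditioned on a label prefix. The prompt format and decoding settings are assumptions, not the paper's exact setup.

```python
# Sketch of GPT-2 data augmentation: sample synthetic training texts for a
# given class label. "gpt2" is the base checkpoint; in practice you would
# load your fine-tuned weights instead.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

def synthesize(label: str, n: int = 5) -> list[str]:
    inputs = tok(f"{label}: ", return_tensors="pt")
    out = model.generate(**inputs, do_sample=True, top_p=0.9,
                         max_new_tokens=40, num_return_sequences=n,
                         pad_token_id=tok.eos_token_id)
    return [tok.decode(o, skip_special_tokens=True) for o in out]
```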
- RnG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering [57.94658176442027]
We present RnG-KBQA, a Rank-and-Generate approach for KBQA.
We achieve new state-of-the-art results on GrailQA and WebQSP datasets.
arXiv Detail & Related papers (2021-09-17T17:58:28Z)
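A rank-and-generate pipeline of this shape can be sketched in two stages: score enumerated candidate logical forms against the question, then let a generator condition on the question plus the top-ranked candidates. Both callables are placeholders for the paper's trained ranker and generator.

```python
# Sketch of rank-and-generate for KBQA: rank candidate logical forms, then
# generate the final form conditioned on the question and top candidates.
from typing import Callable, List

def rank_and_generate(question: str,
                      candidates: List[str],
                      score: Callable[[str, str], float],     # ranker
                      generate: Callable[[str], str],         # seq2seq generator
                      top_k: int = 5) -> str:
    ranked = sorted(candidates, key=lambda lf: score(question, lf), reverse=True)
    prompt = question + "\n" + "\n".join(ranked[:top_k])
    return generate(prompt)
```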
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of the data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, iteratively refining the data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
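The harvest-then-refine idea can be sketched as a simple loop: train a QA model on the current pairs, overwrite answers with the model's (often more appropriate) span predictions, and retrain. The `train` callable and the two refinement rounds are assumptions, not the paper's exact procedure.

```python
# Sketch of iterative RefQA-style refinement: each round fits a QA model on the
# current pairs and uses its predictions to replace the stored answers.
from typing import Callable, List, Tuple

Pair = Tuple[str, str, str]  # (context, question, answer)

def refine(pairs: List[Pair],
           train: Callable[[List[Pair]], Callable[[str, str], str]],
           rounds: int = 2) -> List[Pair]:
    for _ in range(rounds):
        predict = train(pairs)                                # fit QA model
        pairs = [(c, q, predict(c, q)) for c, q, _ in pairs]  # refined answers
    return pairs
```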