AQUALLM: Audio Question Answering Data Generation Using Large Language
Models
- URL: http://arxiv.org/abs/2312.17343v1
- Date: Thu, 28 Dec 2023 20:01:27 GMT
- Title: AQUALLM: Audio Question Answering Data Generation Using Large Language
Models
- Authors: Swarup Ranjan Behera, Krishna Mohan Injeti, Jaya Sai Kiran Patibandla,
Praveen Kumar Pokala, and Balakrishna Reddy Pailla
- Abstract summary: We introduce a scalable AQA data generation pipeline, which relies on Large Language Models (LLMs).
We present three extensive and high-quality benchmark datasets for AQA.
Models trained on our datasets demonstrate enhanced generalizability when compared to models trained using human-annotated AQA data.
- Score: 2.2232550112727267
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Audio Question Answering (AQA) constitutes a pivotal task in which machines
analyze both audio signals and natural language questions to produce precise
natural language answers. The significance of possessing high-quality, diverse,
and extensive AQA datasets cannot be overstated when aiming for the precision
of an AQA system. While there has been notable focus on developing accurate and
efficient AQA models, the creation of high-quality, diverse, and extensive
datasets for the specific task at hand has not garnered considerable attention.
To address this challenge, this work makes several contributions. We introduce
a scalable AQA data generation pipeline, denoted as the AQUALLM framework,
which relies on Large Language Models (LLMs). This framework utilizes existing
audio-caption annotations and incorporates state-of-the-art LLMs to generate
expansive, high-quality AQA datasets. Additionally, we present three extensive
and high-quality benchmark datasets for AQA, contributing significantly to the
progression of AQA research. AQA models trained on the proposed datasets set
superior benchmarks compared to the existing state-of-the-art. Moreover, models
trained on our datasets demonstrate enhanced generalizability when compared to
models trained using human-annotated AQA data. Code and datasets will be
accessible on GitHub (https://github.com/swarupbehera/AQUALLM).
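As a rough illustration of the idea in the abstract (existing audio captions are passed to an LLM, which produces question-answer pairs), the following minimal sketch builds a prompt from a caption and parses the reply. It is not the authors' implementation: the prompt wording, the `Q:`/`A:` reply format, and the names `AQAExample`, `build_prompt`, `parse_qa_pairs`, `generate_aqa_examples`, and `fake_llm` are all assumptions made for illustration, and the stub stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class AQAExample:
    """One generated item: an audio clip id with a question and its answer."""
    audio_id: str
    question: str
    answer: str


def build_prompt(caption: str, num_pairs: int = 2) -> str:
    """Hypothetical prompt asking for Q/A pairs grounded in the caption."""
    return (
        f"Audio caption: {caption}\n"
        f"Write {num_pairs} question-answer pairs about the audio.\n"
        "Use one 'Q: ...' line followed by one 'A: ...' line per pair."
    )


def parse_qa_pairs(reply: str) -> List[Tuple[str, str]]:
    """Pair each 'Q:' line with the next 'A:' line; ignore anything else."""
    pairs: List[Tuple[str, str]] = []
    question = None
    for line in reply.splitlines():
        line = line.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question:
            pairs.append((question, line[2:].strip()))
            question = None
    return pairs


def generate_aqa_examples(
    audio_id: str, caption: str, llm: Callable[[str], str]
) -> List[AQAExample]:
    """Send the caption prompt to `llm` and wrap the parsed pairs."""
    reply = llm(build_prompt(caption))
    return [AQAExample(audio_id, q, a) for q, a in parse_qa_pairs(reply)]


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real LLM call (assumption, for demonstration)."""
    return "Q: What animal is heard?\nA: a dog\nQ: Is it raining?\nA: yes"


if __name__ == "__main__":
    caption = "A dog barks while rain falls."
    for example in generate_aqa_examples("clip_001", caption, fake_llm):
        print(example)
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular model or client library; a real pipeline would also need filtering of low-quality or ungrounded pairs, which is not shown here.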
Related papers
- ATTIQA: Generalizable Image Quality Feature Extractor using Attribute-aware Pretraining [25.680035174334886]
In no-reference image quality assessment (NR-IQA), the challenge of limited dataset sizes hampers the development of robust and generalizable models.
We propose a novel pretraining framework that constructs a generalizable representation for IQA by selectively extracting quality-related knowledge.
Our approach achieves state-of-the-art performance on multiple IQA datasets and exhibits remarkable generalization capabilities.
arXiv Detail & Related papers (2024-06-03T06:03:57Z)
- Automatic Question-Answer Generation for Long-Tail Knowledge [65.11554185687258]
We propose an automatic approach to generate specialized QA datasets for tail entities.
We conduct extensive experiments by employing pretrained LLMs on our newly generated long-tail QA datasets.
arXiv Detail & Related papers (2024-03-03T03:06:31Z)
- QASnowball: An Iterative Bootstrapping Framework for High-Quality Question-Answering Data Generation [67.27999343730224]
We introduce an iterative bootstrapping framework for QA data augmentation (named QASnowball).
QASnowball can iteratively generate large-scale high-quality QA data based on a seed set of supervised examples.
We conduct experiments in the high-resource English scenario and the medium-resource Chinese scenario, and the experimental results show that the data generated by QASnowball can facilitate QA models.
arXiv Detail & Related papers (2023-09-19T05:20:36Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale [53.92008514395125]
PAXQA (Projecting annotations for cross-lingual (x) QA) decomposes cross-lingual QA into two stages.
We propose a novel use of lexically-constrained machine translation, in which constrained entities are extracted from the parallel bitexts.
We show that models fine-tuned on these datasets outperform prior synthetic data generation models over several extractive QA datasets.
arXiv Detail & Related papers (2023-04-24T15:46:26Z)
- Pre-trained Transformer-Based Approach for Arabic Question Answering: A Comparative Study [0.5801044612920815]
We evaluate state-of-the-art pre-trained transformer models for Arabic QA using four reading comprehension datasets.
We fine-tuned and compared the performance of the AraBERTv2-base model, AraBERTv0.2-large model, and AraELECTRA model.
arXiv Detail & Related papers (2021-11-10T12:33:18Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
- Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template on a related, retrieved sentence rather than the original context sentence improves downstream QA performance.
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.