Enhancing textual textbook question answering with large language models and retrieval augmented generation
- URL: http://arxiv.org/abs/2402.05128v3
- Date: Wed, 22 Jan 2025 07:14:27 GMT
- Title: Enhancing textual textbook question answering with large language models and retrieval augmented generation
- Authors: Hessa Abdulrahman Alawwad, Areej Alhothali, Usman Naseem, Ali Alkhathlan, Amani Jamal
- Abstract summary: We propose a framework (PLRTQA) that incorporates the retrieval augmented generation (RAG) technique to handle the out-of-domain scenario.
Our architecture outperforms the baseline, achieving an accuracy improvement of 4.12% in the validation set and 9.84% in the test set for textual multiple-choice questions.
- Score: 3.6799953119508735
- Abstract: Textbook question answering (TQA) is a challenging task in artificial intelligence due to the complex nature of the context needed to answer complex questions. Although previous research has improved the task, there are still some limitations in textual TQA, including weak reasoning and an inability to capture contextual information in lengthy contexts. We propose a framework (PLRTQA) that incorporates the retrieval augmented generation (RAG) technique to handle the out-of-domain scenario where concepts are spread across different lessons, and utilizes transfer learning to handle the long context and enhance reasoning abilities. Our architecture outperforms the baseline, achieving an accuracy improvement of 4.12% in the validation set and 9.84% in the test set for textual multiple-choice questions. While this paper focuses on solving challenges in textual TQA, it provides a foundation for future work in multimodal TQA where the visual components are integrated to address more complex educational scenarios. Code: https://github.com/hessaAlawwad/PLR-TQA
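To illustrate the retrieval step the abstract describes (finding relevant context when concepts are spread across different lessons), here is a minimal sketch. It is not the PLRTQA implementation: the term-frequency cosine retriever, the sample passages, and the names `retrieve`, `build_prompt`, and `PASSAGES` are all assumptions made for illustration, and the final call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical passages from three different lessons (illustrative data only).
PASSAGES = [
    {"lesson": "Weather", "text": "Clouds form when water vapor condenses into tiny droplets."},
    {"lesson": "Water cycle", "text": "Evaporation turns liquid water into water vapor."},
    {"lesson": "Rocks", "text": "Igneous rocks form when magma cools and hardens."},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question, regardless of lesson."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p["text"]))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, options, retrieved):
    """Assemble a multiple-choice prompt from the retrieved context."""
    context = "\n".join(f"[{p['lesson']}] {p['text']}" for p in retrieved)
    choices = "\n".join(f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(options))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n{choices}\n"
        "Answer with a single letter."
    )

question = "What process turns liquid water into vapor?"
options = ["Condensation", "Evaporation", "Erosion"]
top = retrieve(question, PASSAGES, k=2)
prompt = build_prompt(question, options, top)
print(prompt)
```

The prompt would then be passed to a language model; a practical system would likely replace the toy term-frequency retriever with a learned embedding model.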
Related papers
- Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues [8.797350517975477]
Video text-based visual question answering (Video TextVQA) is a practical task that aims to answer questions by jointly reasoning over textual and visual information in a given video.
We propose the TEA (short for "Track thE Answer") method that better extends the generative TextVQA framework from image to video.
arXiv Detail & Related papers (2024-12-17T03:06:12Z)
- Enhanced Textual Feature Extraction for Visual Question Answering: A Simple Convolutional Approach [2.744781070632757]
We compare models that leverage long-range dependencies and simpler models focusing on local textual features within a well-established VQA framework.
We propose ConvGRU, a model that incorporates convolutional layers to improve text feature representation without substantially increasing model complexity.
Tested on the VQA-v2 dataset, ConvGRU demonstrates a modest yet consistent improvement over baselines for question types such as Number and Count.
arXiv Detail & Related papers (2024-05-01T12:39:35Z)
- Harnessing the Power of Prompt-based Techniques for Generating School-Level Questions using Large Language Models [0.5459032912385802]
We propose a novel approach that utilizes prompt-based techniques to generate descriptive and reasoning-based questions.
We curate a new QG dataset called EduProbe for school-level subjects, by leveraging the rich content of NCERT textbooks.
We investigate several prompt-based QG methods by fine-tuning transformer-based large language models.
arXiv Detail & Related papers (2023-12-02T05:13:28Z)
- SEMQA: Semi-Extractive Multi-Source Question Answering [94.04430035121136]
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
arXiv Detail & Related papers (2023-11-08T18:46:32Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation [55.83319599681002]
Text-VQA aims at answering questions that require understanding the textual cues in an image.
We develop a new method to generate high-quality and diverse QA pairs by explicitly utilizing the existing rich text available in the scene context of each image.
arXiv Detail & Related papers (2022-08-03T02:18:09Z)
- Towards Complex Document Understanding By Discrete Reasoning [77.91722463958743]
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language.
We introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages and 16,558 question-answer pairs.
We develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions.
arXiv Detail & Related papers (2022-07-25T01:43:19Z)
- Modern Question Answering Datasets and Benchmarks: A Survey [5.026863544662493]
Question Answering (QA) is one of the most important natural language processing (NLP) tasks.
It aims to use NLP technologies to generate an answer to a given question based on a massive unstructured corpus.
In this paper, we investigate influential QA datasets that have been released in the era of deep learning.
arXiv Detail & Related papers (2022-06-30T05:53:56Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top placements, where they are selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find the true answer without the golden passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- TSQA: Tabular Scenario Based Question Answering [14.92495213480887]
Scenario-based question answering (SQA) has attracted increasing research interest.
To support the study of this task, we construct GeoTSQA.
We extend state-of-the-art MRC methods with TTGen, a novel table-to-text generator.
arXiv Detail & Related papers (2021-01-14T02:00:33Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.