LocalRQA: From Generating Data to Locally Training, Testing, and
Deploying Retrieval-Augmented QA Systems
- URL: http://arxiv.org/abs/2403.00982v1
- Date: Fri, 1 Mar 2024 21:10:20 GMT
- Title: LocalRQA: From Generating Data to Locally Training, Testing, and
Deploying Retrieval-Augmented QA Systems
- Authors: Xiao Yu, Yunan Lu, Zhou Yu
- Abstract summary: LocalRQA is an open-source toolkit that lets researchers and developers customize the model training, testing, and deployment process.
We build systems using online documentation obtained from Databricks and Faire's websites.
We find 7B-models trained and deployed using LocalRQA reach a similar performance compared to using OpenAI's text-ada-002 and GPT-4-turbo.
- Score: 22.90963783300522
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-augmented question-answering systems combine retrieval techniques
with large language models to provide answers that are more accurate and
informative. Many existing toolkits allow users to quickly build such systems
using off-the-shelf models, but they fall short in supporting researchers and
developers to customize the model training, testing, and deployment process. We
propose LocalRQA, an open-source toolkit that features a wide selection of
model training algorithms, evaluation methods, and deployment tools curated
from the latest research. As a showcase, we build QA systems using online
documentation obtained from Databricks and Faire's websites. We find 7B-models
trained and deployed using LocalRQA reach a similar performance compared to
using OpenAI's text-ada-002 and GPT-4-turbo.
Related papers
- InspectorRAGet: An Introspection Platform for RAG Evaluation [14.066727601732625]
InspectorRAGet is an introspection platform for RAG evaluation.
It allows the user to analyze aggregate and instance-level performance of RAG systems.
arXiv Detail & Related papers (2024-04-26T11:51:53Z)
- Towards MLOps: A DevOps Tools Recommender System for Machine Learning System [1.065497990128313]
Unlike traditional systems, which evolve with changing requirements, MLOps and machine learning systems evolve with new data.
In this paper, we present a framework for a recommender system that processes contextual information.
Four different approaches, i.e., rule-based, random forest, decision trees, and k-nearest neighbors, were investigated.
arXiv Detail & Related papers (2024-02-20T09:57:49Z)
- A Practical Toolkit for Multilingual Question and Answer Generation [79.31199020420827]
We introduce AutoQG, an online service for multilingual QAG, along with lmqg, an all-in-one Python package for model fine-tuning, generation, and evaluation.
We also release QAG models in eight languages fine-tuned on a few variants of pre-trained encoder-decoder language models, which can be used online via AutoQG or locally via lmqg.
arXiv Detail & Related papers (2023-05-27T08:42:37Z)
- PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development [24.022050096797606]
PRIMEQA is a one-stop QA repository that aims to democratize QA research and facilitate easy replication of state-of-the-art (SOTA) QA methods.
It supports core QA functionalities like retrieval and reading comprehension as well as auxiliary capabilities such as question generation.
It has been designed as an end-to-end toolkit for various use cases: building front-end applications, replicating SOTA methods on public benchmarks, and expanding pre-existing methods.
arXiv Detail & Related papers (2023-01-23T20:43:26Z)
- ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format [88.33443450434521]
Task-oriented dialogue (TOD) systems function as digital assistants, guiding users through various tasks such as booking flights or finding restaurants.
Existing toolkits for building TOD systems often fall short in delivering comprehensive arrays of data, models, and experimental environments.
We introduce ConvLab-3: a multifaceted dialogue system toolkit crafted to bridge this gap.
arXiv Detail & Related papers (2022-11-30T16:37:42Z)
- KGI: An Integrated Framework for Knowledge Intensive Language Tasks [16.511913995069097]
In this paper, we propose a system based on an enhanced version of this approach for other knowledge intensive language tasks.
Our system achieves results comparable to the best models in the KILT leaderboards.
arXiv Detail & Related papers (2022-04-08T10:36:21Z)
- UKP-SQUARE: An Online Platform for Question Answering Research [50.35348764297317]
We present UKP-SQUARE, an online QA platform for researchers.
It allows users to query and analyze a large collection of modern Skills via a user-friendly web interface and integrated tests.
arXiv Detail & Related papers (2022-03-25T15:00:24Z)
- A Coarse to Fine Question Answering System based on Reinforcement Learning [48.80863342506432]
The system is designed using an actor-critic based deep reinforcement learning model to achieve multi-step question answering.
We test our model on four QA datasets, WIKIREADING, WIKIREADING LONG, CNN and SQuAD, and demonstrate 1.3%-1.7% accuracy improvements with 1.5x-3.4x training speed-ups.
arXiv Detail & Related papers (2021-06-01T06:41:48Z)
- Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering [62.88322725956294]
We review the latest research trends in OpenQA, with particular attention to systems that incorporate neural MRC techniques.
We introduce a modern OpenQA architecture named "Retriever-Reader" and analyze the various systems that follow this architecture.
We then discuss key challenges to developing OpenQA systems and offer an analysis of benchmarks that are commonly used.
arXiv Detail & Related papers (2021-01-04T04:47:46Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.