Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications
- URL: http://arxiv.org/abs/2410.19790v1
- Date: Tue, 15 Oct 2024 16:37:18 GMT
- Title: Telco-DPR: A Hybrid Dataset for Evaluating Retrieval Models of 3GPP Technical Specifications
- Authors: Thaina Saraiva, Marco Sousa, Pedro Vieira, António Rodrigues,
- Abstract summary: This paper proposes a Question-Answering (QA) system for the telecom domain using 3rd Generation Partnership Project technical documents.
A hybrid dataset, Telco-DPR, is presented, combining text and tables, and includes a set of synthetic question/answer pairs.
The retrieval models are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR)
The proposed QA system, using the developed RAG model and the Generative Pretrained Transformer (GPT)-4, achieves a 14% improvement in answer accuracy.
- Score: 0.8999666725996975
- License:
- Abstract: This paper proposes a Question-Answering (QA) system for the telecom domain using 3rd Generation Partnership Project (3GPP) technical documents. Alongside, a hybrid dataset, Telco-DPR, which consists of a curated 3GPP corpus in a hybrid format, combining text and tables, is presented. Additionally, the dataset includes a set of synthetic question/answer pairs designed to evaluate the retrieval performance of QA systems on this type of data. The retrieval models, including the sparse model, Best Matching 25 (BM25), as well as dense models, such as Dense Passage Retriever (DPR) and Dense Hierarchical Retrieval (DHR), are evaluated and compared using top-K accuracy and Mean Reciprocal Rank (MRR). The results show that DHR, a retriever model utilising hierarchical passage selection through fine-tuning at both the document and passage levels, outperforms traditional methods in retrieving relevant technical information, achieving a Top-10 accuracy of 86.2%. Additionally, the Retriever-Augmented Generation (RAG) technique, used in the proposed QA system, is evaluated to demonstrate the benefits of using the hybrid dataset and the DHR. The proposed QA system, using the developed RAG model and the Generative Pretrained Transformer (GPT)-4, achieves a 14% improvement in answer accuracy, when compared to a previous benchmark on the same dataset.
Related papers
- Revisiting BPR: A Replicability Study of a Common Recommender System Baseline [78.00363373925758]
We study the features of the BPR model, indicating their impact on its performance, and investigate open-source BPR implementations.
Our analysis reveals inconsistencies between these implementations and the original BPR paper, leading to a significant decrease in performance of up to 50% for specific implementations.
We show that the BPR model can achieve performance levels close to state-of-the-art methods on the top-n recommendation tasks and even outperform them on specific datasets.
arXiv Detail & Related papers (2024-09-21T18:39:53Z) - Customized Retrieval Augmented Generation and Benchmarking for EDA Tool Documentation QA [5.0108982850526]
Retrieval augmented generation (RAG) enhances the accuracy and reliability of generative AI models by sourcing factual information from external databases.
This paper proposes a customized RAG framework along with three domain-specific techniques for EDA tool documentation QA.
We have developed and released a documentation QA evaluation benchmark, ORD-QA, for OpenROAD, an advanced RTL-to-GDSII design platform.
arXiv Detail & Related papers (2024-07-22T03:44:27Z) - Evaluating Retrieval Quality in Retrieval-Augmented Generation [21.115495457454365]
Traditional end-to-end evaluation methods are computationally expensive.
We propose eRAG, where each document in the retrieval list is individually utilized by the large language model within the RAG system.
eRAG offers significant computational advantages, improving runtime and consuming up to 50 times less GPU memory than end-to-end evaluation.
arXiv Detail & Related papers (2024-04-21T21:22:28Z) - Blended RAG: Improving RAG (Retriever-Augmented Generation) Accuracy with Semantic Search and Hybrid Query-Based Retrievers [0.0]
Retrieval-Augmented Generation (RAG) is a prevalent approach to infuse a private knowledge base of documents with Large Language Models (LLM) to build Generative Q&A (Question-Answering) systems.
We propose the 'Blended RAG' method of leveraging semantic search techniques, such as Vector indexes and Sparse indexes, blended with hybrid query strategies.
Our study achieves better retrieval results and sets new benchmarks for IR (Information Retrieval) datasets like NQ and TREC-COVID datasets.
arXiv Detail & Related papers (2024-03-22T17:13:46Z) - Is ChatGPT a game changer for geocoding -- a benchmark for geocoding
address parsing techniques [3.759936323189418]
We introduce a benchmark dataset of low-quality address descriptions synthesized based on human input patterns mining from actual input logs of a geocoding system in production.
This dataset has 21 different input errors and variations; contains over 239,000 address records that are uniquely selected from streets across all U.S. 50 states and D.C.
We train and gauge the performance of the GPT-3 model in extracting address components, contrasting its performance with transformer-based and LSTM-based models.
arXiv Detail & Related papers (2023-10-22T17:03:56Z) - Zero-shot Composed Text-Image Retrieval [72.43790281036584]
We consider the problem of composed image retrieval (CIR)
It aims to train a model that can fuse multi-modal information, e.g., text and images, to accurately retrieve images that match the query, extending the user's expression ability.
arXiv Detail & Related papers (2023-06-12T17:56:01Z) - QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker.
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top the BING search engine.
arXiv Detail & Related papers (2023-03-30T00:42:07Z) - Incorporating Relevance Feedback for Information-Seeking Retrieval using
Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z) - Conformer-based Hybrid ASR System for Switchboard Dataset [99.88988282353206]
We present and evaluate a competitive conformer-based hybrid model training recipe.
We study different training aspects and methods to improve word-error-rate as well as to increase training speed.
We conduct experiments on Switchboard 300h dataset and our conformer-based hybrid model achieves competitive results.
arXiv Detail & Related papers (2021-11-05T12:03:18Z) - Learning to Rank Question Answer Pairs with Bilateral Contrastive Data
Augmentation [39.22166065525888]
We propose a novel and easy-to-apply data augmentation strategy, namely Bilateral Generation (BiG)
With the augmented dataset, we design a contrastive training objective for learning to rank question answer pairs.
Experimental results on three benchmark datasets, namely TREC-QA, WikiQA, and ANTIQUE, show that our method significantly improves the performance of ranking models.
arXiv Detail & Related papers (2021-06-21T13:29:43Z) - Hybrid Generative-Retrieval Transformers for Dialogue Domain Adaptation [77.62366712130196]
We present the winning entry at the fast domain adaptation task of DSTC8, a hybrid generative-retrieval model based on GPT-2 fine-tuned to the multi-domain MetaLWOz dataset.
Our model uses retrieval logic as a fallback, being SoTA on MetaLWOz in human evaluation (>4% improvement over the 2nd place system) and attaining competitive generalization performance in adaptation to the unseen MultiWOZ dataset.
arXiv Detail & Related papers (2020-03-03T18:07:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.