DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search
- URL: http://arxiv.org/abs/2410.14609v1
- Date: Fri, 18 Oct 2024 17:03:17 GMT
- Title: DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search
- Authors: Simon Lupart, Mohammad Aliannejadi, Evangelos Kanoulas
- Abstract summary: Conversational Search (CS) is the task of retrieving relevant documents from a corpus within a conversational context.
To avoid the inference-time cost of LLM query rewriting, current methods distill embeddings from human-rewritten queries to learn the context modeling task.
We propose a new distillation method, a relaxation of this objective, that unifies retrieval and context modeling.
- Abstract: Conversational Search (CS) is the task of retrieving relevant documents from a corpus within a conversational context, combining retrieval with conversational context modeling. With the explosion of Large Language Models (LLMs), the CS field has seen major improvements with LLMs rewriting user queries, accounting for conversational context. However, engaging LLMs at inference time harms efficiency. Current methods address this by distilling embeddings from human-rewritten queries to learn the context modeling task. Yet, these approaches predominantly focus on context modeling, and treat the contrastive component of the retrieval task only within a distillation-independent loss term. To address these limitations, we propose a new distillation method, as a relaxation of the previous objective, that unifies retrieval and context modeling. We relax the existing training objectives by distilling similarity scores between conversations and documents, rather than relying solely on representation learning. Our proposed distillation objective allows for more freedom in the representation space and leverages the contrastive nature of document relevance. Through experiments on Learned Sparse Retrieval (LSR) across 5 CS datasets, our approach demonstrates substantial improvements in both in-domain and out-of-domain retrieval performance, outperforming the state of the art with gains of up to 6 points in recall on out-of-domain datasets. Additionally, through the relaxation of the objective, we propose a multi-teacher distillation that uses multiple LLMs as teachers, yielding additional gains and outperforming the teachers themselves in in-domain experiments. Finally, an analysis of model sparsity reveals that our distillation allows for better control over the sparsity of the trained models.
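The abstract does not spell out the loss, but the core idea of distilling conversation-document similarity scores from a rewritten-query teacher, rather than matching embeddings directly, can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function, the tensor names, and the choice of a KL divergence over in-batch documents are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def score_distillation_loss(conv_reps, doc_reps, teacher_query_reps):
    """Hypothetical sketch of a score-level distillation objective.

    Instead of forcing the student's conversation embedding to match the
    teacher's rewritten-query embedding, we match the similarity scores
    each side produces against the same in-batch documents.

    conv_reps:          (B, D) student encodings of full conversations
    doc_reps:           (B, D) encodings of candidate documents
    teacher_query_reps: (B, D) frozen encodings of rewritten queries (teacher)
    """
    # Dot-product similarity matrices over the in-batch documents.
    student_scores = conv_reps @ doc_reps.T               # (B, B)
    with torch.no_grad():
        teacher_scores = teacher_query_reps @ doc_reps.T  # (B, B)

    # Distill the score distribution (KL over candidates) rather than the
    # embeddings, leaving the student free in representation space.
    return F.kl_div(
        F.log_softmax(student_scores, dim=-1),
        F.softmax(teacher_scores, dim=-1),
        reduction="batchmean",
    )
```

Under the same relaxation, a multi-teacher variant could, for instance, pool the teacher score matrices produced by several LLM rewriters before applying the same divergence.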
Related papers
- Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors [74.04775677110179]
In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs).
In this work, we examine whether this is the result of the aggregation used in corresponding datasets, where trying to combine low-agreement, disparate annotations might lead to annotation artifacts that create detrimental noise in the prompt.
Our results indicate that aggregation is a confounding factor in the modeling of subjective tasks, and advocate focusing on modeling individuals instead.
arXiv Detail & Related papers (2024-10-17T17:16:00Z) - Unleashing the Power of Large Language Models in Zero-shot Relation Extraction via Self-Prompting [21.04933334040135]
We introduce the Self-Prompting framework, a novel method designed to fully harness the relation extraction (RE) knowledge embedded within Large Language Models.
Our framework employs a three-stage diversity approach to prompt LLMs, generating multiple synthetic samples that encapsulate specific relations from scratch.
Experimental evaluations on benchmark datasets show our approach outperforms existing LLM-based zero-shot RE methods.
arXiv Detail & Related papers (2024-10-02T01:12:54Z) - Relation Extraction with Fine-Tuned Large Language Models in Retrieval Augmented Generation Frameworks [0.0]
Relation Extraction (RE) is crucial for converting unstructured data into structured formats like Knowledge Graphs (KGs).
Recent studies leveraging pre-trained language models (PLMs) have shown significant success in this area.
This work explores the performance of fine-tuned LLMs and their integration into Retrieval Augmented Generation (RAG)-based RE approaches.
arXiv Detail & Related papers (2024-06-20T21:27:57Z) - Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor [4.35807211471107]
This work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models.
The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks.
arXiv Detail & Related papers (2024-06-04T12:43:23Z) - Text-Video Retrieval with Global-Local Semantic Consistent Learning [122.15339128463715]
We propose a simple yet effective method, Global-Local Semantic Consistent Learning (GLSCL).
GLSCL capitalizes on latent shared semantics across modalities for text-video retrieval.
Our method achieves performance comparable to the SOTA while being nearly 220 times faster in terms of computational cost.
arXiv Detail & Related papers (2024-05-21T11:59:36Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - Revisiting Large Language Models as Zero-shot Relation Extractors [8.953462875381888]
Relation extraction (RE) consistently involves a certain degree of labeled or unlabeled data, even under the zero-shot setting.
Recent studies have shown that large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt.
This work focuses on exploring LLMs as zero-shot relation extractors.
arXiv Detail & Related papers (2023-10-08T06:17:39Z) - Making Retrieval-Augmented Language Models Robust to Irrelevant Context [55.564789967211844]
An important desideratum of retrieval-augmented language models (RALMs) is that retrieved information helps model performance when it is relevant.
Recent work has shown that retrieval augmentation can sometimes have a negative effect on performance.
arXiv Detail & Related papers (2023-10-02T18:52:35Z) - RAVEN: In-Context Learning with Retrieval-Augmented Encoder-Decoder Language Models [57.12888828853409]
RAVEN is a model that combines retrieval-augmented masked language modeling and prefix language modeling.
Fusion-in-Context Learning enables the model to leverage more in-context examples without requiring additional training.
Our work underscores the potential of retrieval-augmented encoder-decoder language models for in-context learning.
arXiv Detail & Related papers (2023-08-15T17:59:18Z) - Continual Contrastive Finetuning Improves Low-Resource Relation Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
arXiv Detail & Related papers (2022-12-21T07:30:22Z) - Learning from Context or Names? An Empirical Study on Neural Relation Extraction [112.06614505580501]
We study the effect of two main information sources in text: textual context and entity mentions (names).
We propose an entity-masked contrastive pre-training framework for relation extraction (RE).
Our framework can improve the effectiveness and robustness of neural models in different RE scenarios.
arXiv Detail & Related papers (2020-10-05T11:21:59Z)