Keyword Extraction for Improved Document Retrieval in Conversational
Search
- URL: http://arxiv.org/abs/2109.05979v1
- Date: Mon, 13 Sep 2021 13:55:37 GMT
- Title: Keyword Extraction for Improved Document Retrieval in Conversational
Search
- Authors: Oleg Borisov, Mohammad Aliannejadi, Fabio Crestani
- Abstract summary: Mixed-initiative conversational search provides enormous advantages.
Incorporating additional information provided by the user from the conversation, however, poses some challenges.
We have collected two conversational keyword extraction datasets and propose an end-to-end document retrieval pipeline incorporating them.
- Score: 10.798537120200006
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent research has shown that mixed-initiative conversational search, based
on the interaction between users and computers to clarify and improve a query,
provides enormous advantages. Nonetheless, incorporating additional information
provided by the user from the conversation poses some challenges. In fact,
further interactions could confuse the system as a user might use words
irrelevant to the information need but crucial for correct sentence
construction in the context of multi-turn conversations. To this end, in this
paper, we have collected two conversational keyword extraction datasets and
propose an end-to-end document retrieval pipeline incorporating them.
Furthermore, we study the performance of two neural keyword extraction models,
namely, BERT and sequence-to-sequence, in terms of extraction accuracy and
human annotation. Finally, we study the effect of keyword extraction on the
end-to-end neural IR performance and show that our approach outperforms
state-of-the-art IR models. We make the two datasets publicly available to
foster research in this area.
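As a rough illustration of the keyword-then-retrieve pipeline described in the abstract, the sketch below tags keywords in the latest user utterance with a token-classification model and queries a BM25 index with only those keywords. This is a minimal sketch, not the authors' released code: the checkpoint path is a placeholder for a BERT tagger fine-tuned on the collected datasets, and BM25 stands in for the neural IR stage evaluated in the paper.
```python
# Minimal sketch: conversational keyword extraction feeding document retrieval.
# Assumptions: "path/to/keyword-tagger" is a placeholder for a BERT token-classification
# model fine-tuned to tag keyword tokens; BM25 stands in for the paper's neural retriever.
from transformers import pipeline
from rank_bm25 import BM25Okapi

documents = [
    "Neural ranking models for ad-hoc document retrieval.",
    "Mixed-initiative clarification questions in conversational search.",
]

# Keyword tagger: returns the spans labelled as keywords in the current utterance.
tagger = pipeline("token-classification",
                  model="path/to/keyword-tagger",   # placeholder checkpoint
                  aggregation_strategy="simple")

def extract_keywords(utterance: str) -> list[str]:
    """Keep only the spans the tagger marks as keywords."""
    return [span["word"] for span in tagger(utterance)]

# Index the collection with BM25 (whitespace tokenization for brevity).
bm25 = BM25Okapi([doc.lower().split() for doc in documents])

def retrieve(utterance: str, k: int = 5) -> list[str]:
    """Query the index with extracted keywords instead of the raw utterance."""
    keywords = extract_keywords(utterance) or utterance.split()
    return bm25.get_top_n([w.lower() for w in keywords], documents, n=k)

print(retrieve("Could you maybe tell me more about clarification questions?"))
```
Falling back to the raw utterance when no keyword is tagged keeps the retriever usable on turns where extraction fails.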
Related papers
- Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering [12.60063463163226]
IIER captures the internal connections between document chunks by considering three types of interactions: structural, keyword, and semantic.
It identifies multiple seed nodes based on the target question and iteratively searches for relevant chunks to gather supporting evidence.
It refines the context and reasoning chain, aiding the large language model in reasoning and answer generation.
arXiv Detail & Related papers (2024-08-06T02:39:55Z)
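A minimal, self-contained sketch of IIER's iterative evidence-gathering idea, modelling only the keyword interaction between chunks (the structural and semantic edges and the actual scoring used in the paper are not reproduced here):
```python
# Simplified sketch of graph-based evidence gathering over document chunks.
# Only the keyword interaction is modelled; IIER additionally uses structural
# and semantic edges, which are omitted for brevity.
from collections import defaultdict

chunks = {
    0: "The encoder maps each utterance to a dense vector.",
    1: "Dense vectors are scored against the document index.",
    2: "The index is rebuilt nightly from the document store.",
}

def keywords(text: str) -> set[str]:
    stop = {"the", "a", "is", "are", "to", "each", "from", "against", "by", "how"}
    return {w.strip(".?,").lower() for w in text.split()} - stop

# Keyword edges: connect chunks that share at least one keyword.
graph = defaultdict(set)
for i in chunks:
    for j in chunks:
        if i != j and keywords(chunks[i]) & keywords(chunks[j]):
            graph[i].add(j)

def gather_evidence(question: str, hops: int = 2) -> list[int]:
    """Seed with chunks overlapping the question, then expand along edges."""
    q_kw = keywords(question)
    frontier = {i for i, c in chunks.items() if keywords(c) & q_kw}
    evidence = set(frontier)
    for _ in range(hops):
        frontier = {n for i in frontier for n in graph[i]} - evidence
        evidence |= frontier
    return sorted(evidence)

print(gather_evidence("How are dense vectors used by the index?"))
```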
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLMs) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
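A toy sketch of this retrieve-by-summary idea: summarize() and embed() below are stand-ins for the LLM-based summarizer and the trained encoder assumed in the paper, and the point is only to show maximum inner product search over summary keys.
```python
# Sketch of conversation retrieval via maximum inner product search (MIPS)
# over summary embeddings. summarize() and embed() are placeholders for an
# LLM-based summarizer and a trained text encoder.
import numpy as np

def summarize(dialogue: str) -> str:
    # Placeholder: an LLM-based summarizer would go here.
    return dialogue.split("\n")[-1]

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Placeholder hashing embedding, just to keep the example self-contained.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / (np.linalg.norm(vec) + 1e-9)

# Index: keys are embeddings of summaries of annotated dialogues.
corpus = [
    "user: I want to book a cheap hotel\nsystem: which area?",
    "user: find me an italian restaurant\nsystem: any price range?",
]
keys = np.stack([embed(summarize(d)) for d in corpus])

def retrieve(query_dialogue: str, k: int = 1) -> list[str]:
    """Score every key by inner product with the query embedding."""
    q = embed(summarize(query_dialogue))
    top = np.argsort(-keys @ q)[:k]
    return [corpus[i] for i in top]

print(retrieve("user: looking for a budget hotel near the centre"))
```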
- A Deep Reinforcement Learning Approach for Interactive Search with Sentence-level Feedback [12.712416630402119]
Interactive search can provide a better experience by incorporating interaction feedback from the users.
Existing state-of-the-art (SOTA) systems use reinforcement learning (RL) models to incorporate the interactions.
Yet such feedback requires extensive RL action space exploration and large amounts of annotated data.
This work proposes a new deep Q-learning (DQ) approach, DQrank.
arXiv Detail & Related papers (2023-10-03T18:45:21Z)
- SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose SSP, a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model.
To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained with SSP to the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z)
- Improve Retrieval-based Dialogue System via Syntax-Informed Attention [46.79601705850277]
We propose SIA, Syntax-Informed Attention, considering both intra- and inter-sentence syntax information.
We evaluate our method on three widely used benchmarks and experimental results demonstrate the general superiority of our method on dialogue response selection.
arXiv Detail & Related papers (2023-03-12T08:14:16Z)
- Towards Relation Extraction From Speech [56.36416922396724]
We propose a new listening information extraction task, i.e., speech relation extraction.
We construct the training dataset for speech relation extraction via text-to-speech systems, and we construct the testing dataset via crowd-sourcing with native English speakers.
We conduct comprehensive experiments to distinguish the challenges in speech relation extraction, which may shed light on future explorations.
arXiv Detail & Related papers (2022-10-17T05:53:49Z)
- Improving Keyphrase Extraction with Data Augmentation and Information Filtering [67.43025048639333]
Keyphrase extraction is one of the essential tasks for document understanding in NLP.
We present a novel corpus and method for keyphrase extraction from the videos streamed on the Behance platform.
arXiv Detail & Related papers (2022-09-11T22:38:02Z)
- Unsupervised Keyphrase Extraction via Interpretable Neural Networks [27.774524511005172]
Keyphrases that are most useful for predicting the topic of a text are considered the most important ones.
INSPECT is a self-explaining neural framework for identifying influential keyphrases.
We show that INSPECT achieves state-of-the-art results in unsupervised keyphrase extraction across four diverse datasets.
arXiv Detail & Related papers (2022-03-15T04:30:47Z)
- Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-to-sequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
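A small sketch of the frequency-based term importance idea from this entry, using a plain term-frequency heuristic over the conversation context; the authors' actual scoring and the neural rewriting step are not reproduced here.
```python
# Sketch of frequency-based term importance for conversational query expansion,
# in the spirit of the entry above (not the authors' exact scoring).
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "it", "what", "which", "about",
             "of", "and", "to", "me", "by", "were", "tell"}

def important_terms(context_turns: list[str], top_k: int = 3) -> list[str]:
    """Rank context terms by frequency after stopword removal."""
    counts = Counter(
        w.strip("?,.").lower()
        for turn in context_turns
        for w in turn.split()
        if w.strip("?,.").lower() not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(top_k)]

def expand_query(current_query: str, context_turns: list[str]) -> str:
    """Append important context terms that are missing from the current query."""
    present = {w.lower() for w in current_query.split()}
    extra = [t for t in important_terms(context_turns) if t not in present]
    return current_query + " " + " ".join(extra)

context = [
    "Tell me about the Bronze Age collapse.",
    "Which civilizations were affected by the collapse?",
]
print(expand_query("What caused it?", context))
```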