Response Enhanced Semi-supervised Dialogue Query Generation
- URL: http://arxiv.org/abs/2312.12713v2
- Date: Fri, 16 Feb 2024 02:19:39 GMT
- Title: Response Enhanced Semi-supervised Dialogue Query Generation
- Authors: Jianheng Huang, Ante Wang, Linfeng Gao, Linfeng Song, Jinsong Su
- Abstract summary: We propose a semi-supervised learning framework -- SemiDQG -- to improve model performance with unlabeled conversations.
We first apply a similarity-based query selection strategy to select high-quality RA-generated pseudo queries.
We adopt the REINFORCE algorithm to further enhance QP, with RA-provided rewards as fine-grained training signals.
- Score: 40.17161986495854
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging vast and continually updated knowledge from the Internet has been
considered an important ability for a dialogue system. Therefore, the dialogue
query generation task is proposed for generating search queries from dialogue
histories, which will be submitted to a search engine for retrieving relevant
websites on the Internet. In this regard, previous efforts were devoted to
collecting conversations with annotated queries and training a query producer
(QP) via standard supervised learning. However, these studies still face the
challenges of data scarcity and domain adaptation. To address these issues, in
this paper, we propose a semi-supervised learning framework, SemiDQG, to
improve model performance with unlabeled conversations. Based on the
observation that the search query is typically related to the topic of dialogue
response, we train a response-augmented query producer (RA) to provide rich and
effective training signals for QP. We first apply a similarity-based query
selection strategy to select high-quality RA-generated pseudo queries, which
are used to construct pseudo instances for training QP and RA. Then, we adopt
the REINFORCE algorithm to further enhance QP, with RA-provided rewards as
fine-grained training signals. Experimental results and in-depth analysis of
three benchmarks show the effectiveness of our framework in cross-domain and
low-resource scenarios. Particularly, SemiDQG significantly surpasses ChatGPT
and competitive baselines. Our code is available at
https://github.com/DeepLearnXMU/SemiDQG.
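The two training stages named in the abstract (similarity-based selection of RA-generated pseudo queries, then REINFORCE with RA-provided rewards) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the selection rule, or the reward definition, so everything below is an assumption for illustration: unigram F1 as the similarity, agreement between the RA query and the QP query for the same dialogue as the selection criterion, a fixed threshold, and a mean-reward baseline. The function names are hypothetical and are not taken from the authors' repository.

```python
from collections import Counter


def unigram_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two queries (assumed similarity measure)."""
    tokens_a, tokens_b = a.lower().split(), b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    # Multiset intersection counts each shared token at most min(count_a, count_b) times.
    overlap = sum((Counter(tokens_a) & Counter(tokens_b)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(tokens_a)
    recall = overlap / len(tokens_b)
    return 2 * precision * recall / (precision + recall)


def select_pseudo_queries(ra_queries, qp_queries, threshold=0.5):
    """Keep (index, RA query) pairs whose similarity to the QP query
    for the same dialogue reaches the threshold (assumed selection rule)."""
    return [
        (i, ra)
        for i, (ra, qp) in enumerate(zip(ra_queries, qp_queries))
        if unigram_f1(ra, qp) >= threshold
    ]


def reinforce_loss(log_probs, rewards):
    """REINFORCE surrogate loss with a mean-reward baseline.

    log_probs: log-probability of each sampled query under QP.
    rewards:   scalar reward per sampled query (here assumed to come from RA).
    """
    baseline = sum(rewards) / len(rewards)
    return -sum((r - baseline) * lp for lp, r in zip(log_probs, rewards)) / len(rewards)


if __name__ == "__main__":
    ra = ["new york weather", "cat food"]
    qp = ["weather new york", "dog toys"]
    print(select_pseudo_queries(ra, qp))
    print(reinforce_loss([-1.0, -2.0], [1.0, 0.0]))
```

In a real training setup the log-probabilities would come from the QP model and the loss would be minimized by gradient descent; the plain-float version here only shows how the reward weights each sampled query.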
Related papers
- Mitigating the Negative Impact of Over-association for Conversational Query Production [44.661864532728615]
Conversational query generation aims at producing search queries from dialogue histories, which are then used to retrieve relevant knowledge from a search engine.
Previous models suffer from the data hunger issue, and they tend to both drop important concepts from dialogue histories and generate irrelevant concepts at inference time.
We propose effective instance-level weighting strategies for training to mitigate these issues from multiple perspectives.
arXiv Detail & Related papers (2024-09-29T06:19:59Z)
- Selecting Query-bag as Pseudo Relevance Feedback for Information-seeking Conversations [76.70349332096693]
Information-seeking dialogue systems are widely used in e-commerce systems.
We propose a Query-bag based Pseudo Relevance Feedback framework (QB-PRF).
It constructs a query-bag with related queries to serve as pseudo signals to guide information-seeking conversations.
arXiv Detail & Related papers (2024-03-22T08:10:32Z)
- Social Commonsense-Guided Search Query Generation for Open-Domain Knowledge-Powered Conversations [66.16863141262506]
We present a novel approach that focuses on generating internet search queries guided by social commonsense.
Our proposed framework addresses passive user interactions by integrating topic tracking, commonsense response generation and instruction-driven query generation.
arXiv Detail & Related papers (2023-10-22T16:14:56Z)
- Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System [40.33178881317882]
We propose the application of maximal marginal likelihood to train a perceptive retriever by utilizing signals from response generation for supervision.
We evaluate our approach on three task-oriented dialogue datasets using T5 and ChatGPT as the backbone models.
arXiv Detail & Related papers (2023-10-13T06:03:47Z)
- PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
- CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning [16.470428531658232]
We develop a query rewriting model CONQRR that rewrites a conversational question in context into a standalone question.
We show that CONQRR achieves state-of-the-art results on a recent open-domain CQA dataset.
arXiv Detail & Related papers (2021-12-16T01:40:30Z)
- Learning an Effective Context-Response Matching Model with Self-Supervised Tasks for Retrieval-based Dialogues [88.73739515457116]
We introduce four self-supervised tasks including next session prediction, utterance restoration, incoherence detection and consistency discrimination.
We jointly train the PLM-based response selection model with these auxiliary tasks in a multi-task manner.
Experiment results indicate that the proposed auxiliary self-supervised tasks bring significant improvement for multi-turn response selection.
arXiv Detail & Related papers (2020-09-14T08:44:46Z)
- Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-to-sequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)