Differentiable Retrieval Augmentation via Generative Language Modeling
for E-commerce Query Intent Classification
- URL: http://arxiv.org/abs/2308.09308v3
- Date: Fri, 15 Sep 2023 09:27:20 GMT
- Title: Differentiable Retrieval Augmentation via Generative Language Modeling
for E-commerce Query Intent Classification
- Authors: Chenyu Zhao, Yunjiang Jiang, Yiming Qiu, Han Zhang, Wen-Yun Yang
- Abstract summary: We propose Differentiable Retrieval Augmentation via Generative lANguage modeling(Dragan) to address this problem by a novel differentiable reformulation.
We demonstrate the effectiveness of our proposed method on a challenging NLP task in e-commerce search, namely query intent classification.
- Score: 8.59563091603226
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Retrieval augmentation, which enhances downstream models by a knowledge
retriever and an external corpus instead of by merely increasing the number of
model parameters, has been successfully applied to many natural language
processing (NLP) tasks such as text classification, question answering and so
on. However, existing methods that separately or asynchronously train the
retriever and downstream model mainly due to the non-differentiability between
the two parts, usually lead to degraded performance compared to end-to-end
joint training. In this paper, we propose Differentiable Retrieval Augmentation
via Generative lANguage modeling(Dragan), to address this problem by a novel
differentiable reformulation. We demonstrate the effectiveness of our proposed
method on a challenging NLP task in e-commerce search, namely query intent
classification. Both the experimental results and ablation study show that the
proposed method significantly and reasonably improves the state-of-the-art
baselines on both offline evaluation and online A/B test.
Related papers
- PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking [0.0]
PRefLexOR combines preference optimization with concepts from Reinforcement Learning to enable models to self-teach.
We focus on applications in biological materials science and demonstrate the method in a variety of case studies.
arXiv Detail & Related papers (2024-10-16T08:46:26Z) - How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z) - Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models [63.36637269634553]
We present a novel method of further improving performance by requiring models to compare multiple reasoning chains.
We find that instruction tuning on DCoT datasets boosts the performance of even smaller, and therefore more accessible, language models.
arXiv Detail & Related papers (2024-07-03T15:01:18Z) - Enhancing Retrieval-Augmented LMs with a Two-stage Consistency Learning Compressor [4.35807211471107]
This work proposes a novel two-stage consistency learning approach for retrieved information compression in retrieval-augmented language models.
The proposed method is empirically validated across multiple datasets, demonstrating notable enhancements in precision and efficiency for question-answering tasks.
arXiv Detail & Related papers (2024-06-04T12:43:23Z) - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
We propose a new evaluation method, SQC-Score.
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z) - SimOAP: Improve Coherence and Consistency in Persona-based Dialogue
Generation via Over-sampling and Post-evaluation [54.66399120084227]
Language models trained on large-scale corpora can generate remarkably fluent results in open-domain dialogue.
For the persona-based dialogue generation task, consistency and coherence are great challenges for language models.
A two-stage SimOAP strategy is proposed, i.e., over-sampling and post-evaluation.
arXiv Detail & Related papers (2023-05-18T17:23:00Z) - Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue
Response Generation Models by Causal Discovery [52.95935278819512]
We conduct the first study on spurious correlations for open-domain response generation models based on a corpus CGDIALOG curated in our work.
Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation model.
arXiv Detail & Related papers (2023-03-02T06:33:48Z) - A Cognitive Study on Semantic Similarity Analysis of Large Corpora: A
Transformer-based Approach [0.0]
We perform semantic similarity analysis and modeling on the U.S. Patent Phrase to Phrase Matching dataset using both traditional and transformer-based techniques.
The experimental results demonstrate our methodology's enhanced performance compared to traditional techniques, with an average Pearson correlation score of 0.79.
arXiv Detail & Related papers (2022-07-24T11:06:56Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability
Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework, named Layerwise Noise Stability Regularization (LNSR)
Specifically, we propose to inject the standard Gaussian noise and regularize hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.