Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI
- URL: http://arxiv.org/abs/2205.00029v1
- Date: Fri, 29 Apr 2022 18:18:40 GMT
- Title: Self-Aware Feedback-Based Self-Learning in Large-Scale Conversational AI
- Authors: Pragaash Ponnusamy, Clint Solomon Mathialagan, Gustavo Aguilar,
Chengyuan Ma, Chenlei Guo
- Abstract summary: Self-learning paradigms in large-scale conversational AI agents tend to leverage user feedback to bridge the gap between what users say and what they mean.
We show that our self-aware model improves the overall PR-AUC by 27.45%, achieves a relative defect reduction of up to 31.22%, and adapts more quickly to changes in global preferences.
- Score: 8.638846754482467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-learning paradigms in large-scale conversational AI agents tend to
leverage user feedback to bridge the gap between what users say and what they mean.
However, such learning, particularly in Markov-based query rewriting systems,
has yet to address the impact of these models on future training, where
successive feedback is inevitably contingent on the rewrite itself, especially
in a continually updating environment. In this paper, we explore how this
inherent lack of self-awareness impairs model performance, ultimately
resulting in both Type I and Type II errors over time. To that end, we propose
augmenting the Markov Graph construction with a superposition-based adjacency
matrix. Here, our method leverages an induced stochasticity to reactively
learn a locally-adaptive decision boundary based on the performance of the
individual rewrites in a bi-variate beta setting. We also surface a data
augmentation strategy that leverages template-based generation to abridge the
complex conversation hierarchies of dialogs and simplify the learning process.
Overall, we demonstrate that our self-aware model improves the overall PR-AUC
by 27.45%, achieves a relative defect reduction of up to 31.22%, and adapts
more quickly to changes in global preferences across a large number of
customers.
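For intuition only, the sketch below illustrates one way the kind of decision described in the abstract could be realized: each query variant keeps Beta-distributed success/defect statistics, and the serving decision is made by sampling from those posteriors rather than comparing fixed scores, which yields a stochastic, locally-adaptive boundary. All names (RewriteArm, choose_variant) and the example queries are hypothetical; this is a minimal sketch under those assumptions, not the paper's actual bi-variate beta formulation or system code.

```python
import numpy as np


class RewriteArm:
    """Per-rewrite feedback statistics kept as a Beta posterior.

    Hypothetical sketch: each candidate (the original query or one of its
    rewrites) accumulates pseudo-counts of non-defective vs. defective
    outcomes. Sampling from the posterior, rather than thresholding a fixed
    point estimate, is one way to realize the "induced stochasticity" and
    locally-adaptive decision boundary described in the abstract.
    """

    def __init__(self, prior_success: float = 1.0, prior_defect: float = 1.0):
        self.successes = prior_success  # pseudo-count of non-defective turns
        self.defects = prior_defect     # pseudo-count of defective turns

    def update(self, was_defective: bool) -> None:
        # Feedback only arrives for the variant that was actually served,
        # so successive observations are contingent on the rewrite itself.
        if was_defective:
            self.defects += 1
        else:
            self.successes += 1

    def sample_quality(self, rng: np.random.Generator) -> float:
        # Thompson-style draw from Beta(successes, defects): noisy enough to
        # occasionally re-explore a demoted rewrite when preferences drift.
        return rng.beta(self.successes, self.defects)


def choose_variant(arms: dict[str, RewriteArm], rng: np.random.Generator) -> str:
    """Serve the variant (original query or rewrite) with the highest sampled
    quality; the effective decision boundary is local to this query."""
    sampled = {text: arm.sample_quality(rng) for text, arm in arms.items()}
    return max(sampled, key=sampled.get)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    arms = {
        "play abc by taylor swift": RewriteArm(),                # original query
        "play all too well by taylor swift": RewriteArm(5, 1),   # candidate rewrite
    }
    served = choose_variant(arms, rng)
    arms[served].update(was_defective=False)  # fold the new feedback back in
```

Under this reading, a rewrite that starts accumulating defects sees its sampled quality drop and is served less often, while the residual sampling noise keeps the system from permanently locking in a stale preference.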
Related papers
- ReLearn: Unlearning via Learning for Large Language Models [64.2802606302194]
We propose ReLearn, a data augmentation and fine-tuning pipeline for effective unlearning.
This framework introduces Knowledge Forgetting Rate (KFR) and Knowledge Retention Rate (KRR) to measure knowledge-level preservation.
Our experiments show that ReLearn successfully achieves targeted forgetting while preserving high-quality output.
arXiv Detail & Related papers (2025-02-16T16:31:00Z)
- Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training [18.896813839389893]
We propose an iterative self-training framework, Agent-R, that enables language Agents to Reflect on the fly.
Unlike traditional methods that reward or penalize actions based on correctness, Agent-R leverages MCTS to construct training data that recover correct trajectories from erroneous ones.
Our findings demonstrate that Agent-R continuously improves the model's ability to recover from errors and enables timely error correction.
arXiv Detail & Related papers (2025-01-20T11:46:04Z)
- Self-Improvement in Language Models: The Sharpening Mechanism [70.9248553790022]
We offer a new perspective on the capabilities of self-improvement through a lens we refer to as sharpening.
Motivated by the observation that language models are often better at verifying response quality than they are at generating correct responses, we formalize self-improvement as using the model itself as a verifier during post-training.
We analyze two natural families of self-improvement algorithms based on SFT and RLHF.
arXiv Detail & Related papers (2024-12-02T20:24:17Z)
- End-to-End Speech Recognition: A Survey [68.35707678386949]
The goal of this survey is to provide a taxonomy of E2E ASR models and corresponding improvements.
All relevant aspects of E2E ASR are covered in this work, accompanied by discussions of performance and deployment opportunities.
arXiv Detail & Related papers (2023-03-03T01:46:41Z)
- Stateful Offline Contextual Policy Evaluation and Learning [88.9134799076718]
We study off-policy evaluation and learning from sequential data.
We formalize the relevant causal structure of problems such as dynamic personalized pricing.
We show improved out-of-sample policy performance in this class of relevant problems.
arXiv Detail & Related papers (2021-10-19T16:15:56Z)
- Layer-wise Analysis of a Self-supervised Speech Representation Model [26.727775920272205]
Self-supervised learning approaches have been successful for pre-training speech representation models.
However, little has been studied about the type or extent of information encoded in the pre-trained representations themselves.
arXiv Detail & Related papers (2021-07-10T02:13:25Z)
- Enhancing Dialogue Generation via Multi-Level Contrastive Learning [57.005432249952406]
We propose a multi-level contrastive learning paradigm to model the fine-grained quality of the responses with respect to the query.
A Rank-aware Calibration (RC) network is designed to construct the multi-level contrastive optimization objectives.
We build a Knowledge Inference (KI) component to capture the keyword knowledge from the reference during training and exploit such information to encourage the generation of informative words.
arXiv Detail & Related papers (2020-09-19T02:41:04Z)
- Joint Contextual Modeling for ASR Correction and Language Understanding [60.230013453699975]
We propose multi-task neural approaches to perform contextual language correction on ASR outputs jointly with language understanding (LU).
We show that the error rates of off-the-shelf ASR and subsequent LU systems can be reduced significantly, by 14% relative, with joint models trained using small amounts of in-domain data.
arXiv Detail & Related papers (2020-01-28T22:09:25Z)