Contextual Biasing of Named-Entities with Large Language Models
- URL: http://arxiv.org/abs/2309.00723v2
- Date: Fri, 22 Sep 2023 02:06:10 GMT
- Title: Contextual Biasing of Named-Entities with Large Language Models
- Authors: Chuanneng Sun, Zeeshan Ahmed, Yingyi Ma, Zhe Liu, Lucas Kabela, Yutong
Pang, Ozlem Kalinli
- Abstract summary: This paper studies contextual biasing with Large Language Models (LLMs).
During second-pass rescoring, additional contextual information is provided to an LLM to boost Automatic Speech Recognition (ASR) performance.
We propose to leverage prompts for an LLM, without fine-tuning, during rescoring; the prompts incorporate a biasing list and few-shot examples.
- Score: 12.396054621526643
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies contextual biasing with Large Language Models (LLMs),
where during second-pass rescoring additional contextual information is
provided to an LLM to boost Automatic Speech Recognition (ASR) performance. We
propose to leverage prompts for an LLM without fine-tuning during rescoring,
which incorporate a biasing list and few-shot examples to serve as additional
information when calculating the score for the hypothesis. In addition to
few-shot prompt learning, we propose multi-task training of the LLM to predict
both the entity class and the next token. To improve the efficiency of
contextual biasing and to avoid exceeding the LLM's maximum sequence length, we
propose dynamic prompting, where we select the most likely class using the
class tag prediction, and only use entities in this class as contexts for next
token prediction. Word Error Rate (WER) evaluation is performed on i) an
internal calling, messaging, and dictation dataset, and ii) the SLUE-Voxpopuli
dataset. Results indicate that biasing lists and few-shot examples achieve
17.8% and 9.6% relative improvement over first-pass ASR, and that
multi-task training and dynamic prompting achieve 20.0% and 11.3% relative
WER improvement, respectively.
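As a rough illustration of the pipeline the abstract describes, the sketch below rescores first-pass hypotheses with an off-the-shelf causal LLM: each hypothesis is scored conditioned on a prompt carrying few-shot examples and a class-specific biasing list, where the class is picked by a simple stand-in for the paper's class-tag prediction (dynamic prompting). The model choice (gpt2 as a placeholder LLM), the biasing lists, the prompt format, and the interpolation weight are all illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of second-pass rescoring with a biasing prompt and
# dynamic prompting. Everything below (model, prompt layout, lists,
# interpolation weight) is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")            # placeholder LLM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Hypothetical per-class biasing lists; dynamic prompting selects one
# class instead of packing every entity list into the prompt.
BIASING_LISTS = {
    "contact": ["Zeeshan Ahmed", "Yingyi Ma"],
    "app": ["Messenger", "Calendar"],
}
FEW_SHOT = "call yingyi ma -> call Yingyi Ma\n"              # toy example

def llm_log_score(prompt: str, text: str) -> float:
    """Sum of token log-probabilities of `text` given `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    text_ids = tokenizer(text, return_tensors="pt").input_ids
    ids = torch.cat([prompt_ids, text_ids], dim=1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(ids).logits[0, :-1], dim=-1)
    # Score only the `text` tokens, each predicted from the previous position.
    return sum(log_probs[pos, ids[0, pos + 1]].item()
               for pos in range(prompt_ids.shape[1] - 1, ids.shape[1] - 1))

def predict_entity_class(hypothesis: str) -> str:
    """Stand-in for the multi-task class-tag prediction: pick the class
    name the LLM scores highest as a continuation of the hypothesis."""
    return max(BIASING_LISTS,
               key=lambda c: llm_log_score(hypothesis + " class:", " " + c))

def rescore(hypotheses, first_pass_scores, weight=0.5):
    """Interpolate first-pass ASR scores with prompted LLM scores and
    return the best-scoring hypothesis."""
    best_hyp, best_score = None, float("-inf")
    for hyp, asr_score in zip(hypotheses, first_pass_scores):
        cls = predict_entity_class(hyp)                      # dynamic prompting
        prompt = FEW_SHOT + "entities: " + ", ".join(BIASING_LISTS[cls]) + "\n"
        score = asr_score + weight * llm_log_score(prompt, hyp)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp

print(rescore(["call ying yi ma", "call Yingyi Ma"], [-1.2, -1.5]))
```

In practice the n-best list and first-pass scores would come from the ASR decoder, and the interpolation weight would be tuned on a development set; the fixed 0.5 above is only a placeholder.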
Related papers
- Evaluating LLM Prompts for Data Augmentation in Multi-label Classification of Ecological Texts [1.565361244756411]
Large language models (LLMs) play a crucial role in natural language processing (NLP) tasks.
This study applied prompt-based data augmentation to detect mentions of green practices in Russian social media.
arXiv Detail & Related papers (2024-11-22T12:37:41Z)
- Self-Calibrated Listwise Reranking with Large Language Models [137.6557607279876]
Large language models (LLMs) have been employed in reranking tasks through a sequence-to-sequence approach.
This reranking paradigm requires a sliding window strategy to iteratively handle larger candidate sets.
We propose a novel self-calibrated listwise reranking method, which aims to leverage LLMs to produce global relevance scores for ranking.
arXiv Detail & Related papers (2024-11-07T10:31:31Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- RepEval: Effective Text Evaluation with LLM Representation [55.26340302485898]
RepEval is a metric that leverages the projection of Large Language Model (LLM) representations for evaluation.
Our work underscores the richness of information regarding text quality embedded within LLM representations, offering insights for the development of new metrics.
arXiv Detail & Related papers (2024-04-30T13:50:55Z)
- Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by large language models (LLMs).
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
- LLMaAA: Making Large Language Models as Active Annotators [32.57011151031332]
We propose LLMaAA, which takes large language models as annotators and puts them into an active learning loop to determine what to annotate efficiently.
We conduct experiments and analysis on two classic NLP tasks, named entity recognition and relation extraction.
With LLMaAA, task-specific models trained from LLM-generated labels can outperform the teacher with only hundreds of annotated examples.
arXiv Detail & Related papers (2023-10-30T14:54:15Z)
- LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) can process longer context lengths.
LLMs consistently outperform the state of the art (SotA) when the target text is long.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z)
- Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [103.6153593636399]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning).
It introduces open words from WordNet to extend the prompt texts beyond closed-set label words, so that prompts are tuned in a simulated open-set scenario.
Our method achieves the best performance on datasets with various scales, and extensive ablation studies also validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)