Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for
Simultaneous Speech Translation
- URL: http://arxiv.org/abs/2206.05807v2
- Date: Wed, 15 Jun 2022 18:31:15 GMT
- Title: Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for
Simultaneous Speech Translation
- Authors: Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi
- Abstract summary: Simultaneous speech translation (SimulST) systems aim at generating their output with the lowest possible latency.
Average Lagging (AL) provides underestimated scores for systems that generate predictions longer than the corresponding references.
We show that this problem has practical relevance, as recent SimulST systems indeed have a tendency to over-generate.
- Score: 17.305879157385675
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Simultaneous speech translation (SimulST) systems aim at generating their
output with the lowest possible latency, which is normally computed in terms of
Average Lagging (AL). In this paper we highlight that, despite its widespread
adoption, AL provides underestimated scores for systems that generate
predictions longer than the corresponding references. We also show that this
problem has practical relevance, as recent SimulST systems indeed have a
tendency to over-generate. As a solution, we propose LAAL (Length-Adaptive
Average Lagging), a modified version of the metric that takes into account the
over-generation phenomenon and allows for unbiased evaluation of both
under-/over-generating systems.
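The correction described in the abstract can be sketched numerically. Below is a minimal sketch of AL and its length-adaptive variant, following the standard AL definition (delay of each target unit minus the delay of an ideal wait-free policy) and the abstract's description of LAAL; the function name and argument layout are illustrative, not taken from the paper's codebase.

```python
def average_lagging(delays, src_len, ref_len, pred_len=None):
    """Compute AL over a sequence of delays; pass pred_len to get LAAL.

    delays[i]  number of source units read before emitting target unit i+1
    src_len    |X|, length of the source
    ref_len    |Y*|, length of the reference
    pred_len   |Y^|, length of the prediction (LAAL normalizes by max(|Y*|, |Y^|))
    """
    # AL normalizes the ideal-policy term by the reference length alone;
    # LAAL uses max(reference, prediction) so that over-generation cannot
    # shrink the subtracted term and thus cannot be rewarded.
    tgt_len = ref_len if pred_len is None else max(ref_len, pred_len)
    gamma = tgt_len / src_len
    # tau: index of the first target unit emitted after the full source is read
    # (fall back to the full hypothesis if no delay covers the source).
    tau = next((i for i, d in enumerate(delays, 1) if d >= src_len), len(delays))
    # AL = (1/tau) * sum_{i=1..tau} [ d_i - (i-1)/gamma ]
    return sum(delays[i] - i / gamma for i in range(tau)) / tau
```

With identical delays, the LAAL call normalizes by the (longer) prediction length, so the subtracted ideal-delay term shrinks and the reported latency is higher than AL's: over-generation no longer lowers the score.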
Related papers
- Inference Scaling for Bridging Retrieval and Augmented Generation [47.091086803980765]
Retrieval-augmented generation (RAG) has emerged as a popular approach to steering the output of a large language model (LLM). We show that such bias can be mitigated via inference scaling, aggregating inference calls over permuted orders of the retrieved contexts. We showcase the effectiveness of MOI on diverse RAG tasks, improving ROUGE-L on MS MARCO and EM on HotpotQA benchmarks by 7 points.
arXiv Detail & Related papers (2024-12-14T05:06:43Z)
- FlowTS: Time Series Generation via Rectified Flow [67.41208519939626]
FlowTS is an ODE-based model that leverages rectified flow with straight-line transport in probability space.
For unconditional setting, FlowTS achieves state-of-the-art performance, with context FID scores of 0.019 and 0.011 on Stock and ETTh datasets.
For conditional setting, we have achieved superior performance in solar forecasting.
arXiv Detail & Related papers (2024-11-12T03:03:23Z)
- Provenance: A Light-weight Fact-checker for Retrieval Augmented LLM Generation Output [49.893971654861424]
We present a light-weight approach for detecting nonfactual outputs from retrieval-augmented generation (RAG).
We compute a factuality score that can be thresholded to yield a binary decision.
Our experiments show high area under the ROC curve (AUC) across a wide range of relevant open source datasets.
arXiv Detail & Related papers (2024-11-01T20:44:59Z)
- Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach [51.76826149868971]
Policy evaluation via Monte Carlo simulation is at the core of many MC Reinforcement Learning (RL) algorithms.
We propose as a quality index a surrogate of the mean squared error of a return estimator that uses trajectories of different lengths.
We present an adaptive algorithm called Robust and Iterative Data collection strategy Optimization (RIDO).
arXiv Detail & Related papers (2024-10-17T11:47:56Z)
- Long-Context Linear System Identification [20.835344826113307]
This paper addresses the problem of long-context linear system identification, where the state $x_t$ of a dynamical system at time $t$ depends linearly on previous states $x_s$ over a fixed context window of length $p$.
We establish a sample complexity bound that matches the i.i.d. parametric rate up to logarithmic factors for a broad class of systems, extending previous works that considered only first-order dependencies.
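The dependence described in this entry, with the state depending linearly on the previous $p$ states, can be written as an ordinary least-squares problem. A hedged sketch of such a fit (the function is hypothetical and illustrates only the model class, not the paper's estimator or its sample-complexity analysis):

```python
import numpy as np

def fit_long_context(X, p):
    """Least-squares estimate of A_1..A_p in x_t = sum_j A_j @ x_{t-j} + noise.

    X: (T, d) array of observed states. Returns a (p, d, d) array whose
    entry [j-1] estimates A_j. Illustrative only.
    """
    T, d = X.shape
    # Each regressor row stacks the context window [x_{t-1}, ..., x_{t-p}].
    Z = np.hstack([X[p - j:T - j] for j in range(1, p + 1)])  # (T-p, p*d)
    Y = X[p:]                                                 # (T-p, d)
    W, *_ = np.linalg.lstsq(Z, Y, rcond=None)                 # (p*d, d)
    # In row form, x_t^T = sum_j x_{t-j}^T A_j^T, so each d-row block of W
    # is the transpose of the corresponding A_j.
    return np.stack([W[j * d:(j + 1) * d].T for j in range(p)])
```

For noiseless data generated by a stable system, the fit recovers the matrices exactly once the stacked regressors have full column rank.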
arXiv Detail & Related papers (2024-10-08T05:15:21Z)
- Nearest Neighbor Speculative Decoding for LLM Generation and Attribution [87.3259169631789]
Nearest Neighbor Speculative Decoding (NEST) is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources.
NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks.
In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
arXiv Detail & Related papers (2024-05-29T17:55:03Z)
- Analytical Verification of Performance of Deep Neural Network Based Time-Synchronized Distribution System State Estimation [0.18726646412385334]
Recently, we demonstrated the success of a time-synchronized state estimator using deep neural networks (DNNs).
In this letter, we provide analytical bounds on the performance of that state estimator as a function of perturbations in the input measurements.
arXiv Detail & Related papers (2023-11-12T22:01:34Z)
- Time-series Generation by Contrastive Imitation [87.51882102248395]
We study a generative framework that seeks to combine the strengths of both: Motivated by a moment-matching objective to mitigate compounding error, we optimize a local (but forward-looking) transition policy.
At inference, the learned policy serves as the generator for iterative sampling, and the learned energy serves as a trajectory-level measure for evaluating sample quality.
arXiv Detail & Related papers (2023-11-02T16:45:25Z)
- Factual Error Correction for Abstractive Summaries Using Entity Retrieval [57.01193722520597]
We propose RFEC, an efficient factual error correction system based on an entity-retrieval post-editing process.
RFEC retrieves the evidence sentences from the original document by comparing the sentences with the target summary.
Next, RFEC detects the entity-level errors in the summaries by considering the evidence sentences and substitutes the wrong entities with the accurate entities from the evidence sentences.
arXiv Detail & Related papers (2022-04-18T11:35:02Z)
- SALAD: Self-Adaptive Lightweight Anomaly Detection for Real-time Recurrent Time Series [1.0437764544103274]
This paper introduces SALAD, a Self-Adaptive Lightweight Anomaly Detection approach based on a special type of recurrent neural network called Long Short-Term Memory (LSTM).
Experiments based on two real-world open-source time series datasets demonstrate that SALAD outperforms five other state-of-the-art anomaly detection approaches in terms of detection accuracy.
In addition, the results also show that SALAD is lightweight and can be deployed on a commodity machine.
arXiv Detail & Related papers (2021-04-19T10:36:23Z)
- Stream-level Latency Evaluation for Simultaneous Machine Translation [5.50178437495268]
Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications.
This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation.
arXiv Detail & Related papers (2021-04-18T11:16:17Z)
- Non-Stationary Delayed Bandits with Intermediate Observations [10.538264213183076]
Online recommender systems often face long delays in receiving feedback, especially when optimizing for some long-term metrics.
We introduce the problem of non-stationary, delayed bandits with intermediate observations.
We develop an efficient algorithm based on UCRL, and prove sublinear regret guarantees for its performance.
arXiv Detail & Related papers (2020-06-03T09:27:03Z)
- Pseudo-Convolutional Policy Gradient for Sequence-to-Sequence Lip-Reading [96.48553941812366]
Lip-reading aims to infer the speech content from the lip movement sequence.
The traditional learning process of seq2seq models suffers from two problems.
We propose a novel pseudo-convolutional policy gradient (PCPG) based method to address these two problems.
arXiv Detail & Related papers (2020-03-09T09:12:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all listed content) and is not responsible for any consequences of its use.