CA*: Addressing Evaluation Pitfalls in Computation-Aware Latency for Simultaneous Speech Translation
- URL: http://arxiv.org/abs/2410.16011v1
- Date: Mon, 21 Oct 2024 13:42:19 GMT
- Title: CA*: Addressing Evaluation Pitfalls in Computation-Aware Latency for Simultaneous Speech Translation
- Authors: Xi Xu, Wenda Xu, Siqi Ouyang, Lei Li
- Abstract summary: Simultaneous speech translation (SimulST) systems must balance translation quality with response time.
There has been a longstanding belief that current metrics yield unrealistically high latency measurements in unsegmented streaming settings.
- Abstract: Simultaneous speech translation (SimulST) systems must balance translation quality with response time, making latency measurement crucial for evaluating their real-world performance. However, there has been a longstanding belief that current metrics yield unrealistically high latency measurements in unsegmented streaming settings. In this paper, we investigate this phenomenon, revealing its root cause in a fundamental misconception underlying existing latency evaluation approaches. We demonstrate that this issue affects not only streaming but also segment-level latency evaluation across different metrics. Furthermore, we propose a modification to correctly measure computation-aware latency for SimulST systems, addressing the limitations present in existing metrics.
Related papers
- Systematic Evaluation of Online Speaker Diarization Systems Regarding their Latency [44.99833362998488]
The latency is the time span from audio input to the output of the corresponding speaker label.
The lowest latency is achieved for the DIART-pipeline with the embedding model pyannote/embedding.
The FS-EEND system shows a similarly good latency.
arXiv Detail & Related papers (2024-07-05T06:54:27Z) - Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation [16.954965417930254]
We propose a novel latency evaluation metric for simultaneous translation called Average Token Delay (ATD).
We demonstrate its effectiveness through analyses simulating user-side latency based on Ear-Voice Span (EVS).
arXiv Detail & Related papers (2023-11-24T08:53:52Z) - Analytical Verification of Performance of Deep Neural Network Based Time-Synchronized Distribution System State Estimation [0.18726646412385334]
Recently, we demonstrated the success of a time-synchronized state estimator using deep neural networks (DNNs).
In this letter, we provide analytical bounds on the performance of that state estimator as a function of perturbations in the input measurements.
arXiv Detail & Related papers (2023-11-12T22:01:34Z) - Neural Laplace Control for Continuous-time Delayed Systems [76.81202657759222]
We propose a continuous-time model-based offline RL method that combines a Neural Laplace dynamics model with a model predictive control (MPC) planner.
We show experimentally that, on continuous-time delayed environments, it achieves near-expert policy performance.
arXiv Detail & Related papers (2023-02-24T12:40:28Z) - Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy in a single metric for online video perception.
We build a simple and effective framework for streaming perception.
Our method achieves competitive performance on the Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z) - Stream-level Latency Evaluation for Simultaneous Machine Translation [5.50178437495268]
Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications.
This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation.
arXiv Detail & Related papers (2021-04-18T11:16:17Z) - Dissecting User-Perceived Latency of On-Device E2E Speech Recognition [34.645194215436966]
We show that factors affecting token emission latency and endpointing behavior significantly impact user-perceived latency (UPL).
We achieve the best trade-off between latency and word error rate when performing ASR jointly with endpointing, and using the recently proposed alignment regularization.
arXiv Detail & Related papers (2021-04-06T00:55:11Z) - GO FIGURE: A Meta Evaluation of Factuality in Summarization [131.1087461486504]
We introduce GO FIGURE, a meta-evaluation framework for evaluating factuality evaluation metrics.
Our benchmark analysis on ten factuality metrics reveals that our framework provides a robust and efficient evaluation.
It also reveals that while QA metrics generally improve over standard metrics that measure factuality across domains, performance is highly dependent on the way in which questions are generated.
arXiv Detail & Related papers (2020-10-24T08:30:20Z) - FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization [78.46088089185156]
Streaming automatic speech recognition (ASR) aims to emit each hypothesized word as quickly and accurately as possible.
Existing approaches penalize emission delay by manipulating per-token or per-frame probability prediction in sequence transducer models.
We propose a sequence-level emission regularization method, named FastEmit, that applies latency regularization directly on per-sequence probability in training transducer models.
arXiv Detail & Related papers (2020-10-21T17:05:01Z) - Towards Streaming Perception [70.68520310095155]
We present an approach that coherently integrates latency and accuracy into a single metric for real-time online perception.
The key insight behind this metric is to jointly evaluate the output of the entire perception stack at every time instant.
We focus on the illustrative tasks of object detection and instance segmentation in urban video streams, and contribute a novel dataset with high-quality and temporally-dense annotations.
arXiv Detail & Related papers (2020-05-21T01:51:35Z)