Are EEG-to-Text Models Working?
- URL: http://arxiv.org/abs/2405.06459v4
- Date: Sat, 26 Oct 2024 05:41:51 GMT
- Title: Are EEG-to-Text Models Working?
- Authors: Hyejeong Jo, Yiqian Yang, Juhyeok Han, Yiqun Duan, Hui Xiong, Won Hee Lee,
- Abstract summary: This work critically analyzes existing models for open-vocabulary EEG-to-Text translation.
We propose a methodology to differentiate between models that truly learn from EEG signals and those that simply memorize training data.
- Score: 17.35247047061063
- License:
- Abstract: This work critically analyzes existing models for open-vocabulary EEG-to-Text translation. We identify a crucial limitation: previous studies often employed implicit teacher-forcing during evaluation, artificially inflating performance metrics. Additionally, they lacked a critical benchmark - comparing model performance on pure noise inputs. We propose a methodology to differentiate between models that truly learn from EEG signals and those that simply memorize training data. Our analysis reveals that model performance on noise data can be comparable to that on EEG data. These findings highlight the need for stricter evaluation practices in EEG-to-Text research, emphasizing transparent reporting and rigorous benchmarking with noise inputs. This approach will lead to more reliable assessments of model capabilities and pave the way for robust EEG-to-Text communication systems. Code is available at https://github.com/NeuSpeech/EEG-To-Text
Related papers
- DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models [39.493913608472404]
Large language model (LLM)-based Grammatical Error Correction (GEC) models often produce corrections that diverge from provided gold references.
This discrepancy undermines the reliability of traditional reference-based evaluation metrics.
We propose a novel evaluation framework for GEC models, DSGram, integrating Semantic Coherence, Edit Level, and Fluency, and utilizing a dynamic weighting mechanism.
arXiv Detail & Related papers (2024-12-17T11:54:16Z) - Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings [27.418738450536047]
We propose a two-step pipeline for converting EEG signals into sentences.
We first confirm that word-level semantic information can be learned from EEG data recorded during natural reading.
We employ a training-free retrieval method to retrieve sentences based on the predictions from the EEG encoder.
arXiv Detail & Related papers (2024-08-08T03:40:25Z) - RDBE: Reasoning Distillation-Based Evaluation Enhances Automatic Essay Scoring [0.0]
Reasoning Distillation-Based Evaluation (RDBE) integrates interpretability to elucidate the rationale behind model scores.
Our experimental results demonstrate the efficacy of RDBE across all scoring rubrics considered in the dataset.
arXiv Detail & Related papers (2024-07-03T05:49:01Z) - AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension [95.8442896569132]
We introduce AIR-Bench, the first benchmark to evaluate the ability of Large Audio-Language Models (LALMs) to understand various types of audio signals and interact with humans in the textual format.
Results demonstrate a high level of consistency between GPT-4-based evaluation and human evaluation.
arXiv Detail & Related papers (2024-02-12T15:41:22Z) - Analysing the Impact of Audio Quality on the Use of Naturalistic
Long-Form Recordings for Infant-Directed Speech Research [62.997667081978825]
Modelling of early language acquisition aims to understand how infants bootstrap their language skills.
Recent developments have enabled the use of more naturalistic training data for computational models.
It is currently unclear how the sound quality could affect analyses and modelling experiments conducted on such data.
arXiv Detail & Related papers (2023-05-03T08:25:37Z) - Improving the Robustness of Summarization Models by Detecting and
Removing Input Noise [50.27105057899601]
We present a large empirical study quantifying the sometimes severe loss in performance from different types of input noise for a range of datasets and model sizes.
We propose a light-weight method for detecting and removing such noise in the input during model inference without requiring any training, auxiliary models, or even prior knowledge of the type of noise.
arXiv Detail & Related papers (2022-12-20T00:33:11Z) - LDNet: Unified Listener Dependent Modeling in MOS Prediction for
Synthetic Speech [67.88748572167309]
We present LDNet, a unified framework for mean opinion score (MOS) prediction.
We propose two inference methods that provide more stable results and efficient computation.
arXiv Detail & Related papers (2021-10-18T08:52:31Z) - AES Systems Are Both Overstable And Oversensitive: Explaining Why And
Proposing Defenses [66.49753193098356]
We investigate the reason behind the surprising adversarial brittleness of scoring models.
Our results indicate that autoscoring models, despite getting trained as "end-to-end" models, behave like bag-of-words models.
We propose detection-based protection models that can detect oversensitivity and overstability causing samples with high accuracies.
arXiv Detail & Related papers (2021-09-24T03:49:38Z) - How much pretraining data do language models need to learn syntax? [12.668478784932878]
Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks.
We study the impact of pretraining data size on the knowledge of the models using RoBERTa.
arXiv Detail & Related papers (2021-09-07T15:51:39Z) - Bridging the Gap Between Clean Data Training and Real-World Inference
for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a textitgap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedding into similar vector space.
Experiments on the widely-used dataset, Snips, and large scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on real-world (noisy) corpus but also enhances the robustness, that is, it produces high-quality results under a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.