Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset
- URL: http://arxiv.org/abs/2510.21038v2
- Date: Thu, 30 Oct 2025 10:23:32 GMT
- Title: Elementary, My Dear Watson: Non-Invasive Neural Keyword Spotting in the LibriBrain Dataset
- Authors: Gereon Elvers, Gilad Landau, Oiwi Parker Jones
- Abstract summary: Keyword Spotting (KWS) is a privacy-aware intermediate task for brain-computer interfaces. We release an updated version of the pnpl library with word-level dataloaders and Colab-ready tutorials.
- Score: 1.497166779417398
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-invasive brain-computer interfaces (BCIs) are beginning to benefit from large, public benchmarks. However, current benchmarks target relatively simple, foundational tasks like Speech Detection and Phoneme Classification, while application-ready results on tasks like Brain-to-Text remain elusive. We propose Keyword Spotting (KWS) as a practically applicable, privacy-aware intermediate task. Using the deep 52-hour, within-subject LibriBrain corpus, we provide standardized train/validation/test splits for reproducible benchmarking, and adopt an evaluation protocol tailored to extreme class imbalance. Concretely, we use area under the precision-recall curve (AUPRC) as a robust evaluation metric, complemented by false alarms per hour (FA/h) at fixed recall to capture user-facing trade-offs. To simplify deployment and further experimentation within the research community, we are releasing an updated version of the pnpl library with word-level dataloaders and Colab-ready tutorials. As an initial reference model, we present a compact 1-D Conv/ResNet baseline with focal loss and top-k pooling that is trainable on a single consumer-class GPU. The reference model achieves approximately 13x the permutation baseline AUPRC on held-out sessions, demonstrating the viability of the task. Exploratory analyses reveal: (i) predictable within-subject scaling - performance improves log-linearly with more training hours - and (ii) the existence of word-level factors (frequency and duration) that systematically modulate detectability.
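The evaluation protocol in the abstract (AUPRC as the headline metric, FA/h at a fixed recall, and a permutation baseline as the chance reference) can be made concrete with a short sketch. This is a minimal illustration assuming scikit-learn-style arrays; the target recall value and permutation count are placeholders, not values from the paper.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

def evaluate_kws(y_true, y_score, hours, target_recall=0.5, n_perm=100, seed=0):
    """AUPRC, FA/h at a fixed recall, and AUPRC relative to a permutation baseline."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    auprc = average_precision_score(y_true, y_score)

    # FA/h: pick the threshold whose recall is closest to the target,
    # count false alarms above it, and normalise by recording hours.
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    idx = np.argmin(np.abs(recall[:-1] - target_recall))  # recall has one extra point
    false_alarms = np.sum((y_score >= thresholds[idx]) & (y_true == 0))
    fa_per_hour = false_alarms / hours

    # Permutation baseline: AUPRC under shuffled labels approximates chance level,
    # which under extreme imbalance is roughly the positive-class rate.
    rng = np.random.default_rng(seed)
    chance = np.mean([average_precision_score(rng.permutation(y_true), y_score)
                      for _ in range(n_perm)])
    return auprc, fa_per_hour, auprc / chance
```

The ratio returned last is what the "approximately 13x the permutation baseline AUPRC" claim refers to. The reference model itself is described only at a high level (a compact 1-D Conv/ResNet with focal loss and top-k pooling); the PyTorch sketch below is one plausible reading of those ingredients, not the authors' implementation, and the 306-channel input, network width, and choice of k are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock1d(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv1d(ch, ch, kernel_size=3, padding=1)
        self.conv2 = nn.Conv1d(ch, ch, kernel_size=3, padding=1)
        self.bn1, self.bn2 = nn.BatchNorm1d(ch), nn.BatchNorm1d(ch)

    def forward(self, x):
        h = F.relu(self.bn1(self.conv1(x)))
        return F.relu(x + self.bn2(self.conv2(h)))

class KWSNet(nn.Module):
    """1-D Conv/ResNet over sensor channels with top-k pooling over time."""
    def __init__(self, in_ch=306, width=128, blocks=4, k=5):
        super().__init__()
        self.stem = nn.Conv1d(in_ch, width, kernel_size=7, padding=3)
        self.body = nn.Sequential(*[ResBlock1d(width) for _ in range(blocks)])
        self.head = nn.Conv1d(width, 1, kernel_size=1)
        self.k = k

    def forward(self, x):                          # x: (batch, channels, time)
        h = self.body(F.relu(self.stem(x)))
        frame_logits = self.head(h).squeeze(1)     # (batch, time)
        topk = frame_logits.topk(self.k, dim=-1).values
        return topk.mean(dim=-1)                   # clip-level keyword logit

def focal_loss(logit, target, gamma=2.0, alpha=0.25):
    """Binary focal loss: down-weights easy negatives under heavy class imbalance."""
    bce = F.binary_cross_entropy_with_logits(logit, target, reduction="none")
    p_t = torch.exp(-bce)                          # probability of the true class
    alpha_t = alpha * target + (1 - alpha) * (1 - target)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```

Top-k pooling averages only the k strongest frame scores, so a brief keyword event can dominate the clip-level decision, while focal loss keeps the many easy negative windows from swamping the gradient. Both choices follow directly from the extreme class imbalance the evaluation protocol is designed around; the log-linear scaling finding corresponds to fitting performance as roughly a + b * log(training hours).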
Related papers
- A Rubric-Supervised Critic from Sparse Real-World Outcomes [87.11204512676193]
Real-world coding agents operate with humans in the loop, where success signals are typically noisy, delayed, and sparse. We propose a process to learn a "critic" model from sparse and noisy interaction data, which can then be used as a reward model for either RL-based training or inference-time scaling.
arXiv Detail & Related papers (2026-03-04T07:23:54Z)
- PDR: A Plug-and-Play Positional Decay Framework for LLM Pre-training Data Detection [30.13331191100816]
We introduce Positional Decay Reweighting (PDR), a training-free, plug-and-play framework for detecting pre-training data in Large Language Models (LLMs). PDR explicitly reweights token-level scores to amplify distinct signals from early positions while suppressing noise from later ones (a simple form is sketched after this entry).
arXiv Detail & Related papers (2026-01-11T09:32:13Z)
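The entry above describes PDR only qualitatively. The sketch below assumes a simple exponential decay over token positions applied to per-token membership signals (e.g., token log-likelihoods); the decay form and normalisation are assumptions, not the paper's exact scheme.

```python
import numpy as np

def positional_decay_score(token_scores, decay=0.95):
    """Reweight per-token scores so early positions dominate.

    token_scores: per-token membership signals (e.g., log-likelihoods)
    from the target LLM. The exponential decay here is an assumption;
    PDR's actual weighting may differ.
    """
    scores = np.asarray(token_scores, dtype=float)
    weights = decay ** np.arange(len(scores))   # amplify early, suppress late
    weights /= weights.sum()                    # keep scores comparable across lengths
    return float(np.dot(weights, scores))       # single detection score per sequence
```

The aggregate score would then be thresholded to decide whether a sequence was likely seen during pre-training.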
- Test-time Offline Reinforcement Learning on Goal-related Experience [50.94457794664909]
Research in foundation models has shown that performance can be substantially improved through test-time training. We propose a novel self-supervised data selection criterion, which selects transitions from an offline dataset according to their relevance to the current state. Our goal-conditioned test-time training (GC-TTT) algorithm applies this routine in a receding-horizon fashion during evaluation, adapting the policy to the current trajectory as it is being rolled out.
arXiv Detail & Related papers (2025-07-24T21:11:39Z)
- CountingDINO: A Training-free Pipeline for Class-Agnostic Counting using Unsupervised Backbones [7.717986156838291]
Class-agnostic counting (CAC) aims to estimate the number of objects in images without being restricted to predefined categories. Current exemplar-based CAC methods rely heavily on labeled data for training. We introduce CountingDINO, the first training-free exemplar-based CAC framework.
arXiv Detail & Related papers (2025-04-23T09:48:08Z)
- Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z)
- Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics [22.129910930772]
Keyword Spotting (KWS) models on embedded devices should adapt fast to new user-defined words without forgetting previous ones.
We consider the setup of embedded online continual learning (EOCL), where KWS models with frozen backbone are trained to incrementally recognize new words from a non-repeated stream of samples.
We propose Temporal Aware Pooling (TAP), which constructs an enriched feature space by computing high-order moments of speech features extracted by a pre-trained backbone (one reading is sketched after this entry).
arXiv Detail & Related papers (2023-07-24T10:04:27Z)
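TAP is described above only as pooling high-order temporal statistics of backbone features. A minimal sketch under that reading, computing per-dimension mean, variance, skewness, and kurtosis over time; the exact moment set (and any whitening) is an assumption based on the abstract.

```python
import numpy as np

def temporal_moment_pooling(feats, eps=1e-8):
    """Pool a (time, dim) feature matrix into concatenated high-order moments.

    Returns a (4 * dim,) vector: per-dimension mean, variance, skewness,
    and kurtosis over the time axis.
    """
    mu = feats.mean(axis=0)
    centered = feats - mu
    var = centered.var(axis=0)
    std = np.sqrt(var + eps)
    skew = (centered ** 3).mean(axis=0) / std ** 3
    kurt = (centered ** 4).mean(axis=0) / std ** 4
    return np.concatenate([mu, var, skew, kurt])
```

With a frozen backbone, such a pooled vector can feed a lightweight incremental classifier, which is what makes the setup attractive for on-device continual learning.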
- Revisit Few-shot Intent Classification with PLMs: Direct Fine-tuning vs. Continual Pre-training [20.98770732015944]
Few-shot intent detection involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data.
We show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected.
To maximize the utilization of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance.
arXiv Detail & Related papers (2023-06-08T15:26:52Z)
- M-Tuning: Prompt Tuning with Mitigated Label Bias in Open-Set Scenarios [58.617025733655005]
We propose a vision-language prompt tuning method with mitigated label bias (M-Tuning). It introduces open words from WordNet to extend the prompt texts beyond closed-set label words, so that prompts are tuned in a simulated open-set scenario. Our method achieves the best performance on datasets of various scales, and extensive ablation studies validate its effectiveness.
arXiv Detail & Related papers (2023-03-09T09:05:47Z)
- A Closer Look at Debiased Temporal Sentence Grounding in Videos: Dataset, Metric, and Approach [53.727460222955266]
Temporal Sentence Grounding in Videos (TSGV) aims to ground a natural language sentence in an untrimmed video.
Recent studies have found that current benchmark datasets may have obvious moment annotation biases.
We introduce a new evaluation metric, "dR@n,IoU@m", which discounts the basic recall scores to alleviate the inflated evaluation caused by biased datasets.
arXiv Detail & Related papers (2022-03-10T08:58:18Z)
- Open-set Short Utterance Forensic Speaker Verification using Teacher-Student Network with Explicit Inductive Bias [59.788358876316295]
We propose a pipeline solution to improve speaker verification on a small, real-world forensic field dataset.
Leveraging large-scale out-of-domain datasets, we propose a knowledge-distillation-based objective function for teacher-student learning (a generic form is sketched after this entry).
We show that the proposed objective function can effectively improve the performance of teacher-student learning on short utterances.
arXiv Detail & Related papers (2020-09-21T00:58:40Z)
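The distillation objective itself is not spelled out in the summary above. A standard teacher-student loss for speaker embeddings is sketched below as one plausible form: the student sees short utterances and is pulled toward a frozen teacher's embeddings; the mix of cosine and MSE terms is an assumption, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def embedding_distillation_loss(student_emb, teacher_emb, alpha=0.5):
    """Generic teacher-student objective for speaker verification.

    Pulls the student's short-utterance embedding toward the teacher's
    embedding of the same (longer) source utterance.
    """
    teacher_emb = teacher_emb.detach()  # teacher is frozen during distillation
    cos = 1 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
    mse = F.mse_loss(student_emb, teacher_emb)
    return alpha * cos + (1 - alpha) * mse
```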