The 2025 PNPL Competition: Speech Detection and Phoneme Classification in the LibriBrain Dataset
- URL: http://arxiv.org/abs/2506.10165v1
- Date: Wed, 11 Jun 2025 20:34:33 GMT
- Title: The 2025 PNPL Competition: Speech Detection and Phoneme Classification in the LibriBrain Dataset
- Authors: Gilad Landau, Miran Özdogan, Gereon Elvers, Francesco Mantegna, Pratik Somaiya, Dulhan Jayalath, Luisa Kurth, Teyun Kwon, Brendan Shillingford, Greg Farquhar, Minqi Jiang, Karim Jerbi, Hamza Abdelhedi, Yorguin Mantilla Ramos, Caglar Gulcehre, Mark Woolrich, Natalie Voets, Oiwi Parker Jones
- Abstract summary: Speech decoding from non-invasive brain data holds potential for profound societal impact. The ultimate aim of the 2025 PNPL competition is to produce the conditions for an "ImageNet moment" in non-invasive neural decoding. We present the largest within-subject MEG dataset recorded to date (LibriBrain) together with a user-friendly Python library (pnpl). The competition features a Standard track that emphasises algorithmic innovation, as well as an Extended track that is expected to reward larger-scale computing.
- Score: 10.214825301231025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advance of speech decoding from non-invasive brain data holds the potential for profound societal impact. Among its most promising applications is the restoration of communication to paralysed individuals affected by speech deficits such as dysarthria, without the need for high-risk surgical interventions. The ultimate aim of the 2025 PNPL competition is to produce the conditions for an "ImageNet moment" or breakthrough in non-invasive neural decoding, by harnessing the collective power of the machine learning community. To facilitate this vision, we present the largest within-subject MEG dataset recorded to date (LibriBrain) together with a user-friendly Python library (pnpl) for easy data access and integration with deep learning frameworks. For the competition, we define two foundational tasks (i.e. Speech Detection and Phoneme Classification from brain data), complete with standardised data splits and evaluation metrics, illustrative benchmark models, online tutorial code, a community discussion board, and a public leaderboard for submissions. To promote accessibility and participation, the competition features a Standard track that emphasises algorithmic innovation, as well as an Extended track that is expected to reward larger-scale computing, accelerating progress toward a non-invasive brain-computer interface for speech.
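For orientation, here is a minimal sketch of how competition data might be loaded through pnpl into a standard PyTorch training loop. The dataset class name and constructor arguments are assumptions for illustration; the official tutorial code is the authoritative reference for the actual API.

```python
# Hedged sketch: loading LibriBrain data for the Speech Detection task.
# The dataset class name and constructor arguments are assumptions, not
# confirmed against the released pnpl API -- defer to the official tutorials.
from torch.utils.data import DataLoader

from pnpl.datasets import LibriBrainSpeech  # assumed import path

dataset = LibriBrainSpeech(data_path="./data", partition="train")  # assumed args
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for meg, labels in loader:
    # meg: a (batch, sensors, time) MEG window; labels: speech vs. silence
    break  # plug in a speech-detection model here
```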
Related papers
- LibriBrain: Over 50 Hours of Within-Subject MEG to Improve Speech Decoding Methods at Scale [2.225053366951265]
LibriBrain is the largest single-subject MEG dataset to date for speech decoding. This unprecedented "depth" of within-subject data enables exploration of neural representations at a scale previously unavailable with non-invasive methods.
arXiv Detail & Related papers (2025-06-02T17:59:41Z) - Few-shot Hate Speech Detection Based on the MindSpore Framework [2.6396343924017915]
We propose MS-Hate, a prompt-enhanced neural framework for few-shot hate speech detection implemented on the MindSpore deep learning platform. Experimental results on two benchmark datasets, HateXplain and HSOL, demonstrate that our approach outperforms competitive baselines in precision, recall, and F1-score. These findings highlight the potential of combining prompt-based learning with adversarial augmentation for robust and adaptable hate speech detection in few-shot scenarios.
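As a rough illustration of the prompt-based ingredient (generic Hugging Face Transformers, not the paper's MindSpore implementation), a cloze-prompt classifier can be sketched with a masked language model; the model, template, and label words below are illustrative assumptions:

```python
# Hedged sketch of prompt-based classification: wrap each input in a cloze
# template and let a masked LM score candidate label words.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def classify(text: str) -> str:
    # The masked LM fills the [MASK] slot; we compare scores of label words.
    prompt = f"{text} This sentence contains [MASK] speech."
    scores = {out["token_str"]: out["score"]
              for out in fill(prompt, targets=["hate", "normal"])}
    return max(scores, key=scores.get)

print(classify("I can't stand those people."))
```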
arXiv Detail & Related papers (2025-04-22T15:42:33Z) - Brain-to-Text Benchmark '24: Lessons Learned [30.41641771704316]
Speech brain-computer interfaces aim to decipher what a person is trying to say from neural activity alone. The Brain-to-Text Benchmark '24 fosters the advancement of decoding algorithms that convert neural activity to text. The benchmark will remain open indefinitely to support further work towards increasing the accuracy of brain-to-text algorithms.
arXiv Detail & Related papers (2024-12-23T02:44:35Z) - Hypergame Theory for Decentralized Resource Allocation in Multi-user Semantic Communications [60.63472821600567]
A novel framework for decentralized computing and communication resource allocation in multi-user semantic communication (SC) systems is proposed.
The challenge of efficiently allocating communication and computing resources is addressed through the application of Stackelberg hypergame theory.
Simulation results show that the proposed Stackelberg hypergame results in efficient usage of communication and computing resources.
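The leader-follower logic behind a Stackelberg formulation can be illustrated with a toy numeric sketch; the payoff functions and strategy grids below are invented for illustration and are not from the paper:

```python
# Toy sketch of Stackelberg (leader-follower) equilibrium search: the
# leader commits first, anticipating the follower's best response.
import numpy as np

leader_actions = np.linspace(0.1, 1.0, 10)    # e.g. bandwidth share offered
follower_actions = np.linspace(0.1, 1.0, 10)  # e.g. compute share requested

def follower_utility(l, f):
    return f * l - 0.5 * f ** 2  # toy concave utility

def leader_utility(l, f):
    return l * f - 0.3 * l ** 2  # toy concave utility

best = None
for l in leader_actions:
    # The follower best-responds to the leader's committed action.
    f_star = max(follower_actions, key=lambda f: follower_utility(l, f))
    u = leader_utility(l, f_star)
    if best is None or u > best[0]:
        best = (u, l, f_star)

print("Toy Stackelberg outcome (leader utility, l, f):", best)
```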
arXiv Detail & Related papers (2024-09-26T15:55:59Z) - Semantic Meta-Split Learning: A TinyML Scheme for Few-Shot Wireless Image Classification [50.28867343337997]
This work presents a TinyML-based semantic communication framework for few-shot wireless image classification.
We exploit split learning to limit the computations performed by end-users while preserving privacy.
Meta-learning overcomes data availability concerns and speeds up training by utilizing similarly trained tasks.
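A minimal sketch of the split-learning idea, assuming a simple client/server cut (layer sizes are illustrative, not the paper's architecture):

```python
# Minimal sketch of split learning: the device runs only the first layers
# and ships intermediate activations ("smashed data") to a server, which
# completes the forward and backward passes.
import torch
import torch.nn as nn

client_net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU())
server_net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))

x = torch.randn(8, 1, 28, 28)          # a batch of images on the device
smashed = client_net(x)                # activations sent over the network
logits = server_net(smashed)           # server finishes the forward pass
loss = nn.functional.cross_entropy(logits, torch.randint(0, 10, (8,)))
loss.backward()                        # gradients flow back across the cut
```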
arXiv Detail & Related papers (2024-09-03T05:56:55Z) - MindSpeech: Continuous Imagined Speech Decoding using High-Density fNIRS and Prompt Tuning for Advanced Human-AI Interaction [0.0]
This paper reports a novel method for human-AI interaction by developing a direct brain-AI interface.
We discuss a novel AI model, called MindSpeech, which enables open-vocabulary, continuous decoding for imagined speech.
We demonstrate significant improvements in key metrics, such as BLEU-1 and BERT P scores, for three out of four participants.
arXiv Detail & Related papers (2024-07-25T16:39:21Z) - Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder.
The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli.
Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z) - Sequential Best-Arm Identification with Application to Brain-Computer
Interface [34.87975833920409]
A brain-computer interface (BCI) is a technology that enables direct communication between the brain and an external device or computer system.
An electroencephalogram (EEG) and event-related potential (ERP)-based speller system is a type of BCI that allows users to spell words without using a physical keyboard.
We propose a sequential top-two Thompson sampling (STTS) algorithm under the fixed-confidence setting and the fixed-budget setting.
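For intuition, here is a hedged sketch of generic top-two Thompson sampling on Bernoulli arms; this is the textbook algorithm, not the paper's STTS variant or its ERP-speller setting:

```python
# Top-two Thompson sampling: sample a posterior "leader", then with some
# probability re-sample until a different arm ("challenger") tops the draw.
import numpy as np

rng = np.random.default_rng(0)
true_p = [0.3, 0.5, 0.7]                 # unknown arm means (toy values)
alpha, beta = np.ones(3), np.ones(3)     # Beta(1, 1) posteriors
keep_leader = 0.5                        # probability of playing the leader

for _ in range(2000):
    leader = int(np.argmax(rng.beta(alpha, beta)))
    if rng.random() < keep_leader:
        arm = leader
    else:
        arm = leader
        while arm == leader:             # re-sample until a challenger wins
            arm = int(np.argmax(rng.beta(alpha, beta)))
    reward = float(rng.random() < true_p[arm])
    alpha[arm] += reward
    beta[arm] += 1.0 - reward

print("Best-arm estimate:", int(np.argmax(alpha / (alpha + beta))))
```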
arXiv Detail & Related papers (2023-05-17T18:49:44Z) - data2vec: A General Framework for Self-supervised Learning in Speech,
Vision and Language [85.9019051663368]
data2vec is a framework that uses the same learning method for either speech, NLP or computer vision.
The core idea is to predict latent representations of the full input data based on a masked view of the input in a self-distillation setup.
Experiments on the major benchmarks of speech recognition, image classification, and natural language understanding demonstrate a new state of the art or competitive performance.
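The masked self-distillation recipe can be sketched conceptually; this is a toy, not the actual data2vec architecture, which uses Transformer encoders and averages latents over several top layers:

```python
# Conceptual toy of data2vec-style self-distillation: the student regresses
# an EMA teacher's latents for the full input from a masked view.
import copy
import torch
import torch.nn as nn

student = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 32))
teacher = copy.deepcopy(student)           # EMA copy, never trained by SGD
for p in teacher.parameters():
    p.requires_grad_(False)

x = torch.randn(4, 16)
mask = torch.rand_like(x) < 0.5
with torch.no_grad():
    target = teacher(x)                    # teacher sees the full input
pred = student(x.masked_fill(mask, 0.0))   # student sees the masked view
loss = nn.functional.mse_loss(pred, target)
loss.backward()

# After each optimizer step, the teacher tracks the student via EMA.
tau = 0.999
with torch.no_grad():
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.mul_(tau).add_(ps, alpha=1.0 - tau)
```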
arXiv Detail & Related papers (2022-02-07T22:52:11Z) - Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot
Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open vocabulary Electroencephalography (EEG)-To-Text Sequence-To-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
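For readers unfamiliar with the metric, BLEU-1 is unigram precision with a brevity penalty; a toy computation with made-up sentences (using NLTK):

```python
# Toy BLEU-1 computation to illustrate the metric reported above.
from nltk.translate.bleu_score import sentence_bleu

reference = ["the man was watching the sky".split()]
hypothesis = "a man watched the sky".split()
bleu1 = sentence_bleu(reference, hypothesis, weights=(1.0, 0, 0, 0))
print(f"BLEU-1 = {bleu1:.3f}")
```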
arXiv Detail & Related papers (2021-12-05T21:57:22Z) - Exploiting Unsupervised Data for Emotion Recognition in Conversations [76.01690906995286]
Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations.
The available supervised data for the ERC task is limited.
We propose a novel approach to leverage unsupervised conversation data.
arXiv Detail & Related papers (2020-10-02T13:28:47Z)