Correlation based Multi-phasal models for improved imagined speech EEG recognition
- URL: http://arxiv.org/abs/2011.02195v1
- Date: Wed, 4 Nov 2020 09:39:53 GMT
- Title: Correlation based Multi-phasal models for improved imagined speech EEG recognition
- Authors: Rini A Sharon, Hema A Murthy
- Abstract summary: This work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding to specific speech units.
A bi-phase common representation learning module using neural networks is designed to model the correlation and reproducibility between an analysis phase and a support phase.
The proposed approach further handles the non-availability of multi-phasal data during decoding.
- Score: 22.196642357767338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Translation of imagined speech electroencephalogram (EEG) into human
understandable commands greatly facilitates the design of naturalistic brain
computer interfaces. To achieve improved imagined speech unit classification,
this work aims to profit from the parallel information contained in
multi-phasal EEG data recorded while speaking, imagining and performing
articulatory movements corresponding to specific speech units. A bi-phase
common representation learning module using neural networks is designed to
model the correlation and reproducibility between an analysis phase and a
support phase. The trained Correlation Network is then employed to extract
discriminative features of the analysis phase. These features are further
classified into five binary phonological categories using machine learning
models such as Gaussian mixture based hidden Markov model and deep neural
networks. The proposed approach further handles the non-availability of
multi-phasal data during decoding. Topographic visualizations along with
result-based inferences suggest that the multi-phasal correlation modelling
approach proposed in the paper enhances imagined-speech EEG recognition
performance.
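To make the bi-phase idea concrete, below is a minimal PyTorch sketch, not the authors' implementation: two phase-specific encoders are trained so that paired analysis-phase and support-phase EEG features become maximally correlated in a shared latent space. All layer sizes and the exact correlation objective are assumptions.

```python
import torch
import torch.nn as nn

class PhaseEncoder(nn.Module):
    """Maps one phase's EEG feature vector into a shared latent space."""
    def __init__(self, in_dim, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

def correlation_loss(za, zs, eps=1e-8):
    """Negative mean per-dimension Pearson correlation between
    analysis-phase (za) and support-phase (zs) embeddings."""
    za = za - za.mean(dim=0, keepdim=True)
    zs = zs - zs.mean(dim=0, keepdim=True)
    corr = (za * zs).sum(dim=0) / (za.norm(dim=0) * zs.norm(dim=0) + eps)
    return -corr.mean()

# Toy training step on random stand-in features (batch, feature_dim).
analysis_enc, support_enc = PhaseEncoder(64), PhaseEncoder(64)
opt = torch.optim.Adam(
    list(analysis_enc.parameters()) + list(support_enc.parameters()), lr=1e-3)
xa, xs = torch.randn(16, 64), torch.randn(16, 64)  # paired phase features
loss = correlation_loss(analysis_enc(xa), support_enc(xs))
opt.zero_grad(); loss.backward(); opt.step()

# After training, only the analysis-phase encoder is needed to extract
# features, which is how decoding can proceed without support-phase data.
```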
Related papers
- EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models [0.0]
We propose an ensemble learning framework for electroencephalogram-based overt speech classification.
The ensemble comprises three models with kernel sizes of 51, 101, and 201.
Results indicate that the proposed ensemble-based approach significantly outperforms individual models and existing state-of-the-art techniques.
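A minimal sketch of the multi-kernel ensemble, assuming 1-D convolutional branches over multichannel EEG; the kernel sizes (51, 101, 201) come from the abstract, while channel counts, pooling, and the diffusion component are omitted assumptions.

```python
import torch
import torch.nn as nn

class ConvBranch(nn.Module):
    """1-D CNN over multichannel EEG; the kernel size sets its temporal scale."""
    def __init__(self, n_channels, n_classes, kernel_size):
        super().__init__()
        self.conv = nn.Conv1d(n_channels, 32, kernel_size,
                              padding=kernel_size // 2)
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):                          # x: (batch, channels, time)
        h = torch.relu(self.conv(x)).mean(dim=-1)  # global average pooling
        return self.head(h)

branches = nn.ModuleList(
    ConvBranch(n_channels=64, n_classes=5, kernel_size=k)
    for k in (51, 101, 201))                       # kernel sizes from the abstract

eeg = torch.randn(8, 64, 1000)                     # (batch, channels, samples)
logits = torch.stack([b(eeg) for b in branches]).mean(dim=0)  # ensemble average
```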
arXiv Detail & Related papers (2024-11-14T09:23:58Z)
- NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention [47.8479647938849]
We present NeuroSpex, a neuro-guided speaker extraction model that uses the listener's EEG response as the sole auxiliary reference cue.
We propose a novel EEG signal encoder that captures the listener's attentional information, together with a cross-attention (CA) mechanism that enhances the speech feature representations.
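A hedged sketch of such a cross-attention step, here with speech features as queries and the EEG embedding as keys/values so the speech representation is conditioned on the neural cue; the dimensions and the query/key assignment are assumptions, not NeuroSpex's actual design.

```python
import torch
import torch.nn as nn

d_model = 128  # hypothetical shared embedding size
attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

speech = torch.randn(2, 200, d_model)  # mixture speech features (B, T, D)
eeg = torch.randn(2, 50, d_model)      # listener EEG embedding (B, T', D)

# Speech queries attend to the EEG cue, biasing the representation
# toward the attended speaker.
enhanced, _ = attn(query=speech, key=eeg, value=eeg)  # (B, 200, d_model)
```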
arXiv Detail & Related papers (2024-09-04T07:33:01Z)
- A Lesion-aware Edge-based Graph Neural Network for Predicting Language Ability in Patients with Post-stroke Aphasia [12.129896943547912]
We propose a lesion-aware graph neural network (LEGNet) to predict language ability from resting-state fMRI (rs-fMRI) connectivity in patients with post-stroke aphasia.
Our model integrates three components: an edge-based learning module that encodes functional connectivity between brain regions, a lesion encoding module, and a subgraph learning module.
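A toy illustration of one edge-weighted message-passing step with lesion gating, loosely in the spirit of these three modules; region counts, feature sizes, and the gating form are all assumptions rather than LEGNet's actual design.

```python
import torch
import torch.nn as nn

n_regions, feat = 90, 16
conn = torch.rand(n_regions, n_regions)  # rs-fMRI connectivity (edge weights)
lesion = torch.rand(n_regions, 1)        # fraction of each region lesioned
x = torch.randn(n_regions, feat)         # node (brain region) features

lin = nn.Linear(feat, feat)
# Messages flow along connectivity edges; lesioned regions contribute less.
x_new = torch.relu(conn @ lin(x)) * (1.0 - lesion)
readout = x_new.mean(dim=0)              # graph summary for predicting language ability
```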
arXiv Detail & Related papers (2024-09-03T21:28:48Z)
- Investigating the Timescales of Language Processing with EEG and Language Models [0.0]
This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data.
Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers.
Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
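A TRF of this kind is typically estimated by time-lagged ridge regression; a minimal sketch on toy data (the lag range and sampling rate are illustrative):

```python
import numpy as np
from sklearn.linear_model import Ridge

def lagged_design(stim, lags):
    """Stack time-lagged copies of a stimulus feature (T,) into (T, n_lags)."""
    T = len(stim)
    X = np.zeros((T, len(lags)))
    for j, lag in enumerate(lags):
        X[lag:, j] = stim[:T - lag]
    return X

# Toy data: one model-derived word feature and one EEG channel at 100 Hz.
rng = np.random.default_rng(0)
stim = rng.standard_normal(5000)
eeg = np.convolve(stim, [0.0, 0.5, 1.0, 0.5, 0.0], mode="same") \
      + 0.1 * rng.standard_normal(5000)

lags = range(0, 40)  # 0-400 ms of lags at 100 Hz
trf = Ridge(alpha=1.0).fit(lagged_design(stim, lags), eeg)
# trf.coef_ is the estimated temporal response function over the lags.
```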
arXiv Detail & Related papers (2024-06-28T12:49:27Z)
- Brain-Driven Representation Learning Based on Diffusion Model [25.375490061512]
We explore denoising diffusion probabilistic models (DDPMs) as a means of learning representations from speech-related EEG.
Used in conjunction with a conditional autoencoder, our new approach considerably outperforms traditional machine learning algorithms.
Our results highlight the potential of DDPMs as a sophisticated computational method for the analysis of speech-related EEG signals.
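A minimal sketch of the DDPM forward (noising) process on EEG feature vectors, which is the corruption a denoising network is trained to invert; the noise schedule and dimensions are assumptions.

```python
import torch

T = 1000                                  # diffusion steps
betas = torch.linspace(1e-4, 0.02, T)     # linear noise schedule
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

x0 = torch.randn(8, 64)                   # clean EEG feature vectors
t = torch.randint(0, T, (8,))             # random timestep per sample
noise = torch.randn_like(x0)
xt = (alpha_bar[t].sqrt()[:, None] * x0
      + (1 - alpha_bar[t]).sqrt()[:, None] * noise)
# A denoiser network would be trained to predict `noise` from (xt, t).
```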
arXiv Detail & Related papers (2023-11-14T05:59:58Z)
- Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match vs. Mismatch Classification [28.186129896907694]
We propose a "match-vs-mismatch" deep learning model to classify whether a video clip induces excitatory responses in recorded EEG signals.
We demonstrate that the proposed model is able to achieve the highest accuracy on unseen subjects.
These results have the potential to facilitate the development of neural recording-based video reconstruction.
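A toy version of a match-vs-mismatch head, with separate projections for EEG and video features and a binary classifier over the pair; the architecture is an assumption, not the paper's model.

```python
import torch
import torch.nn as nn

class MatchMismatch(nn.Module):
    """Classify whether an (EEG, video) feature pair is temporally aligned."""
    def __init__(self, eeg_dim=64, vid_dim=512, hid=128):
        super().__init__()
        self.eeg = nn.Linear(eeg_dim, hid)
        self.vid = nn.Linear(vid_dim, hid)
        self.cls = nn.Linear(2 * hid, 2)   # logits: match vs. mismatch

    def forward(self, eeg, vid):
        return self.cls(torch.cat([self.eeg(eeg), self.vid(vid)], dim=-1))

logits = MatchMismatch()(torch.randn(4, 64), torch.randn(4, 512))
```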
arXiv Detail & Related papers (2023-09-08T06:37:25Z)
- Self-supervised models of audio effectively explain human cortical responses to speech [71.57870452667369]
We capitalize on the progress of self-supervised speech representation learning to create new state-of-the-art models of the human auditory system.
We show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in human cortex.
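Claims like this usually rest on encoding models: cross-validated ridge regression from a model layer's features to each cortical response. A sketch with random stand-in data:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feats = rng.standard_normal((2000, 768))   # stand-in for one transformer layer
resp = feats @ rng.standard_normal(768) * 0.01 + rng.standard_normal(2000)

Xtr, Xte, ytr, yte = train_test_split(feats, resp, random_state=0)
enc = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(Xtr, ytr)
print("held-out R^2:", enc.score(Xte, yte))  # layer-wise scores trace the hierarchy
```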
arXiv Detail & Related papers (2022-05-27T22:04:02Z)
- Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition [48.56414496900755]
This work uses a neural implementation of convolutive sparse matrix factorization to decompose the articulatory data into interpretable gestures and gestural scores.
Phoneme recognition experiments were additionally performed to show that the gestural scores successfully encode phonological information.
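A toy convolutive factorization by gradient descent: articulatory data is approximated by convolving nonnegative gestural scores with gesture kernels. The paper's neural implementation adds sparsity and interpretability constraints this sketch largely omits.

```python
import torch
import torch.nn.functional as F

C, T, G, K = 12, 400, 8, 25               # channels, frames, gestures, kernel length
X = torch.rand(1, C, T)                    # articulatory trajectories

W = torch.randn(C, G, K, requires_grad=True)   # gesture templates
H = torch.randn(1, G, T, requires_grad=True)   # gestural scores (pre-activation)
opt = torch.optim.Adam([W, H], lr=1e-2)
for _ in range(200):
    X_hat = F.conv1d(H.relu(), W, padding=K // 2)              # convolutive reconstruction
    loss = ((X - X_hat) ** 2).mean() + 1e-3 * H.relu().mean()  # fit + sparsity
    opt.zero_grad(); loss.backward(); opt.step()
```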
arXiv Detail & Related papers (2022-04-01T14:25:19Z)
- Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem [65.25725367771075]
This study demonstrates, for the first time, that the synthesis-based approach can also perform well on this problem.
Specifically, we propose a novel speech separation/enhancement model based on the recognition of discrete symbols.
Once the discrete symbol sequence has been predicted, each target speech signal can be re-synthesized by feeding the symbols to the synthesis model.
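A toy stand-in for the discretize-then-resynthesize pipeline using k-means: frame features are quantized into a discrete symbol sequence and then reconstructed from codebook entries. The actual paper predicts the symbols and uses a learned synthesis model rather than centroid lookup.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
frames = rng.standard_normal((1000, 20))   # (frames, feature_dim)

km = KMeans(n_clusters=64, n_init=4, random_state=0).fit(frames)
symbols = km.predict(frames)               # discrete symbol sequence
resynth = km.cluster_centers_[symbols]     # reconstruction from the symbols
```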
arXiv Detail & Related papers (2021-12-17T08:35:40Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
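A minimal SpecAugment-style masking function of the kind typically used for spectrogram augmentation; the mask sizes here are guesses, not the paper's settings.

```python
import numpy as np

def spec_augment(spec, max_f=8, max_t=20, rng=np.random.default_rng()):
    """Zero one random frequency band and one random time span
    of a (freq, time) spectrogram."""
    spec = spec.copy()
    f0 = rng.integers(0, spec.shape[0] - max_f)
    t0 = rng.integers(0, spec.shape[1] - max_t)
    spec[f0:f0 + rng.integers(1, max_f + 1), :] = 0.0   # frequency mask
    spec[:, t0:t0 + rng.integers(1, max_t + 1)] = 0.0   # time mask
    return spec

augmented = spec_augment(np.random.rand(128, 300))
```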
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Multi-modal Automated Speech Scoring using Attention Fusion [46.94442359735952]
We propose a novel multi-modal end-to-end neural approach for automated assessment of non-native English speakers' spontaneous speech.
We employ Bi-directional Recurrent Convolutional Neural Networks and Bi-directional Long Short-Term Memory Neural Networks to encode acoustic and lexical cues from spectrograms and transcriptions.
We find combined attention to both lexical and acoustic cues significantly improves the overall performance of the system.
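A compact sketch of attention fusion over BiLSTM-encoded acoustic and lexical streams, where each modality attends over the other before a pooled regression head; all dimensions and the fusion form are assumptions.

```python
import torch
import torch.nn as nn

class MultiModalScorer(nn.Module):
    """BiLSTM encoders for acoustic and lexical inputs, fused by cross-attention."""
    def __init__(self, ac_dim=40, lx_dim=300, hid=64):
        super().__init__()
        self.ac = nn.LSTM(ac_dim, hid, bidirectional=True, batch_first=True)
        self.lx = nn.LSTM(lx_dim, hid, bidirectional=True, batch_first=True)
        self.attn = nn.MultiheadAttention(2 * hid, num_heads=4, batch_first=True)
        self.score = nn.Linear(4 * hid, 1)

    def forward(self, ac, lx):
        ha, _ = self.ac(ac)                 # (B, Ta, 2*hid) spectrogram frames
        hl, _ = self.lx(lx)                 # (B, Tl, 2*hid) transcription tokens
        a2l, _ = self.attn(ha, hl, hl)      # acoustic attends to lexical cues
        l2a, _ = self.attn(hl, ha, ha)      # lexical attends to acoustic cues
        fused = torch.cat([a2l.mean(dim=1), l2a.mean(dim=1)], dim=-1)
        return self.score(fused)            # proficiency score (regression)

score = MultiModalScorer()(torch.randn(2, 500, 40), torch.randn(2, 60, 300))
```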
arXiv Detail & Related papers (2020-05-17T07:53:15Z)