EEG-Language Modeling for Pathology Detection
- URL: http://arxiv.org/abs/2409.07480v3
- Date: Fri, 31 Jan 2025 13:16:12 GMT
- Title: EEG-Language Modeling for Pathology Detection
- Authors: Sam Gijsen, Kerstin Ritter,
- Abstract summary: This paper pioneers EEG-language models (ELMs) trained on clinical reports and 15000 EEGs.<n>We propose to combine multimodal alignment in this novel domain with timeseries cropping and text segmentation.<n>Our multimodal models significantly improve pathology detection compared to EEG-only models across four evaluations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multimodal language modeling has enabled breakthroughs for representation learning, yet remains unexplored in the realm of functional brain data for pathology detection. This paper pioneers EEG-language models (ELMs) trained on clinical reports and 15000 EEGs. We propose to combine multimodal alignment in this novel domain with timeseries cropping and text segmentation, enabling an extension based on multiple instance learning to alleviate misalignment between irrelevant EEG or text segments. Our multimodal models significantly improve pathology detection compared to EEG-only models across four evaluations and for the first time enable zero-shot classification as well as retrieval of both neural signals and reports. In sum, these results highlight the potential of ELMs, representing significant progress for clinical applications.
Related papers
- Large Cognition Model: Towards Pretrained EEG Foundation Model [0.0]
We propose a transformer-based foundation model designed to generalize across diverse EEG datasets and downstream tasks.
Our findings highlight the potential of pretrained EEG foundation models to accelerate advancements in neuroscience, personalized medicine, and BCI technology.
arXiv Detail & Related papers (2025-02-11T04:28:10Z) - A generative framework to bridge data-driven models and scientific theories in language neuroscience [84.76462599023802]
We present generative explanation-mediated validation, a framework for generating concise explanations of language selectivity in the brain.
We show that explanatory accuracy is closely related to the predictive power and stability of the underlying statistical models.
arXiv Detail & Related papers (2024-10-01T15:57:48Z) - FoME: A Foundation Model for EEG using Adaptive Temporal-Lateral Attention Scaling [19.85701025524892]
FoME (Foundation Model for EEG) is a novel approach using adaptive temporal-lateral attention scaling.
FoME is pre-trained on a diverse 1.7TB dataset of scalp and intracranial EEG recordings, comprising 745M parameters trained for 1,096k steps.
arXiv Detail & Related papers (2024-09-19T04:22:40Z) - ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation [49.42525661521625]
This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation.
It is tested over a wide range of EM images, covering five segmentation tasks and 10 datasets.
arXiv Detail & Related papers (2024-08-26T08:59:22Z) - TRRG: Towards Truthful Radiology Report Generation With Cross-modal Disease Clue Enhanced Large Language Model [22.305034251561835]
We propose a truthful radiology report generation framework, namely TRRG, based on stage-wise training for cross-modal disease clue injection into large language models.
Our proposed framework achieves state-of-the-art performance in radiology report generation on datasets such as IU-Xray and MIMIC-CXR.
arXiv Detail & Related papers (2024-08-22T05:52:27Z) - CXR-Agent: Vision-language models for chest X-ray interpretation with uncertainty aware radiology reporting [0.0]
We evaluate the publicly available, state of the art, foundational vision-language models for chest X-ray interpretation.
We find that vision-language models often hallucinate with confident language, which slows down clinical interpretation.
We develop an agent-based vision-language approach for report generation using CheXagent's linear probes and BioViL-T's phrase grounding tools.
arXiv Detail & Related papers (2024-07-11T18:39:19Z) - Potential of Multimodal Large Language Models for Data Mining of Medical Images and Free-text Reports [51.45762396192655]
Multimodal large language models (MLLMs) have recently transformed many domains, significantly affecting the medical field. Notably, Gemini-Vision-series (Gemini) and GPT-4-series (GPT-4) models have epitomized a paradigm shift in Artificial General Intelligence for computer vision.
This study evaluated the performance of the Gemini, GPT-4, and 4 popular large models for an exhaustive evaluation across 14 medical imaging datasets.
arXiv Detail & Related papers (2024-07-08T09:08:42Z) - Language Representations Can be What Recommenders Need: Findings and Potentials [57.90679739598295]
We show that item representations, when linearly mapped from advanced LM representations, yield superior recommendation performance.
This outcome suggests the possible homomorphism between the advanced language representation space and an effective item representation space for recommendation.
Our findings highlight the connection between language modeling and behavior modeling, which can inspire both natural language processing and recommender system communities.
arXiv Detail & Related papers (2024-07-07T17:05:24Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - CephGPT-4: An Interactive Multimodal Cephalometric Measurement and
Diagnostic System with Visual Large Language Model [4.64641334287597]
The CephGPT-4 model exhibits excellent performance and has the potential to revolutionize orthodontic measurement and diagnostic applications.
These innovations hold revolutionary application potential in the field of orthodontics.
arXiv Detail & Related papers (2023-07-01T15:41:12Z) - A Transformer-based representation-learning model with unified
processing of multimodal input for clinical diagnostics [63.106382317917344]
We report a Transformer-based representation-learning model as a clinical diagnostic aid that processes multimodal input in a unified manner.
The unified model outperformed an image-only model and non-unified multimodal diagnosis models in the identification of pulmonary diseases.
arXiv Detail & Related papers (2023-06-01T16:23:47Z) - Language Models are Few-shot Learners for Prognostic Prediction [0.4254099382808599]
We explore the use of transformers and language models in prognostic prediction for immunotherapy using real-world patients' clinical data and molecular profiles.
The study benchmarks the efficacy of baselines and language models on prognostic prediction across multiple cancer types and investigates the impact of different pretrained language models under few-shot regimes.
arXiv Detail & Related papers (2023-02-24T15:35:36Z) - fMRI from EEG is only Deep Learning away: the use of interpretable DL to
unravel EEG-fMRI relationships [68.8204255655161]
We present an interpretable domain grounded solution to recover the activity of several subcortical regions from multichannel EEG data.
We recover individual spatial and time-frequency patterns of scalp EEG predictive of the hemodynamic signal in the subcortical nuclei.
arXiv Detail & Related papers (2022-10-23T15:11:37Z) - Making the Most of Text Semantics to Improve Biomedical Vision--Language
Processing [17.96645738679543]
We show that textual semantic modelling can substantially improve contrastive learning in self-supervised vision--language processing.
We propose a self-supervised joint vision--language approach with a focus on better text modelling.
arXiv Detail & Related papers (2022-04-21T00:04:35Z) - CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark [51.38557174322772]
We present the first Chinese Biomedical Language Understanding Evaluation benchmark.
It is a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, single-sentence/sentence-pair classification.
We report empirical results with the current 11 pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform by far worse than the human ceiling.
arXiv Detail & Related papers (2021-06-15T12:25:30Z) - Neural Language Models with Distant Supervision to Identify Major
Depressive Disorder from Clinical Notes [2.1060613825447407]
Major depressive disorder (MDD) is a prevalent psychiatric disorder that is associated with significant healthcare burden worldwide.
Recent advancements in neural language models, such as Bidirectional Representations for Transformers (BERT) model, resulted in state-of-the-art neural language models.
We propose to leverage the neural language models in a distant supervision paradigm to identify MDD phenotypes from clinical notes.
arXiv Detail & Related papers (2021-04-19T21:11:41Z) - Correlation based Multi-phasal models for improved imagined speech EEG
recognition [22.196642357767338]
This work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding to specific speech units.
A bi-phase common representation learning module using neural networks is designed to model the correlation and between an analysis phase and a support phase.
The proposed approach further handles the non-availability of multi-phasal data during decoding.
arXiv Detail & Related papers (2020-11-04T09:39:53Z) - A Novel Transferability Attention Neural Network Model for EEG Emotion
Recognition [51.203579838210885]
We propose a transferable attention neural network (TANN) for EEG emotion recognition.
TANN learns the emotional discriminative information by highlighting the transferable EEG brain regions data and samples adaptively.
This can be implemented by measuring the outputs of multiple brain-region-level discriminators and one single sample-level discriminator.
arXiv Detail & Related papers (2020-09-21T02:42:30Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - sEMG Gesture Recognition with a Simple Model of Attention [0.0]
We present our research in surface electromyography (sEMG) signal classification.
Our novel attention-based model achieves benchmark leading results on multiple industry-standard datasets.
Our results indicate that sEMG represents a promising avenue for future machine learning research.
arXiv Detail & Related papers (2020-06-05T19:28:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.