Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning
- URL: http://arxiv.org/abs/2507.01196v1
- Date: Tue, 01 Jul 2025 21:21:42 GMT
- Title: Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning
- Authors: Na Lee, Konstantinos Barmpas, Yannis Panagakis, Dimitrios Adamos, Nikolaos Laskaris, Stefanos Zafeiriou
- Abstract summary: We evaluate current Large Brainwave Foundation Models (LBMs) through systematic fine-tuning experiments. Our analysis shows that state-of-the-art LBMs achieve only marginal improvements (0.9%-1.2%) over traditional deep architectures.
- Score: 41.40603531008809
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Foundation Models have demonstrated significant success across various domains in Artificial Intelligence (AI), yet their capabilities for brainwave modeling remain unclear. In this paper, we comprehensively evaluate current Large Brainwave Foundation Models (LBMs) through systematic fine-tuning experiments across multiple Brain-Computer Interface (BCI) benchmark tasks, including memory tasks and sleep stage classification. Our extensive analysis shows that state-of-the-art LBMs achieve only marginal improvements (0.9%-1.2%) over traditional deep architectures while requiring significantly more parameters (millions vs thousands), raising important questions about their efficiency and applicability in BCI contexts. Moreover, through detailed ablation studies and Low-Rank Adaptation (LoRA), we significantly reduce trainable parameters without performance degradation, while demonstrating that architectural and training inefficiencies limit LBMs' current capabilities. Our experiments span both full model fine-tuning and parameter-efficient adaptation techniques, providing insights into optimal training strategies for BCI applications. We pioneer the application of LoRA to LBMs, revealing that performance benefits generally emerge when adapting multiple neural network components simultaneously. These findings highlight the critical need for domain-specific development strategies to advance LBMs, suggesting that current architectures may require redesign to fully leverage the potential of foundation models in brainwave analysis.
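To make the parameter-efficient adaptation concrete, below is a minimal PyTorch sketch of how LoRA adapters could be attached to the linear projections of a transformer-based LBM. This is an illustration under stated assumptions, not the authors' implementation: the module names (`q_proj`, `v_proj`, `fc1`, `fc2`), the `add_lora` helper, and `pretrained_lbm` are hypothetical, and the LBM architectures evaluated in the paper may differ.

```python
# Minimal LoRA sketch (assumption: the LBM is a transformer whose attention and
# FFN projections are nn.Linear layers; all names below are illustrative).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                      # keep pre-trained weights frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

def add_lora(module: nn.Module, target_names=("q_proj", "v_proj", "fc1", "fc2")) -> nn.Module:
    """Recursively replace selected nn.Linear layers with LoRA-wrapped versions."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear) and name in target_names:
            setattr(module, name, LoRALinear(child))
        else:
            add_lora(child, target_names)
    return module

# Usage with a hypothetical pre-trained LBM: only the LoRA matrices stay trainable.
# model = add_lora(pretrained_lbm)
# trainable = [p for p in model.parameters() if p.requires_grad]
```

Only the low-rank A/B matrices remain trainable, which is how the trainable-parameter count can drop sharply without touching the pre-trained backbone; targeting several component types at once (attention projections and FFN layers) mirrors the paper's finding that benefits tend to emerge when multiple network components are adapted simultaneously.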
Related papers
- LRM-1B: Towards Large Routing Model [26.18687224390521]
Vehicle routing problems (VRPs) are central to optimization with significant practical implications. Recent advancements in neural combinatorial optimization (NCO) have demonstrated promising results by leveraging neural networks to solve VRPs. This study introduces a Large Routing Model with 1 billion parameters (LRM-1B) designed to address diverse VRP scenarios.
arXiv Detail & Related papers (2025-07-04T05:10:20Z) - Dynamic Acoustic Model Architecture Optimization in Training for ASR [51.21112094223223]
DMAO is an architecture optimization framework that employs a grow-and-drop strategy to automatically reallocate parameters during training. We evaluate DMAO through experiments with CTC on the LibriSpeech, TED-LIUM-v2 and Switchboard datasets.
arXiv Detail & Related papers (2025-06-16T07:47:34Z) - Advancing Brainwave Modeling with a Codebook-Based Foundation Model [41.525984326072596]
We introduce LaBraM++, an enhanced Large Brainwave Foundation Model (LBM) that incorporates principled improvements grounded in robust signal processing foundations. LaBraM++ demonstrates substantial gains across a variety of tasks, consistently outperforming the architecture it builds upon and achieving competitive results when compared to other open-source LBMs.
arXiv Detail & Related papers (2025-05-22T14:32:56Z) - Evaluating Mathematical Reasoning Across Large Language Models: A Fine-Grained Approach [15.960271016276447]
We present a systematic evaluation of mathematical reasoning abilities across eight leading Large Language Models (LLMs). Our analyses reveal several key findings: DeepSeek-R1 performs competitively with o1 across most domains and achieves the highest accuracy on the MMLU Formal Logic benchmark. We explore how architectural choices, training paradigms, and optimization strategies contribute to variation in reasoning performance.
arXiv Detail & Related papers (2025-03-13T17:23:45Z) - DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs [70.91804882618243]
This paper proposes DSMoE, a novel approach that achieves sparsification by partitioning pre-trained FFN layers into computational blocks. We implement adaptive expert routing using sigmoid activation and straight-through estimators, enabling tokens to flexibly access different aspects of model knowledge. Experiments on LLaMA models demonstrate that under equivalent computational constraints, DSMoE achieves superior performance compared to existing pruning and MoE approaches.
arXiv Detail & Related papers (2025-02-18T02:37:26Z) - On Accelerating Edge AI: Optimizing Resource-Constrained Environments [1.7355861031903428]
Resource-constrained edge deployments demand AI solutions that balance high performance with stringent compute, memory, and energy limitations. We present a comprehensive overview of the primary strategies for accelerating deep learning models under such constraints.
arXiv Detail & Related papers (2025-01-25T01:37:03Z) - A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation. Deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency. This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z) - RedTest: Towards Measuring Redundancy in Deep Neural Networks Effectively [10.812755570974929]
We use the Model Structural Redundancy Score (MSRS) to measure the degree of redundancy in a deep learning model's structure.
MSRS is effective in both revealing and assessing the redundancy issues in many state-of-the-art models.
We design a novel redundancy-aware algorithm to guide the search for the optimal model structure.
arXiv Detail & Related papers (2024-11-15T14:36:07Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)