Adaptive Split-MMD Training for Small-Sample Cross-Dataset P300 EEG Classification
- URL: http://arxiv.org/abs/2510.21969v1
- Date: Fri, 24 Oct 2025 18:48:21 GMT
- Title: Adaptive Split-MMD Training for Small-Sample Cross-Dataset P300 EEG Classification
- Authors: Weiyu Chen, Arnaud Delorme
- Abstract summary: Cross-dataset shift occurs when trying to boost a small target set with a large source dataset. We introduce Adaptive Split Maximum Mean Discrepancy Training (AS-MMD). AS-MMD combines a target-weighted loss with a warm-up tied to the square root of the source/target size ratio. It outperforms target-only and pooled training.
- Score: 12.103074826558531
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting single-trial P300 from EEG is difficult when only a few labeled trials are available, and boosting a small target set with a large source dataset through transfer learning introduces cross-dataset shift. To address this challenge, we study transfer between two public visual-oddball ERP datasets using five shared electrodes (Fz, Pz, P3, P4, Oz) under a strict small-sample regime (target: 10 trials/subject; source: 80 trials/subject). We introduce Adaptive Split Maximum Mean Discrepancy Training (AS-MMD), which combines (i) a target-weighted loss with a warm-up tied to the square root of the source/target size ratio, (ii) Split Batch Normalization (Split-BN) with shared affine parameters and per-domain running statistics, and (iii) a parameter-free, logit-level Radial Basis Function kernel Maximum Mean Discrepancy (RBF-MMD) term using the median-bandwidth heuristic. Implemented on an EEG Conformer, AS-MMD is backbone-agnostic and leaves the inference-time model unchanged. Across both transfer directions, it outperforms target-only and pooled training (Active Visual Oddball: accuracy/AUC 0.66/0.74; ERP CORE P3: 0.61/0.65), with gains over pooling significant under corrected paired t-tests. Ablations attribute the improvements to all three components.
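The three ingredients above are concrete enough to sketch. Below is a minimal, hedged PyTorch sketch of how they could fit together; the toy linear backbone, batch construction, warm-up schedule, and MMD weight are illustrative assumptions, not the paper's EEG Conformer setup or tuned hyper-parameters.
```python
# Minimal sketch of the three AS-MMD ingredients named in the abstract.
# All sizes, schedules, and weights below are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

def rbf_mmd2(x, y):
    """Parameter-free RBF-kernel MMD^2 between two logit batches,
    with the median heuristic for the bandwidth."""
    z = torch.cat([x, y], dim=0)
    d2 = torch.cdist(z, z).pow(2)                        # pairwise squared distances
    off_diag = d2[~torch.eye(len(z), dtype=torch.bool)]  # drop self-distances
    sigma2 = off_diag.median().clamp_min(1e-12)          # median-bandwidth heuristic
    k = torch.exp(-d2 / (2 * sigma2))
    n = len(x)
    kxx, kyy, kxy = k[:n, :n], k[n:, n:], k[:n, n:]
    return kxx.mean() + kyy.mean() - 2 * kxy.mean()

class SplitBN(nn.Module):
    """Split-BN: per-domain running statistics, shared affine parameters."""
    def __init__(self, dim):
        super().__init__()
        self.bn_src = nn.BatchNorm1d(dim)
        self.bn_tgt = nn.BatchNorm1d(dim)
        self.bn_tgt.weight = self.bn_src.weight          # tie gamma
        self.bn_tgt.bias = self.bn_src.bias              # tie beta
    def forward(self, x, domain):
        return self.bn_src(x) if domain == "src" else self.bn_tgt(x)

# Sizes mirroring the abstract's regime: 80 source / 10 target trials per
# subject, so the target weight warms up toward sqrt(80/10) (assumed reading).
n_src, n_tgt, dim, n_cls = 80, 10, 5, 2
net, bn, head = nn.Sequential(nn.Linear(dim, 16), nn.ReLU()), SplitBN(16), nn.Linear(16, n_cls)
opt = torch.optim.Adam(list(net.parameters()) + list(bn.parameters())
                       + list(head.parameters()), lr=1e-3)

xs, ys = torch.randn(n_src, dim), torch.randint(0, n_cls, (n_src,))
xt, yt = torch.randn(n_tgt, dim), torch.randint(0, n_cls, (n_tgt,))

w_max, warmup, lam = (n_src / n_tgt) ** 0.5, 100, 1.0    # assumed hyper-parameters
for step in range(200):
    w_tgt = 1.0 + (w_max - 1.0) * min(step / warmup, 1.0)  # linear warm-up
    logit_s = head(bn(net(xs), "src"))
    logit_t = head(bn(net(xt), "tgt"))
    loss = (F.cross_entropy(logit_s, ys)
            + w_tgt * F.cross_entropy(logit_t, yt)
            + lam * rbf_mmd2(logit_s, logit_t))            # logit-level MMD
    opt.zero_grad(); loss.backward(); opt.step()
```
One way to read the abstract's "inference-time model unchanged" claim is that at test time only the target-domain statistics branch is used, so Split-BN adds no deployed parameters beyond a standard BN layer.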
Related papers
- Do Generative Metrics Predict YOLO Performance? An Evaluation Across Models, Augmentation Ratios, and Dataset Complexity [43.338311770275745]
We present a controlled evaluation of synthetic augmentation for YOLOv11 across three single-class detection regimes.
We benchmark six GAN-, diffusion-, and hybrid-based generators over augmentation ratios from 10% to 150% of the real training split.
For each dataset-generator-augmentation configuration, we compute pre-training dataset metrics under a matched-size bootstrap protocol.
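As a concrete reading of a matched-size bootstrap protocol, the sketch below re-estimates a distribution metric on equal-size random subsets of the real and synthetic sets; the feature arrays and the mean-gap metric are toy stand-ins, not the paper's generative metrics.
```python
# Matched-size bootstrap: resample both sets to a common size before computing
# the metric, repeating to estimate its mean and spread. Toy data and metric.
import numpy as np

def matched_size_bootstrap(real, synth, metric, n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    size = min(len(real), len(synth))            # matched subset size
    vals = []
    for _ in range(n_boot):
        r = real[rng.choice(len(real), size, replace=False)]
        s = synth[rng.choice(len(synth), size, replace=False)]
        vals.append(metric(r, s))
    return float(np.mean(vals)), float(np.std(vals))

# toy metric: distance between feature means (lower = more similar)
mean_gap = lambda r, s: float(np.linalg.norm(r.mean(0) - s.mean(0)))
real = np.random.default_rng(1).normal(0.0, 1.0, (500, 64))
synth = np.random.default_rng(2).normal(0.1, 1.0, (800, 64))
print(matched_size_bootstrap(real, synth, mean_gap))
```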
arXiv Detail & Related papers (2026-02-20T03:02:36Z)
- Diffusion Language Models are Super Data Learners [61.721441061210896]
When unique data is limited, diffusion language models (DLMs) consistently surpass autoregressive (AR) models by training for more epochs.
We attribute the gains to three compounding factors: (1) any-order modeling, (2) super-dense compute from iterative bidirectional denoising, and (3) built-in Monte Carlo augmentation.
arXiv Detail & Related papers (2025-11-05T08:17:42Z)
- T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis [15.624549727053475]
Existing model-merging techniques fail to deliver consistent gains across diverse medical modalities.
We introduce Test-Time Task adaptive merging (T3), a backpropagation-free framework that computes per-sample coefficients.
We present a rigorous cross-evaluation protocol spanning in-domain, base-to-novel, and corruption settings across four modalities.
arXiv Detail & Related papers (2025-10-31T08:05:40Z)
- Ensemble Threshold Calibration for Stable Sensitivity Control [0.0]
We present an end-to-end framework that achieves exact recall with sub-percent variance over tens of millions of geometry pairs.
Our approach consistently hits a recall target within a small error, decreases redundant verifications relative to other calibrations, and runs end-to-end on a single TPU v3 core.
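The summary suggests a quantile-style calibration. A minimal sketch, assuming the threshold is calibrated on held-out positive scores with bootstrap averaging to stabilize the estimate; the paper's actual procedure and geometry pipeline are not reproduced here.
```python
# Calibrate a decision threshold to hit a recall (sensitivity) target, then
# average over bootstrap resamples to reduce variance. Names are illustrative.
import numpy as np

def calibrated_threshold(pos_scores, recall_target=0.95, n_boot=100, seed=0):
    """Return a score threshold whose recall on held-out positives ~= target."""
    rng = np.random.default_rng(seed)
    thresholds = []
    for _ in range(n_boot):
        sample = rng.choice(pos_scores, size=len(pos_scores), replace=True)
        # the (1 - recall) quantile of positive scores leaves ~recall above it
        thresholds.append(np.quantile(sample, 1.0 - recall_target))
    return float(np.mean(thresholds))    # ensemble average lowers variance

scores_pos = np.random.default_rng(1).normal(1.0, 0.5, size=10_000)
tau = calibrated_threshold(scores_pos, recall_target=0.95)
print("threshold:", tau, "achieved recall:", (scores_pos >= tau).mean())
```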
arXiv Detail & Related papers (2025-10-02T15:22:28Z)
- Multidimensional Bayesian Active Machine Learning of Working Memory Task Performance [4.8878998002743606]
We show a validation of a Bayesian, two-axis, active-classification approach for a working-memory reconstruction task.
In a young adult population, we compare GP-driven Adaptive Mode (AM) with a traditional adaptive staircase Classic Mode (CM).
AM estimates converge more quickly than other sampling strategies, demonstrating that only about 30 samples are required for accurate fitting of the full model.
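A minimal sketch of what GP-driven active classification can look like on a two-axis stimulus grid, assuming uncertainty sampling with scikit-learn's GaussianProcessClassifier; the task, grid, and oracle below are synthetic stand-ins for the working-memory experiment.
```python
# Fit a GP classifier to responses so far and query the grid point whose
# predicted probability is closest to 0.5 (most ambiguous stimulus).
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

grid = np.stack(np.meshgrid(np.linspace(0, 1, 25),
                            np.linspace(0, 1, 25)), axis=-1).reshape(-1, 2)
oracle = lambda pts: (pts[:, 0] + pts[:, 1] > 1.0).astype(int)  # hidden boundary

# small seed set chosen so both response classes are present
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.5, 0.5]])
y = oracle(X)
for _ in range(25):                              # ~30 samples total
    gp = GaussianProcessClassifier().fit(X, y)
    p = gp.predict_proba(grid)[:, 1]
    x_next = grid[np.argmin(np.abs(p - 0.5))]    # maximal-uncertainty query
    X = np.vstack([X, x_next])
    y = np.append(y, oracle(x_next[None, :]))
```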
arXiv Detail & Related papers (2025-10-01T00:48:14Z)
- APML: Adaptive Probabilistic Matching Loss for Robust 3D Point Cloud Reconstruction [16.82777427285544]
Training deep learning models for point cloud prediction tasks depends critically on loss functions that measure discrepancies between predicted and ground-truth point sets.
We propose Adaptive Probabilistic Matching Loss (APML), a fully differentiable approximation of one-to-one matching.
We analytically compute the temperature to guarantee a minimum probability, eliminating manual tuning.
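One simple way to realize a differentiable soft matching with a closed-form temperature bound is sketched below; the specific bound and `p_min` handling are assumptions, not necessarily APML's exact derivation.
```python
# Soft assignments over pairwise distances, with a per-row temperature chosen
# in closed form so the nearest neighbour receives at least probability p_min.
import math
import torch

def soft_matching_loss(pred, gt, p_min=0.9):
    """pred: (n, 3) predicted points, gt: (m, 3) ground-truth points."""
    d = torch.cdist(pred, gt)                    # (n, m) pairwise distances
    d1, d2 = d.topk(2, dim=1, largest=False).values.unbind(1)
    m = gt.shape[0]
    # if every non-nearest column sat at gap (d2 - d1), the nearest neighbour's
    # softmax weight is >= p_min whenever
    # tau <= gap / log((m - 1) * p_min / (1 - p_min))
    tau = (d2 - d1).clamp_min(1e-9) / math.log((m - 1) * p_min / (1.0 - p_min))
    w = torch.softmax(-d / tau.unsqueeze(1), dim=1)  # soft assignment weights
    return (w * d).sum(dim=1).mean()                 # expected matched distance

pred = torch.randn(128, 3, requires_grad=True)
loss = soft_matching_loss(pred, torch.randn(100, 3))
loss.backward()                                      # differentiable end to end
```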
arXiv Detail & Related papers (2025-09-09T19:31:06Z)
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices.
We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD).
The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
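For reference, a minimal sketch of the group DRO objective being critiqued: exponentiated group weights that up-weight the worst-performing group. The toy model and group labels are illustrative only.
```python
# Group DRO: minimize a softly-maximized worst-group loss.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(10, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
q = torch.ones(2) / 2                    # weights over 2 groups
eta = 0.1                                # group-weight step size

x = torch.randn(256, 10)
y = torch.randint(0, 2, (256,))
g = torch.randint(0, 2, (256,))          # group id (e.g. a spurious attribute)
for _ in range(100):
    losses = F.cross_entropy(model(x), y, reduction="none")
    group_loss = torch.stack([losses[g == k].mean() for k in range(2)])
    q = q * torch.exp(eta * group_loss.detach())   # up-weight the worst group
    q = q / q.sum()
    (q * group_loss).sum().backward()
    opt.step(); opt.zero_grad()
```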
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Inception Convolution with Efficient Dilation Search [121.41030859447487]
Dilated convolution is a critical variant of the standard convolutional neural network for controlling effective receptive fields and handling the large scale variance of objects.
We propose a new variant of dilated convolution, namely inception (dilated) convolution, in which the convolutions have independent dilations among different axes, channels, and layers.
To fit the complex inception convolution to the data in practice, we develop a simple yet effective dilation search algorithm (EDO) based on statistical optimization.
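Per-axis dilation is directly expressible in PyTorch. Below is a hedged sketch of an inception-style dilated layer, with channel groups carrying independent (height, width) dilations; the dilation choices are illustrative, whereas in the paper they come from the EDO search, which is not reproduced here.
```python
# Channel groups with independent per-axis dilations in one layer.
import torch
import torch.nn as nn

class InceptionDilatedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, dilations=((1, 1), (1, 2), (2, 1), (2, 2))):
        super().__init__()
        assert out_ch % len(dilations) == 0
        split = out_ch // len(dilations)
        # one branch per (dilation_h, dilation_w); padding keeps spatial size
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, split, kernel_size=3, dilation=d, padding=d)
            for d in dilations)
    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

x = torch.randn(2, 16, 32, 32)
y = InceptionDilatedConv2d(16, 64)(x)
print(y.shape)   # torch.Size([2, 64, 32, 32])
```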
arXiv Detail & Related papers (2020-12-25T14:58:35Z)
- Attentional-Biased Stochastic Gradient Descent [74.49926199036481]
We present a provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning.
Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
ABSGD is flexible enough to combine with other robust losses without any additional cost.
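A hedged sketch of the core idea: momentum SGD in which each mini-batch sample receives an individual importance weight derived from its own loss. The softmax weighting with temperature `lam` is one natural choice (positive values up-weight hard or minority samples, negative values down-weight suspected label noise), not necessarily the paper's exact update.
```python
# Per-sample importance weighting inside momentum SGD. Toy data and lam.
import torch
import torch.nn.functional as F

model = torch.nn.Linear(20, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
lam = 5.0                                            # assumed temperature

x, y = torch.randn(64, 20), torch.randint(0, 2, (64,))
for _ in range(100):
    losses = F.cross_entropy(model(x), y, reduction="none")  # per-sample losses
    w = torch.softmax(losses.detach() / lam, dim=0)          # importance weights
    loss = (w * losses).sum()                                # weighted objective
    opt.zero_grad(); loss.backward(); opt.step()
```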
arXiv Detail & Related papers (2020-12-13T03:41:52Z)
- Learning to Match Distributions for Domain Adaptation [116.14838935146004]
This paper proposes Learning to Match (L2M) to automatically learn the cross-domain distribution matching.
L2M reduces the inductive bias by using a meta-network to learn the distribution matching loss in a data-driven way.
Experiments on public datasets substantiate the superiority of L2M over SOTA methods.
arXiv Detail & Related papers (2020-07-17T03:26:13Z)