HCFT: Hierarchical Convolutional Fusion Transformer for EEG Decoding
- URL: http://arxiv.org/abs/2601.12279v1
- Date: Sun, 18 Jan 2026 06:36:30 GMT
- Title: HCFT: Hierarchical Convolutional Fusion Transformer for EEG Decoding
- Authors: Haodong Zhang, Jiapeng Zhu, Yitong Chen, Hongqi Li
- Abstract summary: We propose a lightweight decoding framework named Hierarchical Convolutional Fusion Transformer (HCFT). HCFT combines dual-branch convolutional encoders and hierarchical Transformer blocks for multi-scale representation. Results show that HCFT achieves 80.83% average accuracy and a Cohen's kappa of 0.6165 on BCI IV-2b, as well as 99.10% sensitivity, 0.0236 false positives per hour, and 98.82% specificity on CHB-MIT.
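Cohen's kappa, reported above alongside raw accuracy, corrects observed agreement for the agreement expected by chance. A minimal sketch of the computation (the confusion matrix below is hypothetical and only illustrates the formula, not the paper's actual results):

```python
import numpy as np

def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    po = np.trace(confusion) / n                              # observed agreement
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n ** 2  # chance agreement
    return (po - pe) / (1.0 - pe)

# Hypothetical binary confusion matrix for a motor-imagery classifier
cm = [[45, 5],
      [10, 40]]
print(round(cohens_kappa(cm), 4))  # accuracy is 0.85, but kappa discounts chance
```

For this matrix the observed agreement is 0.85 and the chance agreement 0.50, giving kappa = 0.7; a kappa of 0.6165 as reported for BCI IV-2b indicates substantial agreement beyond chance.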
- Score: 9.572621097681646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Electroencephalography (EEG) decoding requires models that can effectively extract and integrate complex temporal, spectral, and spatial features from multichannel signals. To address this challenge, we propose a lightweight and generalizable decoding framework named Hierarchical Convolutional Fusion Transformer (HCFT), which combines dual-branch convolutional encoders and hierarchical Transformer blocks for multi-scale EEG representation learning. Specifically, the model first captures local temporal and spatiotemporal dynamics through time-domain and time-space convolutional branches, and then aligns these features via a cross-attention mechanism that enables interaction between branches at each stage. Subsequently, a hierarchical Transformer fusion structure is employed to encode global dependencies across all feature stages, while a customized Dynamic Tanh normalization module is introduced to replace traditional Layer Normalization in order to enhance training stability and reduce redundancy. Extensive experiments are conducted on two representative benchmark datasets, BCI Competition IV-2b and CHB-MIT, covering both event-related cross-subject classification and continuous seizure prediction tasks. Results show that HCFT achieves 80.83% average accuracy and a Cohen's kappa of 0.6165 on BCI IV-2b, as well as 99.10% sensitivity, 0.0236 false positives per hour, and 98.82% specificity on CHB-MIT, consistently outperforming over ten state-of-the-art baseline methods. Ablation studies confirm that each core component of the proposed framework contributes significantly to the overall decoding performance, demonstrating HCFT's effectiveness in capturing EEG dynamics and its potential for real-world BCI applications.
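The Dynamic Tanh (DyT) idea referenced in the abstract replaces the per-token statistics of Layer Normalization with a learnable elementwise squashing of the form y = gamma * tanh(alpha * x) + beta. The sketch below is an illustrative, framework-free version under that assumption; the class name, parameter defaults, and training behavior are assumptions for illustration, not the paper's customized module:

```python
import numpy as np

class DynamicTanh:
    """Illustrative Dynamic Tanh (DyT) layer: y = gamma * tanh(alpha * x) + beta.

    Unlike LayerNorm, no mean or variance is computed; activations are
    bounded by a tanh whose input scale alpha would be learned in training.
    """
    def __init__(self, dim, alpha=0.5):
        self.alpha = alpha            # learnable scalar in a real implementation
        self.gamma = np.ones(dim)     # per-channel scale
        self.beta = np.zeros(dim)     # per-channel shift

    def __call__(self, x):
        return self.gamma * np.tanh(self.alpha * x) + self.beta

layer = DynamicTanh(dim=3)
x = np.array([[-4.0, 0.0, 4.0]])
y = layer(x)
print(y)  # outputs bounded in (-1, 1); zero maps to zero at default parameters
```

Because the transform is purely elementwise, it avoids the reduction operations (and their synchronization cost) that LayerNorm requires, which is one plausible reading of the stated training-stability and redundancy benefits.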
Related papers
- Time2Vec Transformer for Robust Gesture Recognition from Low-Density sEMG [1.231764991565978]
This paper presents a novel, data-efficient deep learning framework for myoelectric prosthesis control. Our approach implements a hybrid Transformer optimized for sparse, two-channel surface electromyography (sEMG). The proposed framework offers a robust, cost-effective blueprint for next-generation prosthetic interfaces capable of rapid personalization.
arXiv Detail & Related papers (2026-02-02T09:28:27Z) - SKANet: A Cognitive Dual-Stream Framework with Adaptive Modality Fusion for Robust Compound GNSS Interference Classification [47.20483076887704]
Global Navigation Satellite Systems (GNSS) face growing threats from sophisticated jamming interference. We propose a cognitive deep learning framework built upon a dual-stream architecture that integrates Time-Frequency Images (TFIs) and Power Spectral Density (PSD). We show that SKANet achieves an overall accuracy of 96.99%, exhibiting superior robustness for compound jamming classification.
arXiv Detail & Related papers (2026-01-19T07:42:45Z) - GCMCG: A Clustering-Aware Graph Attention and Expert Fusion Network for Multi-Paradigm, Multi-task, and Cross-Subject EEG Decoding [0.7871262900865523]
Brain-Computer Interfaces (BCIs) based on Motor Imagery (MI) electroencephalogram (EEG) signals offer a direct pathway for human-machine interaction. This paper proposes a graph-guided clustering mixture-of-experts network (GCMCG), a novel unified framework for MI-ME EEG decoding.
arXiv Detail & Related papers (2025-11-29T18:05:33Z) - Bidirectional Time-Frequency Pyramid Network for Enhanced Robust EEG Classification [2.512406961007489]
BITE (Bidirectional Time-Frequency Pyramid Network) is an end-to-end unified architecture featuring robust multistream synergy, pyramid time-frequency attention (PTFA), and bidirectional adaptive convolutions. As a unified architecture, it combines robust performance across both MI and SSVEP tasks with exceptional computational efficiency. Our work validates that paradigm-aligned spectral-temporal processing is essential for reliable BCI systems.
arXiv Detail & Related papers (2025-10-11T04:14:48Z) - Bidirectional Feature-aligned Motion Transformation for Efficient Dynamic Point Cloud Compression [97.66080040613726]
We propose a Bidirectional Feature-aligned Motion Transformation (Bi-FMT) framework that implicitly models motion in the feature space. Bi-FMT aligns features across both past and future frames to produce temporally consistent latent representations. We show Bi-FMT surpasses D-DPCC and AdaDPCC in both compression efficiency and runtime.
arXiv Detail & Related papers (2025-09-18T03:51:06Z) - BHViT: Binarized Hybrid Vision Transformer [53.38894971164072]
Model binarization has made significant progress in enabling real-time and energy-efficient computation for convolutional neural networks (CNNs). We propose BHViT, a binarization-friendly hybrid ViT architecture, together with its fully binarized model, guided by three important observations. Our proposed algorithm achieves SOTA performance among binary ViT methods.
arXiv Detail & Related papers (2025-03-04T08:35:01Z) - Dual-TSST: A Dual-Branch Temporal-Spectral-Spatial Transformer Model for EEG Decoding [2.0721229324537833]
We propose a novel decoding architecture with a dual-branch temporal-spectral-spatial transformer (Dual-TSST).
Our proposed Dual-TSST performs superiorly across various tasks, achieving a promising average EEG classification accuracy of 80.67%.
This study provides a new approach to high-performance EEG decoding and has great potential for future CNN-Transformer based applications.
arXiv Detail & Related papers (2024-09-05T05:08:43Z) - Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion [56.38386580040991]
The Consistency Trajectory Model (CTM) is a generalization of Consistency Models (CMs).
CTM enables the efficient combination of adversarial training and denoising score matching loss to enhance performance.
Unlike CMs, CTM's access to the score function can streamline the adoption of established controllable/conditional generation methods.
arXiv Detail & Related papers (2023-10-01T05:07:17Z) - CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by Transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z) - Multiple Time Series Fusion Based on LSTM: An Application to CAP A Phase Classification Using EEG [56.155331323304]
Deep-learning-based feature-level fusion of electroencephalogram channels is carried out in this work.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z)