HSA-Net: Hierarchical and Structure-Aware Framework for Efficient and Scalable Molecular Language Modeling
- URL: http://arxiv.org/abs/2508.08334v1
- Date: Sun, 10 Aug 2025 15:22:42 GMT
- Title: HSA-Net: Hierarchical and Structure-Aware Framework for Efficient and Scalable Molecular Language Modeling
- Authors: Zihang Shao, Wentao Lei, Lei Wang, Wencai Ye, Li Liu,
- Abstract summary: We propose Hierarchical and Structure-Aware Network (HSA-Net), a novel framework with two modules that enables a hierarchical feature projection and fusion.<n>To adaptively merge multi-level features, we design a Source-Aware Fusion (SAF) module, which flexibly selects fusion experts based on the characteristics of the aggregation features.<n>Extensive experiments demonstrate that our HSA-Net framework quantitatively and qualitatively outperforms current state-of-the-art (SOTA) methods.
- Score: 7.26697833663902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Molecular representation learning, a cornerstone for downstream tasks like molecular captioning and molecular property prediction, heavily relies on Graph Neural Networks (GNN). However, GNN suffers from the over-smoothing problem, where node-level features collapse in deep GNN layers. While existing feature projection methods with cross-attention have been introduced to mitigate this issue, they still perform poorly in deep features. This motivated our exploration of using Mamba as an alternative projector for its ability to handle complex sequences. However, we observe that while Mamba excels at preserving global topological information from deep layers, it neglects fine-grained details in shallow layers. The capabilities of Mamba and cross-attention exhibit a global-local trade-off. To resolve this critical global-local trade-off, we propose Hierarchical and Structure-Aware Network (HSA-Net), a novel framework with two modules that enables a hierarchical feature projection and fusion. Firstly, a Hierarchical Adaptive Projector (HAP) module is introduced to process features from different graph layers. It learns to dynamically switch between a cross-attention projector for shallow layers and a structure-aware Graph-Mamba projector for deep layers, producing high-quality, multi-level features. Secondly, to adaptively merge these multi-level features, we design a Source-Aware Fusion (SAF) module, which flexibly selects fusion experts based on the characteristics of the aggregation features, ensuring a precise and effective final representation fusion. Extensive experiments demonstrate that our HSA-Net framework quantitatively and qualitatively outperforms current state-of-the-art (SOTA) methods.
Related papers
- HyM-UNet: Synergizing Local Texture and Global Context via Hybrid CNN-Mamba Architecture for Medical Image Segmentation [3.976000861085382]
HyM-UNet is designed to synergize the local feature extraction capabilities of CNNs with the efficient global modeling capabilities of Mamba.<n>To bridge the semantic gap between the encoder and the decoder, we propose a Mamba-Guided Fusion Skip Connection.<n>The results demonstrate that HyM-UNet significantly outperforms existing state-of-the-art methods in terms of Dice coefficient and IoU.
arXiv Detail & Related papers (2025-11-22T09:02:06Z) - Semantics and Content Matter: Towards Multi-Prior Hierarchical Mamba for Image Deraining [95.00432497331583]
Multi-Prior Hierarchical Mamba (MPHM) network for image deraining.<n>MPHM integrates macro-semantic textual priors (CLIP) for task-level semantic guidance and micro-structural visual priors (DINOv2) for scene-aware structural information.<n>Experiments demonstrate MPHM's state-of-the-art performance, achieving a 0.57 dB PSNR gain on the Rain200H dataset.
arXiv Detail & Related papers (2025-11-17T08:08:59Z) - GCRPNet: Graph-Enhanced Contextual and Regional Perception Network for Salient Object Detection in Optical Remote Sensing Images [68.33481681452675]
We propose a graph-enhanced contextual and regional perception network (GCRPNet)<n>It builds upon the Mamba architecture to simultaneously capture long-range dependencies and enhance regional feature representation.<n>It performs adaptive patch scanning on feature maps processed via multi-scale convolutions, thereby capturing rich local region information.
arXiv Detail & Related papers (2025-08-14T11:31:43Z) - MHAF-YOLO: Multi-Branch Heterogeneous Auxiliary Fusion YOLO for accurate object detection [3.87627600245713]
We propose MHAF-YOLO, a novel detection framework featuring a versatile neck design called the Multi-Branch Auxiliary FPN (MAFPN)<n>The SAF bridges the backbone and the neck by fusing shallow features, effectively transferring crucial low-level spatial information with high fidelity.<n>The AAF integrates multi-scale feature information at deeper neck layers, delivering richer gradient information to the output layer and further enhancing the model learning capacity.<n>RepHMS is globally integrated into the network, utilizing GHFKS to select larger convolutional kernels for various feature layers, expanding the vertical receptive field and capturing contextual information across spatial hierarchies.
arXiv Detail & Related papers (2025-02-07T04:49:42Z) - GraphMamba: An Efficient Graph Structure Learning Vision Mamba for Hyperspectral Image Classification [19.740333867168108]
GraphMamba is an efficient graph structure learning vision Mamba classification framework to achieve deep spatial-spectral information mining.
The core components of GraphMamba include the HyperMamba module for improving computational efficiency and the SpectralGCN module for adaptive spatial context awareness.
arXiv Detail & Related papers (2024-07-11T07:56:08Z) - Hi-GMAE: Hierarchical Graph Masked Autoencoders [90.30572554544385]
Hierarchical Graph Masked AutoEncoders (Hi-GMAE)
Hi-GMAE is a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs.
Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
arXiv Detail & Related papers (2024-05-17T09:08:37Z) - SENetV2: Aggregated dense layer for channelwise and global
representations [0.0]
We introduce a novel aggregated multilayer perceptron, a multi-branch dense layer, within the Squeeze residual module.
This fusion enhances the network's ability to capture channel-wise patterns and have global knowledge.
We conduct extensive experiments on benchmark datasets to validate the model and compare them with established architectures.
arXiv Detail & Related papers (2023-11-17T14:10:57Z) - TOPIQ: A Top-down Approach from Semantics to Distortions for Image
Quality Assessment [53.72721476803585]
Image Quality Assessment (IQA) is a fundamental task in computer vision that has witnessed remarkable progress with deep neural networks.
We propose a top-down approach that uses high-level semantics to guide the IQA network to focus on semantically important local distortion regions.
A key component of our approach is the proposed cross-scale attention mechanism, which calculates attention maps for lower level features.
arXiv Detail & Related papers (2023-08-06T09:08:37Z) - Attention-driven Graph Clustering Network [49.040136530379094]
We propose a novel deep clustering method named Attention-driven Graph Clustering Network (AGCN)
AGCN exploits a heterogeneous-wise fusion module to dynamically fuse the node attribute feature and the topological graph feature.
AGCN can jointly perform feature learning and cluster assignment in an unsupervised fashion.
arXiv Detail & Related papers (2021-08-12T02:30:38Z) - Interflow: Aggregating Multi-layer Feature Mappings with Attention
Mechanism [0.7614628596146599]
This paper proposes the Interflow algorithm specially for traditional CNN models.
Interflow divides CNNs into several stages according to the depth and makes predictions by the feature mappings in each stage.
It can alleviate gradient vanishing problem, lower the difficulty of network depth selection, and lighten possible over-fitting problem.
arXiv Detail & Related papers (2021-06-26T18:22:01Z) - Multi-Level Graph Convolutional Network with Automatic Graph Learning
for Hyperspectral Image Classification [63.56018768401328]
We propose a Multi-level Graph Convolutional Network (GCN) with Automatic Graph Learning method (MGCN-AGL) for HSI classification.
By employing attention mechanism to characterize the importance among spatially neighboring regions, the most relevant information can be adaptively incorporated to make decisions.
Our MGCN-AGL encodes the long range dependencies among image regions based on the expressive representations that have been produced at local level.
arXiv Detail & Related papers (2020-09-19T09:26:20Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z) - Cross-layer Feature Pyramid Network for Salient Object Detection [102.20031050972429]
We propose a novel Cross-layer Feature Pyramid Network to improve the progressive fusion in salient object detection.
The distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information.
arXiv Detail & Related papers (2020-02-25T14:06:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.