SPECTRA: Spectral Target-Aware Graph Augmentation for Imbalanced Molecular Property Regression
- URL: http://arxiv.org/abs/2511.04838v1
- Date: Thu, 06 Nov 2025 21:57:21 GMT
- Title: SPECTRA: Spectral Target-Aware Graph Augmentation for Imbalanced Molecular Property Regression
- Authors: Brenda Nogueira, Meng Jiang, Nitesh V. Chawla, Nuno Moniz
- Abstract summary: SPECTRA is a Spectral Target-Aware graph augmentation framework. It generates realistic molecular graphs in the spectral domain. It consistently improves error in relevant target ranges while maintaining competitive overall MAE.
- Score: 45.62053904749856
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In molecular property prediction, the most valuable compounds (e.g., high potency) often occupy sparse regions of the target space. Standard Graph Neural Networks (GNNs) commonly optimize for the average error and underperform on these uncommon but critical cases, while existing oversampling methods often distort molecular topology. In this paper, we introduce SPECTRA, a Spectral Target-Aware graph augmentation framework that generates realistic molecular graphs in the spectral domain. SPECTRA (i) reconstructs multi-attribute molecular graphs from SMILES; (ii) aligns molecule pairs via (Fused) Gromov-Wasserstein couplings to obtain node correspondences; (iii) interpolates Laplacian eigenvalues, eigenvectors and node features in a stable shared basis; and (iv) reconstructs edges to synthesize physically plausible intermediates with interpolated targets. A rarity-aware budgeting scheme, derived from a kernel density estimation of labels, concentrates augmentation where data are scarce. Coupled with a spectral GNN using edge-aware Chebyshev convolutions, SPECTRA densifies underrepresented regions without degrading global accuracy. On benchmarks, SPECTRA consistently improves error in relevant target ranges while maintaining competitive overall MAE, and yields interpretable synthetic molecules whose structure reflects the underlying spectral geometry. Our results demonstrate that spectral, geometry-aware augmentation is an effective and efficient strategy for imbalanced molecular property regression.
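The rarity-aware budgeting idea in the abstract can be illustrated with a minimal sketch. The abstract only says the budget is derived from a kernel density estimate of the labels, so everything specific here is an assumption made for illustration: the Gaussian kernel, the fixed bandwidth, the inverse-density rarity score, the largest-remainder rounding, and the hypothetical names gaussian_kde_density and rarity_budget. It is not the authors' implementation.

```python
import math


def gaussian_kde_density(labels, bandwidth):
    """Return a function that estimates the label density with a Gaussian kernel."""
    n = len(labels)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(
            math.exp(-0.5 * ((y - t) / bandwidth) ** 2) for t in labels
        )

    return density


def rarity_budget(labels, total_budget, bandwidth=0.5):
    """Split total_budget synthetic samples across the labelled molecules.

    Assumption: rarity is the inverse of the estimated label density, and the
    budget is proportional to rarity. The paper may use a different mapping.
    """
    density = gaussian_kde_density(labels, bandwidth)
    # Each label contributes its own kernel, so the density is always positive.
    rarity = [1.0 / density(y) for y in labels]
    total_rarity = sum(rarity)
    shares = [total_budget * r / total_rarity for r in rarity]

    # Largest-remainder rounding so the integer budgets sum to total_budget.
    budget = [int(math.floor(s)) for s in shares]
    leftover = total_budget - sum(budget)
    by_remainder = sorted(
        range(len(labels)), key=lambda i: shares[i] - budget[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        budget[i] += 1
    return budget


# Toy labels: five molecules near 5.0 and one rare, high-valued molecule at 9.0.
labels = [5.0, 5.1, 4.9, 5.2, 5.0, 9.0]
budget = rarity_budget(labels, total_budget=20)
print(budget)
```

In this toy setting the isolated label at 9.0 receives the largest share of the budget, which is the qualitative behaviour the abstract describes: augmentation concentrates where labels are scarce.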
Related papers
- Spectral Analysis of Molecular Kernels: When Richer Features Do Not Guarantee Better Generalization [3.2880869992413246]
We provide the first comprehensive spectral analysis of kernel ridge regression on the QM9 dataset.
Surprisingly, richer spectral features, measured by four different spectral metrics, do not consistently improve accuracy.
For transformer-based and local 3D representations, spectral richness can even have a negative correlation with performance.
arXiv Detail & Related papers (2025-10-16T01:52:26Z)
- Aligned Manifold Property and Topology Point Clouds for Learning Molecular Properties [55.2480439325792]
This work introduces AMPTCR, a molecular surface representation that combines local quantum-derived scalar fields and custom topological descriptors within an aligned point cloud format.
For molecular weight, results confirm that AMPTCR encodes physically meaningful data, with a validation R² of 0.87.
In the bacterial inhibition task, AMPTCR enables both classification and direct regression of E. coli inhibition values.
arXiv Detail & Related papers (2025-07-22T04:35:50Z)
- Spectral Manifold Harmonization for Graph Imbalanced Regression [30.376583325991454]
We present Spectral Manifold Harmonization (SMH), a novel approach to address imbalanced regression challenges on graph-structured data.
SMH generates synthetic graph samples that preserve topological properties while focusing on the most relevant target distribution regions.
Experimental results demonstrate the potential of SMH on chemistry and drug discovery benchmark datasets.
arXiv Detail & Related papers (2025-07-01T18:48:43Z)
- Spectro-Riemannian Graph Neural Networks [39.901731107377095]
Cusp Laplacian is an extension of the traditional graph Laplacian based on Ollivier-Ricci curvature.
Cusp Pooling is a hierarchical attention mechanism combined with a curvature-based positional encoding.
arXiv Detail & Related papers (2025-02-01T11:31:01Z)
- HoloNets: Spectral Convolutions do extend to Directed Graphs [59.851175771106625]
Conventional wisdom dictates that spectral convolutional networks may only be deployed on undirected graphs.
Here we show this traditional reliance on the graph Fourier transform to be superfluous.
We provide a frequency-response interpretation of newly developed filters, investigate the influence of the basis used to express filters and discuss the interplay with characteristic operators on which networks are based.
arXiv Detail & Related papers (2023-10-03T17:42:09Z)
- Handling Missing Data via Max-Entropy Regularized Graph Autoencoder [37.8103274049137]
MEGAE is a regularized graph autoencoder for graph attribute imputation.
It aims to mitigate the spectral concentration problem by maximizing the graph spectral entropy.
It outperforms all the other state-of-the-art imputation methods on a variety of benchmark datasets.
arXiv Detail & Related papers (2022-11-30T06:22:40Z)
- Spectral-Spatial Global Graph Reasoning for Hyperspectral Image Classification [50.899576891296235]
Convolutional neural networks have been widely applied to hyperspectral image classification.
Recent methods attempt to address this issue by performing graph convolutions on spatial topologies.
arXiv Detail & Related papers (2021-06-26T06:24:51Z)
- Distance-aware Molecule Graph Attention Network for Drug-Target Binding Affinity Prediction [54.93890176891602]
We propose a diStance-aware Molecule graph Attention Network (S-MAN) tailored to drug-target binding affinity prediction.
As a dedicated solution, we first propose a position encoding mechanism to integrate the topological structure and spatial position information into the constructed pocket-ligand graph.
We also propose a novel edge-node hierarchical attentive aggregation structure which has edge-level aggregation and node-level aggregation.
arXiv Detail & Related papers (2020-12-17T17:44:01Z)
- Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings [71.20283285671461]
We propose a geometric scattering autoencoder (GSAE) network for learning such graph embeddings.
Our embedding network first extracts rich graph features using the recently proposed geometric scattering transform.
We show that GSAE organizes RNA graphs both by structure and energy, accurately reflecting bistable RNA structures.
arXiv Detail & Related papers (2020-06-12T00:17:59Z)
- Spectral Pyramid Graph Attention Network for Hyperspectral Image Classification [5.572542792318872]
Convolutional neural networks (CNN) have made significant advances in hyperspectral image (HSI) classification.
The standard convolutional kernel neglects intrinsic connections between data points, resulting in poor region delineation and small spurious predictions.
This paper presents a novel architecture which explicitly addresses these two issues.
arXiv Detail & Related papers (2020-01-20T13:49:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.