Data-driven Synthesis of Magnetic Resonance Spectroscopy Data using a Variational Autoencoder
- URL: http://arxiv.org/abs/2603.00736v1
- Date: Sat, 28 Feb 2026 16:52:16 GMT
- Title: Data-driven Synthesis of Magnetic Resonance Spectroscopy Data using a Variational Autoencoder
- Authors: Dennis M. J. van de Sande, Julian P. Merkofer, Sina Amirrajab, Mitko Veta, Gerhard S. Drenthen, Jacobus F. A. Jansen, Marcel Breeuwer
- Abstract summary: We propose a data-driven framework for synthesizing in-vivo MRS data using a variational autoencoder (VAE) trained exclusively on measured single-voxel spectroscopy data. The VAE learns a low-dimensional latent representation of complex-valued spectra and enables generation of new samples through latent-space sampling and synthesis. The results demonstrate that the VAE accurately reconstructs dominant spectral patterns and generates synthetic spectra that occupy the same feature space as in-vivo data.
- Score: 1.7789378551794652
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The development of deep learning methods for magnetic resonance spectroscopy (MRS) is often hindered by limited availability of large, high-quality training datasets. While physics-based simulations are commonly used to mitigate this limitation, accurately modeling all in-vivo signal components remains challenging. In this work, we propose a data-driven framework for synthesizing in-vivo MRS data using a variational autoencoder (VAE) trained exclusively on measured single-voxel spectroscopy data. The model learns a low-dimensional latent representation of complex-valued spectra and enables generation of new samples through latent-space sampling and interpolation. The generative performance of the proposed approach is evaluated using a comprehensive set of complementary analyses, including reconstruction quality, feature-level similarity using low-dimensional embeddings, application-based signal quality metrics, and metabolite quantification agreement. The results demonstrate that the VAE accurately reconstructs dominant spectral patterns and generates synthetic spectra that occupy the same feature space as in-vivo data. In an example application targeting GABA-edited spectroscopy, augmenting limited subsets of transients with synthetic spectra improves signal quality metrics such as signal-to-noise ratio, linewidth, and shape scores. However, the results also reveal limitations of the generative approach, including under-representation of stochastic noise and reduced accuracy in absolute metabolite quantification, particularly for applications sensitive to concentration estimates. These findings highlight both potential and limitations of data-driven MRS synthesis. Beyond the proposed model, this study introduces a structured evaluation framework for generative MRS methods, emphasizing the importance of application-aware validation when synthetic data are used for downstream analysis.
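The latent-space sampling and interpolation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: `toy_decoder`, its random `weights`, the latent dimension, and the number of spectral points are all hypothetical placeholders standing in for a trained VAE decoder, and splitting complex spectra into real and imaginary channels is one common convention assumed here, not something the abstract specifies.

```python
import numpy as np

def complex_to_channels(spectrum):
    """Stack real and imaginary parts as two channels (assumed convention)."""
    return np.stack([spectrum.real, spectrum.imag], axis=-2)

def channels_to_complex(channels):
    """Inverse of complex_to_channels: rebuild the complex spectrum."""
    return channels[..., 0, :] + 1j * channels[..., 1, :]

def sample_latents(n_samples, latent_dim, rng):
    """Draw latent codes from a standard normal prior."""
    return rng.standard_normal((n_samples, latent_dim))

def interpolate_latents(z_a, z_b, n_steps):
    """Linear interpolation between two latent codes, endpoints included."""
    t = np.linspace(0.0, 1.0, n_steps)[:, None]
    return (1.0 - t) * z_a[None, :] + t * z_b[None, :]

def toy_decoder(z, weights):
    """Hypothetical stand-in for a trained decoder: a single linear map
    from latent codes to (real, imaginary) channel pairs."""
    out = z @ weights                      # shape (n, 2 * n_points)
    return out.reshape(z.shape[0], 2, -1)  # shape (n, 2, n_points)

rng = np.random.default_rng(0)
latent_dim, n_points = 8, 64               # illustrative sizes only
weights = rng.standard_normal((latent_dim, 2 * n_points))

z = sample_latents(4, latent_dim, rng)                 # new samples from the prior
path = interpolate_latents(z[0], z[1], n_steps=5)      # walk between two codes
spectra = channels_to_complex(toy_decoder(path, weights))
```

With a real model, `toy_decoder` would be replaced by the trained decoder network; the sampling and interpolation steps would stay the same in spirit.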
Related papers
- Conditional Generative Framework with Peak-Aware Attention for Robust Chemical Detection under Interferences [3.976291254896486]
In this paper, we propose an artificial intelligence discrimination framework based on a peak-aware conditional generative model. The framework is learned with a novel peak-aware mechanism that highlights the characteristic peaks of GC-MS data. In addition, chemical and solvent information is encoded in a latent vector embedded with it, allowing a conditional generative adversarial neural network to generate a synthetic GC-MS signal.
arXiv Detail & Related papers (2026-01-29T04:10:37Z)
- SIGMA: Scalable Spectral Insights for LLM Collapse [51.863164847253366]
We introduce SIGMA (Spectral Inequalities for Gram Matrix Analysis), a unified framework for model collapse. By deriving deterministic bounds on the matrix's spectrum, SIGMA provides a mathematically grounded metric to track the contraction of the representation space. We demonstrate that SIGMA effectively captures the transition towards states, offering theoretical insights into the mechanics of collapse.
arXiv Detail & Related papers (2026-01-06T19:47:11Z)
- Towards Syn-to-Real IQA: A Novel Perspective on Reshaping Synthetic Data Distributions [74.00222571094437]
Blind Image Quality Assessment (BIQA) has advanced significantly through deep learning, but the scarcity of large-scale labeled datasets remains a challenge. We make a key observation that representations learned from synthetic datasets often exhibit a discrete and clustered pattern that hinders regression performance. We introduce a novel framework SynDR-IQA, which reshapes synthetic data distribution to enhance BIQA generalization.
arXiv Detail & Related papers (2026-01-01T06:11:16Z)
- AI-driven Generation of MALDI-TOF MS for Microbial Characterization [1.3155923068686746]
This study investigates the use of deep generative models to synthesize realistic MALDI-TOF MS spectra. We adapt and evaluate three generative models: Variational Autoencoders (MALDIVAEs), Generative Adversarial Networks (MALDIGANs), and Denoising Probabilistic Model (MALDIffusion). Experiments show that synthetic data generated by MALDIVAE, MALDIGAN, and MALDIffusion are statistically and diagnostically comparable to real measurements.
arXiv Detail & Related papers (2025-11-18T10:01:21Z)
- Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference [89.5628648718851]
Causal inference is essential for developing and evaluating medical interventions. Real-world medical datasets are often difficult to access due to regulatory barriers. We present STEAM: a novel method for generating Synthetic data for Treatment Effect Analysis in Medicine.
arXiv Detail & Related papers (2025-10-21T16:16:00Z)
- Adapting HFMCA to Graph Data: Self-Supervised Learning for Generalizable fMRI Representations [57.054499278843856]
Functional magnetic resonance imaging (fMRI) analysis faces significant challenges due to limited dataset sizes and domain variability between studies. Traditional self-supervised learning methods inspired by computer vision often rely on positive and negative sample pairs. We propose adapting a recently developed Hierarchical Functional Maximal Correlation Algorithm (HFMCA) to graph-structured fMRI data.
arXiv Detail & Related papers (2025-10-05T12:35:01Z)
- OASIS: A Deep Learning Framework for Universal Spectroscopic Analysis Driven by Novel Loss Functions [4.0097349146966925]
We introduce a machine learning (ML) framework for technique-independent, automated spectral analysis. OASIS achieves its versatility through models trained on a strategically designed synthetic dataset. This study underscores the optimization of the loss function as a key resource-efficient strategy to develop high-performance ML models.
arXiv Detail & Related papers (2025-09-15T01:28:51Z)
- Q-MRS: A Deep Learning Framework for Quantitative Magnetic Resonance Spectra Analysis [13.779430559468926]
This study introduces a deep learning (DL) framework that employs transfer learning, in which the model is pre-trained on simulated datasets before it undergoes fine-tuning on in vivo data.
The proposed framework showed promising performance when applied to the Philips dataset from the BIG GABA repository.
arXiv Detail & Related papers (2024-08-28T18:05:53Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation [69.1351513309953]
We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation.
We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods.
arXiv Detail & Related papers (2022-12-10T20:15:23Z)
- Spatio-temporally separable non-linear latent factor learning: an application to somatomotor cortex fMRI data [0.0]
Models of fMRI data that can perform whole-brain discovery of latent factors are understudied.
New methods for efficient spatial weight-sharing are critical to deal with the high dimensionality of the data and the presence of noise.
Our approach is evaluated on data with multiple motor sub-tasks to assess whether the model captures disentangled latent factors that correspond to each sub-task.
arXiv Detail & Related papers (2022-05-26T21:30:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.