Related papers: A universal synthetic dataset for machine learning on spectroscopic data

Related papers

Data-driven Synthesis of Magnetic Resonance Spectroscopy Data using a Variational Autoencoder [1.7789378551794652]
We propose a data-driven framework for synthesizing in-vivo MRS data using a variational autoencoder (VAE) trained exclusively on measured single-voxel spectroscopy data. The VAE learns a low-dimensional latent representation of complex-valued spectra and enables generation of new samples through latent-space sampling and synthesis. The results demonstrate that the VAE accurately reconstructs dominant spectral patterns and generates synthetic spectra that occupy the same feature space as in-vivo data.
arXiv Detail & Related papers (2026-02-28T16:52:16Z)
OASIS: A Deep Learning Framework for Universal Spectroscopic Analysis Driven by Novel Loss Functions [4.0097349146966925]
We introduce a machine learning (ML) framework for technique-independent, automated spectral analysis. OASIS achieves its versatility through models trained on a strategically designed synthetic dataset. This study underscores the optimization of the loss function as a key resource-efficient strategy to develop high-performance ML models.
arXiv Detail & Related papers (2025-09-15T01:28:51Z)
What Makes Good Synthetic Training Data for Zero-Shot Stereo Matching? [57.49867420132091]
We report the effects on zero-shot stereo matching performance using standard benchmarks. We validate our findings by collecting the best settings and creating a large-scale dataset. We open-source our system to enable further research on procedural stereo datasets.
arXiv Detail & Related papers (2025-04-23T17:59:33Z)
Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond [38.32974480709081]
The rapid advent of machine learning (ML) and artificial intelligence (AI) has catalyzed major transformations in chemistry. The application of these methods to spectroscopic and spectrometric data, referred to as Spectroscopy Machine Learning (SpectraML), remains relatively underexplored. We provide a unified review of SpectraML, systematically examining state-of-the-art approaches for both forward tasks and inverse tasks.
arXiv Detail & Related papers (2025-02-14T04:07:25Z)
Stellar parameter prediction and spectral simulation using machine learning [0.0]
We applied machine learning to the entire data history of ESO's High Accuracy Radial Velocity Planet Searcher (HARPS) instrument. We trained standard and variational autoencoders on HARPS data to predict spectral parameters and generate spectra. Our models excel at predicting spectral parameters and compressing real spectra, and they achieved a mean prediction error of approximately 50 K for effective temperatures.
arXiv Detail & Related papers (2024-12-12T07:09:42Z)
Enhancing radioisotope identification in gamma spectra with transfer learning [0.0]
We pretrain a model using physically derived synthetic data and leverage transfer learning techniques to fine-tune the model for a specific target domain. Results of this analysis indicate that fine-tuned models significantly outperform those trained exclusively on synthetic data or solely on target-domain data. This research serves as proof of concept for applying transfer learning techniques to application scenarios where access to experimental data is limited.
arXiv Detail & Related papers (2024-12-10T00:21:00Z)
Generating Diverse Synthetic Datasets for Evaluation of Real-life Recommender Systems [0.0]
Synthetic datasets are important for evaluating and testing machine learning models. We develop a novel framework for generating synthetic datasets that are diverse and statistically coherent. The framework is available as a free open Python package to facilitate research with minimal friction.
arXiv Detail & Related papers (2024-11-27T09:53:14Z)
Advancing fNIRS Neuroimaging through Synthetic Data Generation and Machine Learning Applications [0.0]
This study presents an integrated approach for advancing functional Near-Infrared Spectroscopy (fNIRS) neuroimaging. By addressing the scarcity of high-quality neuroimaging datasets, this work harnesses Monte Carlo simulations and parametric head models to generate a comprehensive synthetic dataset. A cloud-based infrastructure is established for scalable data generation and processing, enhancing the accessibility and quality of neuroimaging data.
arXiv Detail & Related papers (2024-05-18T09:50:19Z)
Learning from Synthetic Data for Visual Grounding [55.21937116752679]
We show that SynGround can improve the localization capabilities of off-the-shelf vision-and-language models. Data generated with SynGround improves the pointing game accuracy of pretrained ALBEF and BLIP models by 4.81% and 17.11% absolute percentage points, respectively.
arXiv Detail & Related papers (2024-03-20T17:59:43Z)
Synthetic Information towards Maximum Posterior Ratio for deep learning on Imbalanced Data [1.7495515703051119]
We propose a technique for data balancing by generating synthetic data for the minority class. Our method prioritizes balancing the informative regions by identifying high entropy samples. Our experimental results on forty-one datasets demonstrate the superior performance of our technique.
arXiv Detail & Related papers (2024-01-05T01:08:26Z)
TarGEN: Targeted Data Generation with Large Language Models [51.87504111286201]
TarGEN is a multi-step prompting strategy for generating high-quality synthetic datasets. We augment TarGEN with a method known as self-correction empowering LLMs to rectify inaccurately labeled instances. A comprehensive analysis of the synthetic dataset compared to the original dataset reveals similar or higher levels of dataset complexity and diversity.
arXiv Detail & Related papers (2023-10-27T03:32:17Z)
Optimizations of Autoencoders for Analysis and Classification of Microscopic In Situ Hybridization Images [68.8204255655161]
We propose a deep-learning framework to detect and classify areas of microscopic images with similar levels of gene expression. The data we analyze requires an unsupervised learning model for which we employ a type of Artificial Neural Network - Deep Learning Autoencoders.
arXiv Detail & Related papers (2023-04-19T13:45:28Z)
Exploring Supervised Machine Learning for Multi-Phase Identification and Quantification from Powder X-Ray Diffraction Spectra [1.0660480034605242]
Powder X-ray diffraction analysis is a critical component of materials characterization methodologies. Deep learning has become a prime focus for predicting crystallographic parameters and features from X-ray spectra. Here, we are interested in conventional supervised learning algorithms in lieu of deep learning for multi-label crystalline phase identification.
arXiv Detail & Related papers (2022-11-16T00:36:13Z)
Trustworthiness of Laser-Induced Breakdown Spectroscopy Predictions via Simulation-based Synthetic Data Augmentation and Multitask Learning [4.633997895806144]
We consider quantitative analyses of spectral data using laser-induced breakdown spectroscopy. We address the small size of training data available, and the validation of the predictions during inference on unknown data.
arXiv Detail & Related papers (2022-10-07T18:00:09Z)
BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot Detection [63.447493500066045]
This work proposes a data driven learning model for the synthesis of keystroke biometric data. The proposed method is compared with two statistical approaches based on Universal and User-dependent models. Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z)
Low-complexity deep learning frameworks for acoustic scene classification [64.22762153453175]
We present low-complexity deep learning frameworks for acoustic scene classification (ASC). The proposed frameworks can be separated into four main steps: front-end spectrogram extraction, online data augmentation, back-end classification, and late fusion of predicted probabilities. Our experiments conducted on the DCASE 2022 Task 1 Development dataset have fulfilled the low-complexity requirement and achieved the best classification accuracy of 60.1%.
arXiv Detail & Related papers (2022-06-13T11:41:39Z)
Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra [68.8204255655161]
We focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes.
arXiv Detail & Related papers (2022-01-07T22:26:33Z)
A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors and partial coherence frequently menaces the viability of the experiment. A modern Deep Learning framework is used to autonomously correct the setup incoherences, thus improving the quality of a ptychography reconstruction. We tested our system on both synthetic datasets and real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)
A probabilistic deep learning approach to automate the interpretation of multi-phase diffraction spectra [4.240899165468488]
We develop an ensemble convolutional neural network trained on simulated diffraction spectra to identify complex multi-phase mixtures. Our model is benchmarked on simulated and experimentally measured diffraction spectra, showing exceptional performance with accuracies exceeding those given by previously reported methods.
arXiv Detail & Related papers (2021-03-30T20:13:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.