BioDiffusion: A Versatile Diffusion Model for Biomedical Signal
Synthesis
- URL: http://arxiv.org/abs/2401.10282v2
- Date: Sat, 27 Jan 2024 17:44:55 GMT
- Title: BioDiffusion: A Versatile Diffusion Model for Biomedical Signal
Synthesis
- Authors: Xiaomin Li, Mykhailo Sakevych, Gentry Atkinson, Vangelis Metsis
- Abstract summary: BioDiffusion is a diffusion-based probabilistic model optimized for the synthesis of biomedical signals.
Our research encompasses both qualitative and quantitative assessments of the synthesized data quality.
- Score: 4.765541373485142
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Machine learning tasks involving biomedical signals frequently grapple with
issues such as limited data availability, imbalanced datasets, labeling
complexities, and the interference of measurement noise. These challenges often
hinder the optimal training of machine learning algorithms. Addressing these
concerns, we introduce BioDiffusion, a diffusion-based probabilistic model
optimized for the synthesis of multivariate biomedical signals. BioDiffusion
demonstrates excellence in producing high-fidelity, non-stationary,
multivariate signals for a range of tasks including unconditional,
label-conditional, and signal-conditional generation. Leveraging these
synthesized signals offers a notable solution to the aforementioned challenges.
Our research encompasses both qualitative and quantitative assessments of the
synthesized data quality, underscoring its capacity to bolster accuracy in
machine learning tasks tied to biomedical signals. Furthermore, when juxtaposed
with current leading time-series generative models, empirical evidence suggests
that BioDiffusion outperforms them in biomedical signal generation quality.
Related papers
- Simplicity within biological complexity [0.0]
We survey the literature and argue for the development of a comprehensive framework for embedding of multi-scale molecular network data.
Network embedding methods map nodes to points in low-dimensional space, so that proximity in the learned space reflects the network's topology-function relationships.
We propose to develop a general, comprehensive embedding framework for multi-omic network data, from models to efficient and scalable software implementation.
arXiv Detail & Related papers (2024-05-15T13:32:45Z) - Progress and Opportunities of Foundation Models in Bioinformatics [77.74411726471439]
Foundations models (FMs) have ushered in a new era in computational biology, especially in the realm of deep learning.
Central to our focus is the application of FMs to specific biological problems, aiming to guide the research community in choosing appropriate FMs for their research needs.
Review analyses challenges and limitations faced by FMs in biology, such as data noise, model explainability, and potential biases.
arXiv Detail & Related papers (2024-02-06T02:29:17Z) - ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab [67.24684071577211]
The challenge of replicating research results has posed a significant impediment to the field of molecular biology.
We first curate a comprehensive multimodal dataset, named ProBio, as an initial step towards this objective.
Next, we devise two challenging benchmarks, transparent solution tracking and multimodal action recognition, to emphasize the unique characteristics and difficulties associated with activity understanding in BioLab settings.
arXiv Detail & Related papers (2023-11-01T14:44:01Z) - Generating synthetic multi-dimensional molecular-mediator time series
data for artificial intelligence-based disease trajectory forecasting and
drug development digital twins: Considerations [0.0]
The use of synthetic data is recognized as a crucial step in the development of neural network-based Artificial Intelligence (AI) systems.
Insufficiency of statistical and data-centric machine learning means of generating this type of synthetic data is due to a combination of factors.
The generation of synthetic data that accounts for the identified factors of multi-dimensional time series data is an essential capability for the development of mediator-biomarker based AI forecasting systems.
arXiv Detail & Related papers (2023-03-16T03:13:53Z) - Multi-fidelity Gaussian Process for Biomanufacturing Process Modeling
with Small Data [1.4687789417816917]
We propose to use a statistical machine learning approach, multi-fidelity Gaussian process, for process modelling in biomanufacturing.
We apply the multi-fidelity Gaussian process to solve two significant problems in biomanufacturing, bioreactor scale-up and knowledge transfer across cell lines, and demonstrate its efficacy on real-world datasets.
arXiv Detail & Related papers (2022-11-26T06:38:34Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in few seconds on commodity hardware, integrate with deep neural networks and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - 2021 BEETL Competition: Advancing Transfer Learning for Subject
Independence & Heterogenous EEG Data Sets [89.84774119537087]
We design two transfer learning challenges around diagnostics and Brain-Computer-Interfacing (BCI)
Task 1 is centred on medical diagnostics, addressing automatic sleep stage annotation across subjects.
Task 2 is centred on Brain-Computer Interfacing (BCI), addressing motor imagery decoding across both subjects and data sets.
arXiv Detail & Related papers (2022-02-14T12:12:20Z) - Benchmarking Uncertainty Qualification on Biosignal Classification Tasks
under Dataset Shift [16.15816241847314]
We propose a framework to evaluate the capability of the estimated uncertainty in capturing different types of biosignal dataset shifts.
In particular, we use three classification tasks based on respiratory sounds and electrocardiography signals to benchmark five representative uncertainty qualification methods.
arXiv Detail & Related papers (2021-12-16T20:42:17Z) - Discriminative Singular Spectrum Classifier with Applications on
Bioacoustic Signal Recognition [67.4171845020675]
We present a bioacoustic signal classifier equipped with a discriminative mechanism to extract useful features for analysis and classification efficiently.
Unlike current bioacoustic recognition methods, which are task-oriented, the proposed model relies on transforming the input signals into vector subspaces.
The validity of the proposed method is verified using three challenging bioacoustic datasets containing anuran, bee, and mosquito species.
arXiv Detail & Related papers (2021-03-18T11:01:21Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in
Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.