PROTECT: Protein circadian time prediction using unsupervised learning
- URL: http://arxiv.org/abs/2501.07405v1
- Date: Mon, 13 Jan 2025 15:21:20 GMT
- Title: PROTECT: Protein circadian time prediction using unsupervised learning
- Authors: Aram Ansary Ogholbake, Qiang Cheng,
- Abstract summary: We develop a novel method to predict circadian sample phases from proteomic data without requiring time labels or prior knowledge of proteins or genes.
We tested our method on both time-labeled and unlabeled proteomic data.
- Score: 13.110156202816112
- License:
- Abstract: Circadian rhythms regulate the physiology and behavior of humans and animals. Despite advancements in understanding these rhythms and predicting circadian phases at the transcriptional level, predicting circadian phases from proteomic data remains elusive. This challenge is largely due to the scarcity of time labels in proteomic datasets, which are often characterized by small sample sizes, high dimensionality, and significant noise. Furthermore, existing methods for predicting circadian phases from transcriptomic data typically rely on prior knowledge of known rhythmic genes, making them unsuitable for proteomic datasets. To address this gap, we developed a novel computational method using unsupervised deep learning techniques to predict circadian sample phases from proteomic data without requiring time labels or prior knowledge of proteins or genes. Our model involves a two-stage training process optimized for robust circadian phase prediction: an initial greedy one-layer-at-a-time pre-training which generates informative initial parameters followed by fine-tuning. During fine-tuning, a specialized loss function guides the model to align protein expression levels with circadian patterns, enabling it to accurately capture the underlying rhythmic structure within the data. We tested our method on both time-labeled and unlabeled proteomic data. For labeled data, we compared our predictions to the known time labels, achieving high accuracy, while for unlabeled human datasets, including postmortem brain regions and urine samples, we explored circadian disruptions. Notably, our analysis identified disruptions in rhythmic proteins between Alzheimer's disease and control subjects across these samples.
Related papers
- NovoBench: Benchmarking Deep Learning-based De Novo Peptide Sequencing Methods in Proteomics [58.03989832372747]
We present the first unified benchmark NovoBench for emphde novo peptide sequencing.
It comprises diverse mass spectrum data, integrated models, and comprehensive evaluation metrics.
Recent methods, including DeepNovo, PointNovo, Casanovo, InstaNovo, AdaNovo and $pi$-HelixNovo are integrated into our framework.
arXiv Detail & Related papers (2024-06-16T08:23:21Z) - Unbiased organism-agnostic and highly sensitive signal peptide predictor
with deep protein language model [12.37352652557512]
Signal peptide (SP) is a short peptide located in the N-terminus of proteins.
Here we present Unbiased Organism-agnostic Signal peptide Network (USPNet), a signal peptide classification and cleavage site prediction deep learning method.
We propose to apply label distribution-aware margin loss to handle data imbalance problems and use evolutionary information of protein to enrich representation.
arXiv Detail & Related papers (2023-12-14T14:32:48Z) - Binary Quantification and Dataset Shift: An Experimental Investigation [54.14283123210872]
Quantification is the supervised learning task that consists of training predictors of the class prevalence values of sets of unlabelled data.
The relationship between quantification and other types of dataset shift remains, by and large, unexplored.
We propose a fine-grained taxonomy of types of dataset shift, by establishing protocols for the generation of datasets affected by these types of shift.
arXiv Detail & Related papers (2023-10-06T20:11:27Z) - Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing [20.71342655519934]
This paper introduces TRACE-GPT, which stands for Time-seRies Anomaly-detection with Convolutional Embedding and Generative Pre-trained Transformers.
Our model has the highest F1 score at Equal Error Rate (EER) across all datasets and is only 0.026 below the supervised state-of-the-art baseline on the open dataset.
arXiv Detail & Related papers (2023-09-20T16:01:45Z) - Path Signatures for Seizure Forecasting [0.6282171844772422]
We consider the automated discovery of predictive features (or biomarkers) to forecast seizures in a patient-specific way.
We use the path signature, a recent development in the analysis of data streams, to map from measured time series to seizure prediction.
arXiv Detail & Related papers (2023-08-18T05:19:18Z) - Disruption Precursor Onset Time Study Based on Semi-supervised Anomaly
Detection [6.287037295927318]
We present a disruption prediction method based on anomaly detection that overcomes the drawbacks of unbalanced positive and negative data samples.
We optimize precursor labeling using the onset times inferred by the anomaly detection predictor and test the optimized labels on supervised learning disruption predictors.
arXiv Detail & Related papers (2023-03-27T07:54:56Z) - DriPP: Driven Point Processes to Model Stimuli Induced Patterns in M/EEG
Signals [62.997667081978825]
We develop a novel statistical point process model-called driven temporal point processes (DriPP)
We derive a fast and principled expectation-maximization (EM) algorithm to estimate the parameters of this model.
Results on standard MEG datasets demonstrate that our methodology reveals event-related neural responses.
arXiv Detail & Related papers (2021-12-08T13:07:21Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance of guided gradient descent (IGSGD) method to train inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Convolutional Neural Networks for Sleep Stage Scoring on a Two-Channel
EEG Signal [63.18666008322476]
Sleep problems are one of the major diseases all over the world.
Basic tool used by specialists is the Polysomnogram, which is a collection of different signals recorded during sleep.
Specialists have to score the different signals according to one of the standard guidelines.
arXiv Detail & Related papers (2021-03-30T09:59:56Z) - Bayesian neural network with pretrained protein embedding enhances
prediction accuracy of drug-protein interaction [3.499870393443268]
Deep learning approaches can predict drug-protein interactions without trial-and-error by humans.
We propose two methods to construct a deep learning framework that exhibits superior performance with a small labeled dataset.
arXiv Detail & Related papers (2020-12-15T10:24:34Z) - Detecting Parkinsonian Tremor from IMU Data Collected In-The-Wild using
Deep Multiple-Instance Learning [59.74684475991192]
Parkinson's Disease (PD) is a slowly evolving neuro-logical disease that affects about 1% of the population above 60 years old.
PD symptoms include tremor, rigidity and braykinesia.
We present a method for automatically identifying tremorous episodes related to PD, based on IMU signals captured via a smartphone device.
arXiv Detail & Related papers (2020-05-06T09:02:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.