Transform-Domain Classification of Human Cells based on DNA Methylation Datasets
- URL: http://arxiv.org/abs/1912.13167v1
- Date: Tue, 31 Dec 2019 04:18:11 GMT
- Title: Transform-Domain Classification of Human Cells based on DNA Methylation Datasets
- Authors: Xueyuan Zhao and Dario Pompili
- Abstract summary: A new pipeline is proposed that integrates the DNA methylation intensity measurements on all the CpG islands using the Walsh-Hadamard Transform (WHT).
The proposed method has broad applications in the expedited classification of diseased and normal human cells from epigenome and genome datasets.
- Score: 8.922553037367075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents a novel transform-domain method for classifying
human cells based on DNA methylation data. DNA methylation profiles vary in
human cells as disease progresses, and the proposed method uses this variation
to classify normal and diseased cells, including cancer cells. The cancer cell
types investigated in this work cover hepatocellular (sample size n = 40),
colorectal (n = 44), lung (n = 70) and endometrial (n = 87) cancer cells. A new
pipeline is proposed that integrates the DNA methylation intensity measurements
on all the CpG islands using the Walsh-Hadamard Transform (WHT). The study
reveals the three-step properties of the transform-domain DNA methylation data
and the association of the step values with the cell status. The proposed
machine learning pipeline is further assessed on the classification of normal
and cancer tissue cells. A number of machine learning classifiers are compared
for whole-sequence and WHT-sequence classification based on public
Whole-Genome Bisulfite Sequencing (WGBS) DNA methylation datasets. The
WHT-based method can reduce the computation time by more than one order of
magnitude compared with classification on the whole original sequence, while
maintaining comparable classification accuracy for the selected machine
learning classifiers. The proposed method has broad applications in the
expedited classification of diseased and normal human cells from epigenome and
genome datasets.
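To make the transform step concrete, here is a minimal Python sketch of a fast Walsh-Hadamard transform applied to a vector of methylation levels. It is an illustration under stated assumptions, not the authors' pipeline: the function names `fwht` and `wht_features`, the zero-padding to a power of two, and the choice to keep only the first `k` coefficients are all hypothetical, and the abstract does not specify how the transform-domain features are selected.

```python
import numpy as np


def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural (Hadamard) order."""
    a = np.asarray(x, dtype=float).copy()
    n = a.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: view the vector as blocks of two halves of length h,
        # then replace each pair of halves (u, v) with (u + v, u - v).
        blocks = a.reshape(-1, 2, h)
        u, v = blocks[:, 0, :], blocks[:, 1, :]
        a = np.stack((u + v, u - v), axis=1).reshape(n)
        h *= 2
    return a


def wht_features(methylation, k):
    """Zero-pad to a power of two, transform, and keep the first k coefficients.

    Both the zero-padding and the "first k coefficients" rule are assumptions
    made for illustration, not the paper's documented procedure.
    """
    m = np.asarray(methylation, dtype=float)
    n = 1 << (m.shape[0] - 1).bit_length()  # next power of two >= len(m)
    padded = np.zeros(n)
    padded[: m.shape[0]] = m
    return fwht(padded)[:k]


# Made-up methylation levels for five CpG islands (illustrative values only).
beta = np.array([0.9, 0.1, 0.8, 0.2, 0.7])
features = wht_features(beta, k=4)
```

A classifier trained on a short vector such as `features` has far fewer inputs than one trained on the full sequence, which is one plausible reading of where the reported speed-up comes from.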
Related papers
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking [1.6712896227173808]
FlowCyt is the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data.
The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers.
arXiv Detail & Related papers (2024-02-28T15:01:59Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- scBiGNN: Bilevel Graph Representation Learning for Cell Type Classification from Single-cell RNA Sequencing Data [62.87454293046843]
Graph neural networks (GNNs) have been widely used for automatic cell type classification.
scBiGNN comprises two GNN modules to identify cell types.
scBiGNN outperforms a variety of existing methods for cell type classification from scRNA-seq data.
arXiv Detail & Related papers (2023-12-16T03:54:26Z)
- Single-Cell Deep Clustering Method Assisted by Exogenous Gene Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Predicting Molecular Phenotypes with Single Cell RNA Sequencing Data: an Assessment of Unsupervised Machine Learning Models [0.0]
This study evaluates unsupervised machine learning for classifying treatment-resistant phenotypes in heterogeneous tumors.
scRNAseq quantifies mRNA in cells and characterizes cell phenotypes.
Clusters generated from this pipeline can be used to understand cancer cell behavior and malignant growth.
arXiv Detail & Related papers (2021-08-11T05:30:37Z)
- Active feature selection discovers minimal gene-sets for classifying cell-types and disease states in single-cell mRNA-seq data [2.578242050187029]
Single cell mRNA-seq costs currently prohibit the application of single cell mRNA-seq for many biological and clinical tasks of interest.
We introduce an active learning framework that constructs compressed gene sets that enable high accuracy classification of cell-types and physiological states.
The discovery of compact but highly informative gene sets might enable drastic reductions in sequencing requirements for applications of single-cell mRNA-seq.
arXiv Detail & Related papers (2021-06-15T17:49:26Z)
- A Deep Embedded Refined Clustering Approach for Breast Cancer Distinction based on DNA Methylation [0.0]
We propose a deep embedded refined clustering method for breast cancer differentiation based on DNA methylation.
The proposed approach is composed of two main stages. The first stage consists in the dimensionality reduction of the methylation data based on an autoencoder.
The second stage is a clustering algorithm based on the soft-assignment of the latent space provided by the autoencoder.
arXiv Detail & Related papers (2021-02-18T16:46:25Z)
- Cancer Gene Profiling through Unsupervised Discovery [49.28556294619424]
We introduce a novel, automatic and unsupervised framework to discover low-dimensional gene biomarkers.
Our method is based on the LP-Stability algorithm, a high dimensional center-based unsupervised clustering algorithm.
Our signature reports promising results on distinguishing immune inflammatory and immune desert tumors.
arXiv Detail & Related papers (2021-02-11T09:04:45Z)
- From Human Mesenchymal Stromal Cells to Osteosarcoma Cells Classification by Deep Learning [0.18143184797612422]
In this paper, we focus on osteosarcoma, one of the primary malignant bone tumors, which usually afflicts people in adolescence.
A DL approach is applied to discriminate human Mesenchymal Stromal Cells (MSCs) from osteosarcoma cells and to classify the different cell populations under investigation.
arXiv Detail & Related papers (2020-08-04T22:23:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.