T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to
Cancer Classification
- URL: http://arxiv.org/abs/2304.13145v2
- Date: Tue, 5 Sep 2023 21:08:04 GMT
- Title: T Cell Receptor Protein Sequences and Sparse Coding: A Novel Approach to
Cancer Classification
- Authors: Zahra Tayebi, Sarwan Ali, Prakash Chourasia, Taslim Murad and Murray
Patterson
- Abstract summary: T cell receptors (TCRs) are essential proteins for the adaptive immune system.
Recent advancements in sequencing technologies have enabled the comprehensive profiling of TCR repertoires.
This has led to the discovery of TCRs with potent anti-cancer activity and the development of TCR-based immunotherapies.
- Score: 4.824821328103934
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cancer is a complex disease characterized by uncontrolled cell growth and
proliferation. T cell receptors (TCRs) are essential proteins for the adaptive
immune system, and their specific recognition of antigens plays a crucial role
in the immune response against diseases, including cancer. The diversity and
specificity of TCRs make them ideal for targeting cancer cells, and recent
advancements in sequencing technologies have enabled the comprehensive
profiling of TCR repertoires. This has led to the discovery of TCRs with potent
anti-cancer activity and the development of TCR-based immunotherapies. In this
study, we investigate the use of sparse coding for the multi-class
classification of TCR protein sequences with cancer categories as target
labels. Sparse coding is a popular technique in machine learning that enables
the representation of data with a set of informative features and can capture
complex relationships between amino acids and identify subtle patterns in the
sequence that might be missed by low-dimensional methods. We first compute the
k-mers from the TCR sequences and then apply sparse coding to capture the
essential features of the data. To improve the predictive performance of the
final embeddings, we integrate domain knowledge regarding different types of
cancer properties. We then train different machine learning (linear and
non-linear) classifiers on the embeddings of TCR sequences for the purpose of
supervised analysis. Our proposed embedding method on a benchmark dataset of
TCR sequences significantly outperforms the baselines in terms of predictive
performance, achieving an accuracy of 99.8%. Our study highlights the
potential of sparse coding for the analysis of TCR protein sequences in cancer
research and other related fields.
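The k-mer step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a simple count vector over all 20^k amino-acid k-mers, stored as a sparse dictionary, as one possible sparse representation. The function names, the choice k=3, and the example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def kmers(sequence, k=3):
    """Return all overlapping substrings of length k, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_index(kmer):
    """Map a k-mer to its position in the 20**k vocabulary (base-20 number).

    Returns None if the k-mer contains a non-standard character.
    """
    idx = 0
    for ch in kmer:
        pos = AMINO_ACIDS.find(ch)
        if pos < 0:
            return None
        idx = idx * len(AMINO_ACIDS) + pos
    return idx


def sparse_kmer_embedding(sequence, k=3):
    """Return a sparse {vocabulary index: count} dict for one sequence.

    Assumption: a k-mer count vector stands in for the paper's sparse
    coding step, whose exact form is not given in the abstract.
    """
    embedding = {}
    for kmer in kmers(sequence, k):
        idx = kmer_index(kmer)
        if idx is not None:
            embedding[idx] = embedding.get(idx, 0) + 1
    return embedding


example = "CASSLGQAYEQYF"  # hypothetical CDR3-like sequence, for illustration only
embedding = sparse_kmer_embedding(example, k=3)
```

Only the k-mers that actually occur are stored, so a short sequence yields a handful of non-zero entries out of 8,000 possible 3-mers. Such vectors could then be passed to any standard classifier, which is the role the abstract assigns to the linear and non-linear models.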
Related papers
- DANCE: Deep Learning-Assisted Analysis of Protein Sequences Using Chaos Enhanced Kaleidoscopic Images [4.824821328103934]
Cancer is a complex disease characterized by uncontrolled cell growth.
T cell receptors (TCRs) play a key role in recognizing antigens, including those associated with cancer.
Recent advancements in sequencing technologies have facilitated comprehensive profiling of TCR repertoires.
arXiv Detail & Related papers (2024-09-10T17:55:59Z)
- Pan-cancer gene set discovery via scRNA-seq for optimal deep learning based downstream tasks [6.869831177092736]
We analyzed scRNA-seq data from 181 tumor biopsies across 13 cancer types.
High-dimensional weighted gene co-expression network analysis (hdWGCNA) was performed to identify relevant gene sets.
Oncogenes from OncoKB were evaluated with deep learning models, including multilayer perceptrons (MLPs) and graph neural networks (GNNs).
arXiv Detail & Related papers (2024-08-13T23:24:36Z)
- TCR-GPT: Integrating Autoregressive Model and Reinforcement Learning for T-Cell Receptor Repertoires Generation [6.920411338236452]
T-cell receptors (TCRs) play a crucial role in the immune system by recognizing and binding to specific antigens presented by infected or cancerous cells.
Language models, such as auto-regressive transformers, offer a powerful solution by learning the probability distributions of TCR repertoires.
We introduce TCR-GPT, a probabilistic model built on a decoder-only transformer architecture, designed to uncover and replicate sequence patterns in TCR repertoires.
arXiv Detail & Related papers (2024-08-02T10:16:28Z)
- Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review [77.34726150561087]
Cancer is the second major cause of death after cardiovascular diseases.
Gene expression can play a fundamental role in the early detection of cancer.
This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods.
arXiv Detail & Related papers (2023-01-28T15:03:03Z)
- Enhancing Clinical Support for Breast Cancer with Deep Learning Models using Synthetic Correlated Diffusion Imaging [66.63200823918429]
We investigate enhancing clinical support for breast cancer with deep learning models.
We leverage a volumetric convolutional neural network to learn deep radiomic features from a pre-treatment cohort.
We find that the proposed approach can achieve better performance for both grade and post-treatment response prediction.
arXiv Detail & Related papers (2022-11-10T03:02:12Z)
- Exploiting segmentation labels and representation learning to forecast therapy response of PDAC patients [60.78505216352878]
We propose a hybrid deep neural network pipeline to predict tumour response to initial chemotherapy.
We leverage a combination of representation transfer from segmentation to classification, as well as localisation and representation learning.
Our approach yields a remarkably data-efficient method able to predict treatment response with a ROC-AUC of 63.7% using only 477 datasets in total.
arXiv Detail & Related papers (2022-11-08T11:50:31Z)
- Multiple Instance Neural Networks Based on Sparse Attention for Cancer Detection using T-cell Receptor Sequences [10.199698726118003]
We propose multiple instance neural networks based on sparse attention (MINN-SA) to enhance the performance in cancer detection and explainability.
MINN-SA yields the highest area under the ROC curve (AUC) scores on average measured across 10 different types of cancers.
arXiv Detail & Related papers (2022-08-09T03:24:03Z)
- TCR: A Transformer Based Deep Network for Predicting Cancer Drugs Response [12.86640026993276]
We propose a Transformer based network for Cancer drug Response (TCR) to predict anti-cancer drug response.
By utilizing an attention mechanism, TCR is able to learn the interactions between drug atom/sub-structure and molecular signatures efficiently.
Our study highlights the prediction power of TCR and its potential value for cancer drug repurpose and precision oncology treatment.
arXiv Detail & Related papers (2022-07-10T13:01:54Z)
- Lung Cancer Lesion Detection in Histopathology Images Using Graph-Based Sparse PCA Network [93.22587316229954]
We propose a graph-based sparse principal component analysis (GS-PCA) network for automated detection of cancerous lesions on histological lung slides stained by hematoxylin and eosin (H&E).
We evaluate the performance of the proposed algorithm on H&E slides obtained from an SVM K-rasG12D lung cancer mouse model using precision/recall rates, F-score, Tanimoto coefficient, and area under the curve (AUC) of the receiver operator characteristic (ROC).
arXiv Detail & Related papers (2021-10-27T19:28:36Z)
- Topological Data Analysis of copy number alterations in cancer [70.85487611525896]
We explore the potential to capture information contained in cancer genomic information using a novel topology-based approach.
We find that this technique has the potential to extract meaningful low-dimensional representations in cancer somatic genetic data.
arXiv Detail & Related papers (2020-11-22T17:31:23Z)
- A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning [49.86828302591469]
We train >35,000 neural network models, sweeping over common featurization techniques.
We found the RNA-seq to be highly redundant and informative even with subsets larger than 128 features.
arXiv Detail & Related papers (2020-04-30T20:42:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.