AFP-SRC: Identification of Antifreeze Proteins Using Sparse
Representation Classifier
- URL: http://arxiv.org/abs/2009.05277v3
- Date: Fri, 24 Sep 2021 11:59:33 GMT
- Title: AFP-SRC: Identification of Antifreeze Proteins Using Sparse
Representation Classifier
- Authors: Shujaat Khan, Muhammad Usman, Abdul Wahab
- Abstract summary: Species living in extremely cold environments fight against the harsh conditions using antifreeze proteins (AFPs).
We propose a computational framework for the prediction of AFPs based on a sample-specific classification method using sparse reconstruction.
The proposed method outperforms contemporary approaches in terms of balanced accuracy and Youden's index.
- Score: 5.285065659030821
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Species living in extremely cold environments fight against the harsh
conditions using antifreeze proteins (AFPs), which manipulate the freezing
mechanism of water in more than one way. This remarkable property of AFPs is
extremely useful in several industrial and medical applications. The lack
of similarity in their structure and sequence makes their prediction an arduous
task and identifying them experimentally in the wet-lab is time-consuming and
expensive. In this research, we propose a computational framework for the
prediction of AFPs based on a sample-specific classification method using
sparse reconstruction. A linear model and an
over-complete dictionary matrix of known AFPs are used to predict a sparse
class-label vector that provides a sample-association score. The delta rule is
applied to reconstruct two pseudo-samples from the lower and upper parts of the
sample-association vector, and class labels are assigned based on the minimum
recovery score. We compare our approach with contemporary methods on a standard
dataset, and the proposed method outperforms them in terms of balanced accuracy
and Youden's index. The MATLAB implementation of the proposed
method is available at the author's GitHub page
(https://github.com/Shujaat123/AFP-SRC).
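The sparse-representation classification described in the abstract can be sketched in a few lines of NumPy. This is a hypothetical re-implementation for illustration, not the authors' MATLAB code: the feature extraction is omitted, the sparse solver here is a simple Orthogonal Matching Pursuit, and the per-class recovery scoring is a simplified stand-in for the paper's delta-rule pseudo-sample reconstruction.

```python
import numpy as np

def src_classify(y, D, labels, n_nonzero=5):
    """Assign a class to query feature vector `y` by sparse reconstruction
    over dictionary `D` (one unit-norm training sample per column).
    `labels[i]` is the class of column i. Illustrative sketch only."""
    residual, support, coef = y.astype(float).copy(), [], np.empty(0)
    for _ in range(n_nonzero):
        # Greedily add the atom most correlated with the current residual.
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])   # sparse sample-association vector
    x[support] = coef
    # Reconstruct one pseudo-sample per class from that class's coefficients
    # and assign the label with the smallest recovery (reconstruction) error.
    labels = np.asarray(labels)
    recovery = {c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                for c in sorted(set(labels))}
    return min(recovery, key=recovery.get)
```

For the two-class AFP problem, the dictionary columns would be feature vectors of known AFP and non-AFP proteins, and a query protein is assigned to whichever class reconstructs it with the smaller residual.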
Related papers
- Sequential Testing for Descriptor-Agnostic LiDAR Loop Closure in Repetitive Environments [12.304166871828777]
We propose a multi-frame loop closure verification method that formulates LiDAR loop closure as a truncated Sequential Probability Ratio Test (SPRT). Instead of deciding from a single descriptor comparison or using fixed thresholds with late-stage Iterative Closest Point (ICP) vetting, the verifier accumulates a short temporal stream of descriptor similarities between a query and each candidate. This precision-first policy is designed to suppress false positives in structurally repetitive indoor environments.
arXiv Detail & Related papers (2025-12-10T09:20:09Z)
- Contrastive Learning for Semi-Supervised Deep Regression with Generalized Ordinal Rankings from Spectral Seriation [18.192043514568187]
We extend contrastive regression methods to allow unlabeled data to be used in the semi-supervised setting. Our method can surpass existing state-of-the-art semi-supervised deep regression methods.
arXiv Detail & Related papers (2025-12-10T02:45:23Z)
- DistDF: Time-Series Forecasting Needs Joint-Distribution Wasserstein Alignment [92.70019102733453]
Training time-series forecast models requires aligning the conditional distribution of model forecasts with that of the label sequence. We propose DistDF, which achieves alignment by alternatively minimizing a discrepancy between the conditional forecast and label distributions.
arXiv Detail & Related papers (2025-10-28T16:09:59Z)
- Post-Hoc Split-Point Self-Consistency Verification for Efficient, Unified Quantification of Aleatoric and Epistemic Uncertainty in Deep Learning [5.996056764788456]
Uncertainty quantification (UQ) is vital for trustworthy deep learning, yet existing methods are either computationally intensive or provide only partial, task-specific estimates. We propose a post-hoc single-forward-pass framework that jointly captures aleatoric and epistemic uncertainty without modifying or retraining pretrained models. Our method applies Split-Point Analysis (SPA) to decompose predictive residuals into upper and lower subsets, computing Mean Absolute Residuals (MARs) on each side.
arXiv Detail & Related papers (2025-09-16T17:16:01Z)
- Enhancing Variable Selection in Large-scale Logistic Regression: Leveraging Manual Labeling with Beneficial Noise [1.1477123412184609]
In large-scale supervised learning, penalized logistic regression (PLR) effectively addresses the overfitting problem by introducing regularization terms.
This paper theoretically demonstrates that label noise stemming from manual labeling, which is solely related to classification difficulty, represents a type of beneficial noise for variable selection in PLR.
Experimental results indicate that, as compared with traditional variable selection classification techniques, the PLR with manually-labeled noisy data achieves higher estimation and classification accuracy across multiple large-scale datasets.
arXiv Detail & Related papers (2025-04-23T10:05:54Z)
- Feynman-Kac Correctors in Diffusion: Annealing, Guidance, and Product of Experts [64.34482582690927]
We provide an efficient and principled method for sampling from a sequence of annealed, geometric-averaged, or product distributions derived from pretrained score-based models.
We propose Sequential Monte Carlo (SMC) resampling algorithms that leverage inference-time scaling to improve sampling quality.
arXiv Detail & Related papers (2025-03-04T17:46:51Z)
- Decoupled Prototype Learning for Reliable Test-Time Adaptation [50.779896759106784]
Test-time adaptation (TTA) is a task that continually adapts a pre-trained source model to the target domain during inference.
One popular approach involves fine-tuning model with cross-entropy loss according to estimated pseudo-labels.
This study reveals that minimizing the classification error of each sample causes the cross-entropy loss's vulnerability to label noise.
We propose a novel Decoupled Prototype Learning (DPL) method that features prototype-centric loss computation.
arXiv Detail & Related papers (2024-01-15T03:33:39Z)
- SUnAA: Sparse Unmixing using Archetypal Analysis [62.997667081978825]
This paper introduces a new sparse unmixing technique using archetypal analysis (SUnAA).
First, we design a new model based on archetypal analysis.
arXiv Detail & Related papers (2023-08-09T07:58:33Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- False membership rate control in mixture models [1.387448620257867]
A clustering task consists in partitioning elements of a sample into homogeneous groups.
In the supervised setting, this approach is well known and referred to as classification with an abstention option.
In this paper the approach is revisited in an unsupervised mixture model framework and the purpose is to develop a method that comes with the guarantee that the false membership rate does not exceed a pre-defined nominal level.
arXiv Detail & Related papers (2022-03-04T22:37:59Z)
- Self-Certifying Classification by Linearized Deep Assignment [65.0100925582087]
We propose a novel class of deep predictors for classifying metric data on graphs within PAC-Bayes risk certification paradigm.
Building on the recent PAC-Bayes literature and data-dependent priors, this approach enables learning posterior distributions on the hypothesis space.
arXiv Detail & Related papers (2022-01-26T19:59:14Z)
- Bayes in Wonderland! Predictive supervised classification inference hits unpredictability [1.8814209805277506]
We show the convergence of the sBpc and mBpc under de Finetti type of exchangeability.
We also provide a parameter estimation of the generative model giving rise to the partition exchangeable sequence.
arXiv Detail & Related papers (2021-12-03T12:34:52Z)
- CARMS: Categorical-Antithetic-REINFORCE Multi-Sample Gradient Estimator [60.799183326613395]
We propose an unbiased estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples.
CARMS combines REINFORCE with copula based sampling to avoid duplicate samples and reduce its variance, while keeping the estimator unbiased using importance sampling.
We evaluate CARMS on several benchmark datasets on a generative modeling task, as well as a structured output prediction task, and find it to outperform competing methods including a strong self-control baseline.
arXiv Detail & Related papers (2021-10-26T20:14:30Z)
- AdaPT-GMM: Powerful and robust covariate-assisted multiple testing [0.7614628596146599]
We propose a new empirical Bayes method for covariate-assisted multiple testing with false discovery rate (FDR) control.
Our method refines the adaptive p-value thresholding (AdaPT) procedure by generalizing its masking scheme.
We show in extensive simulations and real data examples that our new method, which we call AdaPT-GMM, consistently delivers high power.
arXiv Detail & Related papers (2021-06-30T05:06:18Z)
- Semi-Supervised Speech Recognition via Graph-based Temporal Classification [59.58318952000571]
Semi-supervised learning has demonstrated promising results in automatic speech recognition by self-training.
The effectiveness of this approach largely relies on the pseudo-label accuracy.
Alternative ASR hypotheses of an N-best list can provide more accurate labels for an unlabeled speech utterance.
arXiv Detail & Related papers (2020-10-29T14:56:56Z)
- A Compressive Classification Framework for High-Dimensional Data [12.284934135116515]
We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size.
The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis.
It has the ability to select significant features by using joint-sparsity promoting hard thresholding in the discriminant rule.
arXiv Detail & Related papers (2020-05-09T06:55:00Z)
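The joint-sparsity promoting hard thresholding mentioned in the last entry can be illustrated generically. The sketch below is an assumption-laden stand-in, not the CRDA authors' code: it ranks features by the row-wise L2 norm of a coefficient matrix (one row per feature, one column per class) and zeroes out all but the k strongest rows, so a feature is kept or dropped jointly across all classes.

```python
import numpy as np

def joint_hard_threshold(B, k):
    """Keep the k rows (features) of coefficient matrix B with the largest
    row-wise L2 norm and zero out the rest. Generic joint-sparsity
    hard-thresholding sketch for illustration only."""
    row_norms = np.linalg.norm(B, axis=1)   # feature strength across all classes
    keep = np.argsort(row_norms)[-k:]       # indices of the k strongest features
    out = np.zeros_like(B)
    out[keep] = B[keep]
    return out
```

Because the decision is made per row rather than per entry, a selected feature contributes to every class's discriminant, which is what "joint" sparsity refers to here.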
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.