Contrastive losses as generalized models of global epistasis
- URL: http://arxiv.org/abs/2305.03136v3
- Date: Fri, 1 Dec 2023 18:09:00 GMT
- Title: Contrastive losses as generalized models of global epistasis
- Authors: David H. Brookes, Jakub Otwinowski, and Sam Sinai
- Abstract summary: Fitness functions map large spaces of biological sequences to properties of interest.
We show that minimizing contrastive loss functions is a simple and flexible technique for extracting the sparse latent function implied by global epistasis.
We show that contrastive losses are able to accurately estimate a ranking function from limited data even in regimes where MSE is ineffective.
- Score: 0.5461938536945721
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fitness functions map large combinatorial spaces of biological sequences to
properties of interest. Inferring these multimodal functions from experimental
data is a central task in modern protein engineering. Global epistasis models
are an effective and physically-grounded class of models for estimating fitness
functions from observed data. These models assume that a sparse latent function
is transformed by a monotonic nonlinearity to emit measurable fitness. Here we
demonstrate that minimizing contrastive loss functions, such as the
Bradley-Terry loss, is a simple and flexible technique for extracting the
sparse latent function implied by global epistasis. We argue by way of a
fitness-epistasis uncertainty principle that the nonlinearities in global
epistasis models can produce observed fitness functions that do not admit
sparse representations, and thus may be inefficient to learn from observations
when using a Mean Squared Error (MSE) loss (a common practice). We show that
contrastive losses are able to accurately estimate a ranking function from
limited data even in regimes where MSE is ineffective. We validate the
practical utility of this insight by showing contrastive loss functions result
in consistently improved performance on benchmark tasks.
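As a concrete illustration of the abstract's central point, the following is a minimal sketch (not the authors' code) of the Bradley-Terry loss applied to fitness data: the model's scores are penalized only through how well they rank pairs of observed fitness values, so scores that recover the latent ranking can do well even when the observations have been warped by a monotonic nonlinearity. The function name, the toy exponential nonlinearity, and all variable names are assumptions made for this example.

```python
import numpy as np

def bradley_terry_loss(scores: np.ndarray, fitness: np.ndarray) -> float:
    """Average -log sigmoid(score_i - score_j) over all pairs with fitness_i > fitness_j."""
    score_diff = scores[:, None] - scores[None, :]    # f(x_i) - f(x_j) for every pair
    label_diff = fitness[:, None] - fitness[None, :]  # y_i - y_j for every pair
    winners = label_diff > 0                          # pairs the model should rank i above j
    # Numerically stable -log(sigmoid(x)) = log(1 + exp(-x))
    return float(np.logaddexp(0.0, -score_diff[winners]).mean())

# Toy data: observed fitness is a monotonic transform of a latent function, plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=100)                                   # latent function values
observed = np.exp(2.0 * latent) + 0.01 * rng.normal(size=100)   # monotonic nonlinearity + noise

# The latent values rank the observed fitness correctly, so they score well under the
# Bradley-Terry loss, while their MSE against the observed values is dominated by the
# scale of the nonlinearity.
print("Bradley-Terry loss:", bradley_terry_loss(latent, observed))
print("MSE of latent vs. observed:", float(np.mean((latent - observed) ** 2)))
```

Because the loss depends on the observations only through pairwise orderings, fitting a model with it targets the latent ranking function directly, whereas an MSE fit must also learn the nonlinearity that warps the observed values.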
Related papers
- Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z)
- On the Efficacy of Generalization Error Prediction Scoring Functions [33.24980750651318]
Generalization error predictors (GEPs) aim to predict model performance on unseen distributions by deriving dataset-level error estimates from sample-level scores.
We rigorously study the effectiveness of popular scoring functions (confidence, local manifold smoothness, model agreement) independent of mechanism choice.
arXiv Detail & Related papers (2023-03-23T18:08:44Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Using Focal Loss to Fight Shallow Heuristics: An Empirical Analysis of Modulated Cross-Entropy in Natural Language Inference [0.0]
In some datasets, deep neural networks discover underlying heuristics that allow them to take shortcuts in the learning process, resulting in poor generalization capability.
Instead of using standard cross-entropy, we explore whether a modulated version of cross-entropy called focal loss can constrain the model so that it does not rely on these heuristics and thereby improve generalization performance.
Our experiments in natural language inference show that focal loss has a regularizing impact on the learning process, increasing accuracy on out-of-distribution data but slightly decreasing performance on in-distribution data (a minimal focal-loss sketch appears after this list).
arXiv Detail & Related papers (2022-11-23T22:19:00Z)
- A Fair Loss Function for Network Pruning [93.0013343535411]
We introduce the performance weighted loss function, a simple modified cross-entropy loss function that can be used to limit the introduction of biases during pruning.
Experiments using biased classifiers for facial classification and skin-lesion classification tasks demonstrate that the proposed method is a simple and effective tool.
arXiv Detail & Related papers (2022-11-18T15:17:28Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently by the tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
- Non-parametric Models for Non-negative Functions [48.7576911714538]
We provide the first model for non-negative functions that retains the favorable properties of linear models.
We prove that it admits a representer theorem and provide an efficient dual formulation for convex problems.
arXiv Detail & Related papers (2020-07-08T07:17:28Z)
- The Real-World-Weight Cross-Entropy Loss Function: Modeling the Costs of Mislabeling [0.0]
We introduce the Real-World-Weight Cross-Entropy loss function, in both binary and single-label classification variants.
Both variants allow direct input of real world costs as weights.
For single-label, multicategory classification, our loss function also allows direct penalization of probabilistic false positives, weighted by label, during the training of a machine learning model.
arXiv Detail & Related papers (2020-01-03T08:54:42Z)
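Two of the entries above, the focal-loss paper and the real-world-weight cross-entropy paper, both describe modulations of standard cross-entropy. The sketch below illustrates the general form of each idea in the binary case; it follows the standard definitions and is an illustration under assumptions, not the referenced papers' implementations (function names, parameter names, and the toy inputs are invented for the example).

```python
import numpy as np

def focal_loss(p: np.ndarray, y: np.ndarray, gamma: float = 2.0) -> float:
    """Binary focal loss: -(1 - p_t)**gamma * log(p_t), averaged over examples.

    p : predicted probability of the positive class
    y : true labels in {0, 1}
    The (1 - p_t)**gamma factor down-weights examples the model already classifies
    confidently, which is the modulation of cross-entropy referred to above.
    """
    p_t = np.where(y == 1, p, 1.0 - p)  # probability assigned to the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + 1e-12)))

def cost_weighted_cross_entropy(p: np.ndarray, y: np.ndarray,
                                cost_fn: float = 1.0, cost_fp: float = 1.0) -> float:
    """Binary cross-entropy with real-world costs as weights (assumed form).

    cost_fn scales the penalty for missed positives, cost_fp for false alarms.
    """
    loss = -(cost_fn * y * np.log(p + 1e-12) +
             cost_fp * (1.0 - y) * np.log(1.0 - p + 1e-12))
    return float(loss.mean())

p = np.array([0.95, 0.60, 0.20, 0.85])  # predicted P(y = 1)
y = np.array([1, 1, 0, 0])
print(focal_loss(p, y))                                # easy examples contribute little
print(cost_weighted_cross_entropy(p, y, cost_fn=5.0))  # missed positives cost 5x more
```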
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.