Diversity by Design: Addressing Mode Collapse Improves scRNA-seq Perturbation Modeling on Well-Calibrated Metrics
- URL: http://arxiv.org/abs/2506.22641v1
- Date: Fri, 27 Jun 2025 21:12:46 GMT
- Title: Diversity by Design: Addressing Mode Collapse Improves scRNA-seq Perturbation Modeling on Well-Calibrated Metrics
- Authors: Gabriel M. Mejia, Henry E. Miller, Francis J. A. Leblanc, Bo Wang, Brendan Swain, Lucas Paulo de Lima Camillo
- Abstract summary: Recent benchmarks reveal that models for single-cell perturbation response are often outperformed by simply predicting the dataset mean. We trace this anomaly to a metric artifact: control-referenced deltas and unweighted error metrics reward mode collapse whenever the control is biased or the biological signal is sparse. We introduce differentially expressed gene (DEG)-aware metrics, weighted mean-squared error (WMSE) and weighted delta $R^{2}$ ($R^{2}_{w}(\Delta)$) with respect to all perturbations, that measure error in niche signals with high sensitivity.
- Score: 1.756284691863048
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent benchmarks reveal that models for single-cell perturbation response are often outperformed by simply predicting the dataset mean. We trace this anomaly to a metric artifact: control-referenced deltas and unweighted error metrics reward mode collapse whenever the control is biased or the biological signal is sparse. Large-scale \textit{in silico} simulations and analysis of two real-world perturbation datasets confirm that shared reference shifts, not genuine biological change, drives high performance in these evaluations. We introduce differentially expressed gene (DEG)-aware metrics, weighted mean-squared error (WMSE) and weighted delta $R^{2}$ ($R^{2}_{w}(\Delta)$) with respect to all perturbations, that measure error in niche signals with high sensitivity. We further introduce negative and positive performance baselines to calibrate these metrics. With these improvements, the mean baseline sinks to null performance while genuine predictors are correctly rewarded. Finally, we show that using WMSE as a loss function reduces mode collapse and improves model performance.
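To make the metric idea concrete, below is a minimal NumPy sketch of DEG-weighted error metrics in the spirit of the abstract. The exact weighting scheme, normalization, and aggregation used by the authors may differ; the function names and the choice of per-gene weights here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def wmse(y_true, y_pred, gene_weights):
    """DEG-aware weighted MSE (sketch): per-gene weights emphasize
    differentially expressed genes, so collapsing to the dataset mean
    no longer scores well when the biological signal is sparse.
    gene_weights might be derived from |log fold change| or
    -log10(adjusted p-value); that choice is an assumption here."""
    w = gene_weights / gene_weights.sum()                 # normalize weights to sum to 1
    return float(np.mean(np.sum(w * (y_true - y_pred) ** 2, axis=-1)))

def weighted_delta_r2(y_true, y_pred, control, gene_weights):
    """Weighted delta R^2 (sketch): compute control-referenced deltas,
    then an R^2 in which both residual and total sums of squares are
    gene-weighted, so error on DEGs dominates the score."""
    w = gene_weights / gene_weights.sum()
    d_true = y_true - control                             # observed perturbation effect
    d_pred = y_pred - control                             # predicted perturbation effect
    mean_d = np.sum(w * d_true, axis=-1, keepdims=True)   # weighted mean delta per perturbation
    ss_res = np.sum(w * (d_true - d_pred) ** 2, axis=-1)
    ss_tot = np.sum(w * (d_true - mean_d) ** 2, axis=-1)
    return float(1.0 - np.mean(ss_res / ss_tot))
```

Under a construction like this, a mean-baseline predictor that ignores perturbation identity earns little or no credit, while a model that captures DEG-level changes is rewarded; the same weighted squared-error term can also be used as a training loss, which the paper reports reduces mode collapse.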
Related papers
- Strengthening Anomaly Awareness [0.0]
We present a refined version of the Anomaly Awareness framework for enhancing unsupervised anomaly detection.<n>Our approach introduces minimal supervision into Variational Autoencoders (VAEs) through a two-stage training strategy.
arXiv Detail & Related papers (2025-04-15T16:52:22Z)
- Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [41.50816120270017]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator incorporating the loss and sampling procedure through a method we call Orbit Diffusion.
arXiv Detail & Related papers (2025-02-14T03:26:57Z)
- Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods [11.895321856533934]
Stochastic gradient descent (SGD) and adaptive gradient methods have been widely used in training deep neural networks.
We empirically show that while the difference between the standard generalization performance of models trained using these methods is small, those trained using SGD exhibit far greater robustness under input perturbations.
arXiv Detail & Related papers (2023-08-13T07:03:22Z)
- Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z)
- Global Convergence of Sub-gradient Method for Robust Matrix Recovery: Small Initialization, Noisy Measurements, and Over-parameterization [4.7464518249313805]
The sub-gradient method (SubGM) is used to recover a low-rank matrix from a limited number of measurements.
We show that SubGM converges to the true solution, even under arbitrarily large and arbitrarily dense noise values.
arXiv Detail & Related papers (2022-02-17T17:50:04Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- OMASGAN: Out-of-Distribution Minimum Anomaly Score GAN for Sample Generation on the Boundary [0.0]
Generative models assign high likelihood and low reconstruction loss to Out-of-Distribution (OoD) samples.
OMASGAN generates, in a negative data augmentation manner, anomalous samples on the estimated distribution boundary.
OMASGAN performs retraining by including the abnormal minimum-anomaly-score OoD samples generated on the distribution boundary.
arXiv Detail & Related papers (2021-10-28T16:35:30Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Regression Bugs Are In Your Model! Measuring, Reducing and Analyzing Regressions In NLP Model Updates [68.09049111171862]
This work focuses on quantifying, reducing and analyzing regression errors in NLP model updates.
We formulate the regression-free model updates into a constrained optimization problem.
We empirically analyze how model ensemble reduces regression.
arXiv Detail & Related papers (2021-05-07T03:33:00Z)
- Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We evaluate prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
- When Does Preconditioning Help or Hurt Generalization? [74.25170084614098]
We show how the implicit bias of first- and second-order methods affects the comparison of generalization properties.
We discuss several approaches to manage the bias-variance tradeoff, and the potential benefit of interpolating between GD and NGD.
arXiv Detail & Related papers (2020-06-18T17:57:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.