Related papers: Similarity-Distance-Magnitude Activations

Similarity-Distance-Magnitude Activations

URL: http://arxiv.org/abs/2509.12760v2
Date: Thu, 30 Oct 2025 06:08:58 GMT
Title: Similarity-Distance-Magnitude Activations
Authors: Allen Schmaltz,
Abstract summary: We introduce the Similarity-Distance-Magnitude activation function, a more robust and interpretable formulation of the standard softmax activation function.<n>We also introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation function, adding Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness to the existing output Magnitude (i.e., decision-boundary) awareness, and enabling interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to co-variate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.

Related papers

Similarity-Distance-Magnitude Language Models [0.0]
We introduce Similarity-Distance-Magnitude (SDM) language models (LMs)<n>LMs are sequence prediction models fine-tuned to maximize the proportion of generations in the well-calibrated, high-probability region partitioned by a final-layer SDM activation layer used for binary classification of instruction-following.
arXiv Detail & Related papers (2025-10-30T06:42:15Z)
Similarity-Distance-Magnitude Universal Verification [0.0]
We build SDM networks with uncertainty-aware verification and interpretability-by-exemplar as intrinsic properties.<n>We provide open-source software implementing these results.
arXiv Detail & Related papers (2025-02-27T15:05:00Z)
Post-hoc Probabilistic Vision-Language Models [51.12284891724463]
Vision-language models (VLMs) have found remarkable success in classification, retrieval, and generative tasks.<n>We propose post-hoc uncertainty estimation in VLMs that does not require additional training.<n>Our results show promise for safety-critical applications of large-scale models.
arXiv Detail & Related papers (2024-12-08T18:16:13Z)
Efficient Distribution Matching of Representations via Noise-Injected Deep InfoMax [73.03684002513218]
We enhance Deep InfoMax (DIM) to enable automatic matching of learned representations to a selected prior distribution.<n>We show that such modification allows for learning uniformly and normally distributed representations.<n>The results indicate a moderate trade-off between the performance on the downstream tasks and quality of DM.
arXiv Detail & Related papers (2024-10-09T15:40:04Z)
Downstream-Pretext Domain Knowledge Traceback for Active Learning [138.02530777915362]
We propose a downstream-pretext domain knowledge traceback (DOKT) method that traces the data interactions of downstream knowledge and pre-training guidance. DOKT consists of a traceback diversity indicator and a domain-based uncertainty estimator. Experiments conducted on ten datasets show that our model outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-20T01:34:13Z)
Towards Robust and Interpretable EMG-based Hand Gesture Recognition using Deep Metric Meta Learning [37.21211404608413]
We propose a shift to deep metric-based meta-learning in EMG PR to supervise the creation of meaningful and interpretable representations. We derive a robust class proximity-based confidence estimator that leads to a better rejection of incorrect decisions.
arXiv Detail & Related papers (2024-04-17T23:37:50Z)
PALM: Pushing Adaptive Learning Rate Mechanisms for Continual Test-Time Adaptation [6.181548939188321]
Real-world vision models in dynamic environments face rapid shifts in domain distributions, leading to decreased recognition performance.<n>We propose continuous test-time adaptation (CTTA) to adjust a pre-trained source discriminative model to these changing domains.<n>We conduct extensive image classification experiments on CIFAR-10C, CIFAR-100C, and ImageNet-C, demonstrating the superior efficacy of our method compared to prior approaches.
arXiv Detail & Related papers (2024-03-15T19:35:10Z)
Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data. One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is. This paper proposes the it least disagree metric (LDM) as the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders. Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency. We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z)
Adaptive Dimension Reduction and Variational Inference for Transductive Few-Shot Classification [2.922007656878633]
We propose a new clustering method based on Variational Bayesian inference, further improved by Adaptive Dimension Reduction. Our proposed method significantly improves accuracy in the realistic unbalanced transductive setting on various Few-Shot benchmarks.
arXiv Detail & Related papers (2022-09-18T10:29:02Z)
Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserve relative distances between data instances in the feature spaces of each layer. With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices. We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
MCDAL: Maximum Classifier Discrepancy for Active Learning [74.73133545019877]
Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition. We propose in this paper a novel active learning framework that we call Maximum Discrepancy for Active Learning (MCDAL) In particular, we utilize two auxiliary classification layers that learn tighter decision boundaries by maximizing the discrepancies among them.
arXiv Detail & Related papers (2021-07-23T06:57:08Z)
Learning Calibrated Uncertainties for Domain Shift: A Distributionally Robust Learning Approach [150.8920602230832]
We propose a framework for learning calibrated uncertainties under domain shifts. In particular, the density ratio estimation reflects the closeness of a target (test) sample to the source (training) distribution. We show that our proposed method generates calibrated uncertainties that benefit downstream tasks.
arXiv Detail & Related papers (2020-10-08T02:10:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.