How Scale Breaks "Normalized Stress" and KL Divergence: Rethinking Quality Metrics
- URL: http://arxiv.org/abs/2510.08660v1
- Date: Thu, 09 Oct 2025 13:11:31 GMT
- Title: How Scale Breaks "Normalized Stress" and KL Divergence: Rethinking Quality Metrics
- Authors: Kiran Smelser, Kaviru Gunaratne, Jacob Miller, Stephen Kobourov
- Abstract summary: Researchers often use quality metrics to measure the accuracy of two-dimensional scatter plots. One of the most commonly employed metrics, normalized stress, is sensitive to uniform scaling (stretching, shrinking) of the projection. We show just how much the values change and how this affects dimension reduction technique evaluations.
- Score: 0.20999222360659606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Complex, high-dimensional data is ubiquitous across many scientific disciplines, including machine learning, biology, and the social sciences. One of the primary methods of visualizing these datasets is with two-dimensional scatter plots that visually capture some properties of the data. Because visually determining the accuracy of these plots is challenging, researchers often use quality metrics to measure the projection's accuracy and faithfulness to the original data. One of the most commonly employed metrics, normalized stress, is sensitive to uniform scaling (stretching, shrinking) of the projection, despite this act not meaningfully changing anything about the projection. Another quality metric, the Kullback--Leibler (KL) divergence used in the popular t-Distributed Stochastic Neighbor Embedding (t-SNE) technique, is also susceptible to this scale sensitivity. We investigate the effect of scaling on stress and KL divergence analytically and empirically by showing just how much the values change and how this affects dimension reduction technique evaluations. We introduce a simple technique to make both metrics scale-invariant and show that it accurately captures expected behavior on a small benchmark.
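To make the scale sensitivity concrete, here is a minimal Python sketch (not code from the paper) that evaluates a textbook normalized-stress formula and a simplified t-SNE-style KL divergence on the same projection at several uniform scales. The function names, the toy data, and the single fixed Gaussian bandwidth (rather than a perplexity-calibrated one) are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import pdist

def normalized_stress(high_d, low_d):
    """Normalized stress: squared residuals between the pairwise distances
    of the original data and of its 2D projection, divided by the sum of
    squared original distances."""
    D = pdist(high_d)   # pairwise distances in the original space
    E = pdist(low_d)    # pairwise distances in the projection
    return np.sum((D - E) ** 2) / np.sum(D ** 2)

def tsne_style_kl(high_d, low_d, sigma=1.0):
    """KL divergence between Gaussian similarities P on the original data and
    Student-t similarities Q on the projection (a simplified stand-in for the
    t-SNE objective, using one fixed bandwidth instead of perplexity tuning)."""
    P = np.exp(-pdist(high_d) ** 2 / (2 * sigma ** 2))
    P /= P.sum()
    Q = 1.0 / (1.0 + pdist(low_d) ** 2)
    Q /= Q.sum()
    return np.sum(P * np.log(P / Q))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))   # toy high-dimensional data
Y = X[:, :2]                     # toy "projection": first two coordinates

# Uniformly scaling the projection changes nothing visually, yet both
# metric values move with the scale factor.
for scale in (0.1, 1.0, 10.0):
    print(f"scale={scale:>4}: stress={normalized_stress(X, scale * Y):.4f}, "
          f"KL={tsne_style_kl(X, scale * Y):.4f}")
```

Because both the stress residuals and the Student-t similarities depend on absolute embedding distances, neither value is constant across the three scales, even though the scatter plot itself is unchanged up to zoom.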
Related papers
- Generalization Below the Edge of Stability: The Role of Data Geometry [60.147710896851045]
We show how data geometry controls generalization in ReLU networks trained below the edge of stability. For data distributions supported on a mixture of low-dimensional balls, we derive generalization bounds that provably adapt to the intrinsic dimension. Our results consolidate disparate empirical findings that have appeared in the literature.
arXiv Detail & Related papers (2025-10-20T21:40:36Z) - A statistical theory of overfitting for imbalanced classification [0.6144680854063939]
We develop a statistical theory for high-dimensional imbalanced classification. We find that dimensionality induces truncation or skewing effects on the logit distribution. This phenomenon explains why the minority class is more severely affected by overfitting.
arXiv Detail & Related papers (2025-02-17T00:21:33Z) - Network scaling and scale-driven loss balancing for intelligent poroelastography [2.665036498336221]
A deep learning framework is developed for multiscale characterization of poroelastic media from full waveform data.
Two major challenges impede direct application of existing state-of-the-art techniques for this purpose.
We propose the idea of network scaling, where the neural property maps are constructed by unit shape functions composed into a scaling layer.
arXiv Detail & Related papers (2024-10-27T23:06:29Z) - The Star Geometry of Critic-Based Regularizer Learning [2.2530496464901106]
Variational regularization is a technique to solve statistical inference tasks and inverse problems.
Recent works learn task-dependent regularizers by integrating information about the measurements and ground-truth data.
There is little theory about the structure of regularizers learned via this process and how it relates to the two data distributions.
arXiv Detail & Related papers (2024-08-29T18:34:59Z) - "Normalized Stress" is Not Normalized: How to Interpret Stress Correctly [0.4915744683251151]
Stress is among the most commonly employed quality metrics and optimization criteria for dimension reduction projections of high dimensional data.
One of the most commonly employed metrics, normalized stress, is sensitive to uniform scaling of the projection, despite this act not meaningfully changing anything about the projection.
We introduce a simple technique to make normalized stress scale invariant and show that it accurately captures expected behavior on a small benchmark (one possible realization of such a fix is sketched after this list).
arXiv Detail & Related papers (2024-08-14T13:42:47Z) - Geo-Localization Based on Dynamically Weighted Factor-Graph [74.75763142610717]
Feature-based geo-localization relies on associating features extracted from aerial imagery with those detected by the vehicle's sensors.
This requires that the type of landmarks must be observable from both sources.
We present a dynamically weighted factor graph model for the vehicle's trajectory estimation.
arXiv Detail & Related papers (2023-11-13T12:44:14Z) - Gradient-Based Feature Learning under Structured Data [57.76552698981579]
In the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction.
We show that appropriate weight normalization that is reminiscent of batch normalization can alleviate this issue.
In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent.
arXiv Detail & Related papers (2023-09-07T16:55:50Z) - An evaluation framework for dimensionality reduction through sectional curvature [59.40521061783166]
In this work, we aim to introduce the first highly non-supervised dimensionality reduction performance metric.
To test its feasibility, this metric has been used to evaluate the performance of the most commonly used dimension reduction algorithms.
A new parameterized problem instance generator has been constructed in the form of a function generator.
arXiv Detail & Related papers (2023-03-17T11:59:33Z) - Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z) - A Geometric Perspective towards Neural Calibration via Sensitivity Decomposition [31.557715381838147]
It is well known that vision classification models suffer from poor calibration in the face of data distribution shifts.
We propose Geometric Sensitivity Decomposition (GSD) which decomposes the norm of a sample feature embedding into an instance-dependent and an instance-independent component.
Inspired by the decomposition, we analytically derive a simple extension to current softmax-linear models, which learns to disentangle the two components during training.
arXiv Detail & Related papers (2021-10-27T16:46:41Z) - Learning High-Precision Bounding Box for Rotated Object Detection via Kullback-Leibler Divergence [100.6913091147422]
Existing rotated object detectors are mostly inherited from the horizontal detection paradigm.
In this paper, we are motivated to change the design of rotation regression loss from induction paradigm to deduction methodology.
arXiv Detail & Related papers (2021-06-03T14:29:19Z)
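As a companion to the scale-sensitivity demo after the abstract above, the sketch below illustrates one way a scale-invariant variant of normalized stress can be obtained, as referenced from the '"Normalized Stress" is Not Normalized' entry in this list: treat the uniform scale as a free parameter and evaluate stress at its least-squares optimum, which has a closed form. The function name and this particular formulation are assumptions for illustration, not necessarily the exact construction used in the papers.

```python
import numpy as np
from scipy.spatial.distance import pdist

def scale_invariant_stress(high_d, low_d):
    """Normalized stress minimized over uniform scalings of the projection.

    The optimal scale alpha* = sum(D*E) / sum(E*E) is the least-squares
    minimizer of sum((D - alpha * E)^2); evaluating stress there removes
    the dependence on how the projection happens to be scaled."""
    D = pdist(high_d)                      # original pairwise distances
    E = pdist(low_d)                       # projected pairwise distances
    alpha = np.dot(D, E) / np.dot(E, E)    # closed-form optimal scale
    return np.sum((D - alpha * E) ** 2) / np.sum(D ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Y = X[:, :2]

# The value no longer depends on how the projection is zoomed:
print([round(scale_invariant_stress(X, s * Y), 6) for s in (0.1, 1.0, 10.0)])
```

Multiplying the projection by any positive constant rescales E and 1/alpha by the same factor, so the product alpha * E, and hence the reported stress, is unchanged.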
This list is automatically generated from the titles and abstracts of the papers in this site.