Which Similarity-Sensitive Entropy?
- URL: http://arxiv.org/abs/2511.03849v2
- Date: Mon, 10 Nov 2025 17:32:16 GMT
- Title: Which Similarity-Sensitive Entropy?
- Authors: Phuc Nguyen, Josiah Couch, Rahul Bansal, Alexandra Morgan, Chris Tam, Miao Li, Rima Arnaout, Ramy Arnaout
- Abstract summary: We show that LCR and VS can differ by orders of magnitude and can capture complementary information about a system. We conclude that VS is preferable only when interpreting elements as linear combinations of a more fundamental set of "ur-elements" or when the system or dataset possesses a quantum-mechanical character.
- Score: 39.154447089247114
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A canonical step in quantifying a system is to measure its entropy. Shannon entropy and other traditional entropy measures capture only the information encoded in the frequencies of a system's elements. Recently, Leinster, Cobbold, and Reeve (LCR) introduced a method that also captures the rich information encoded in the similarities and differences among elements, yielding similarity-sensitive entropy. More recently, the Vendi score (VS) was introduced as an alternative, raising the question of how LCR and VS compare, and which is preferable. Here we address these questions conceptually, analytically, and experimentally, using 53 machine-learning datasets. We show that LCR and VS can differ by orders of magnitude and can capture complementary information about a system, except in limiting cases. We demonstrate that both LCR and VS depend on how similarities are scaled and introduce the concept of "half distance" to parameterize this dependence. We prove that VS provides an upper bound on LCR for several values of the Rényi-Hill order parameter and conjecture that this bound holds for all values. We conclude that VS is preferable only when interpreting elements as linear combinations of a more fundamental set of "ur-elements" or when the system or dataset possesses a quantum-mechanical character. In the broader circumstance where one seeks simply to capture the rich information encoded by similarity, LCR is favored; nevertheless, for certain half-distances the two methods can complement each other.
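To make the comparison concrete, here is a minimal numerical sketch (not the paper's code) of the two quantities being compared, using their standard published definitions: the LCR similarity-sensitive diversity of order q, and the Vendi score as the exponential entropy of the eigenvalues of the normalized similarity matrix. The toy matrix `Z` and uniform weights `p` are illustrative assumptions.

```python
import numpy as np

def lcr_diversity(Z, p, q=1.0):
    """Leinster-Cobbold-Reeve diversity of order q for similarity
    matrix Z and abundance vector p. At q=1 this is the limit
    exp(-sum_i p_i * log((Zp)_i))."""
    Zp = Z @ p
    if np.isclose(q, 1.0):
        return np.exp(-np.sum(p * np.log(Zp)))
    return np.sum(p * Zp ** (q - 1.0)) ** (1.0 / (1.0 - q))

def vendi_score(Z):
    """Vendi score: exponential of the Shannon entropy of the
    eigenvalues of Z / n (uniform weighting assumed)."""
    n = Z.shape[0]
    lam = np.linalg.eigvalsh(Z / n)
    lam = lam[lam > 1e-12]  # drop numerical zeros before taking logs
    return np.exp(-np.sum(lam * np.log(lam)))

# Toy system: two nearly identical elements plus one distinct element,
# so the "effective number" of elements should land between 1 and 3.
Z = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
p = np.ones(3) / 3

print(lcr_diversity(Z, p))  # LCR effective number, q = 1
print(vendi_score(Z))       # Vendi score for the same Z
```

On this example the two measures give noticeably different effective numbers (VS exceeds LCR here, consistent with the upper-bound result quoted in the abstract), illustrating that they are not interchangeable.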
Related papers
- The Information Theory of Similarity [0.0]
We establish a precise mathematical equivalence between witness-based similarity systems (REWA) and Shannon's information theory. This unification reveals that fifty years of similarity search research implicitly developed information theory for relational data.
arXiv Detail & Related papers (2025-11-29T08:12:45Z)
- What Representational Similarity Measures Imply about Decodable Information [6.5879381737929945]
We show that some neural network similarity measures can be equivalently motivated from a decoding perspective.
Measures like CKA and CCA quantify the average alignment between optimal linear readouts across a distribution of decoding tasks.
Overall, our work demonstrates a tight link between the geometry of neural representations and the ability to linearly decode information.
arXiv Detail & Related papers (2024-11-12T21:37:10Z)
- GSSF: Generalized Structural Sparse Function for Deep Cross-modal Metric Learning [51.677086019209554]
We propose a Generalized Structural Sparse function to capture powerful relationships across modalities for pair-wise similarity learning.
The distance metric delicately encapsulates two formats of diagonal and block-diagonal terms.
Experiments on cross-modal and two extra uni-modal retrieval tasks have validated its superiority and flexibility.
arXiv Detail & Related papers (2024-10-20T03:45:50Z)
- Rethinking Distance Metrics for Counterfactual Explainability [53.436414009687]
We investigate a framing for counterfactual generation methods that considers counterfactuals not as independent draws from a region around the reference, but as jointly sampled with the reference from the underlying data distribution.
We derive a distance metric tailored for counterfactual similarity that can be applied to a broad range of settings.
arXiv Detail & Related papers (2024-10-18T15:06:50Z)
- Mutual information chain rules for security proofs robust against device imperfections [0.0]
We analyze quantum cryptography with imperfect devices that leak additional information to an adversary. We show that these results can be used to handle some device imperfections in a variety of device-dependent and device-independent protocols.
arXiv Detail & Related papers (2024-07-29T19:47:47Z)
- Optimizing entanglement in two-qubit systems [0.0]
We investigate entanglement in two-qubit systems using a geometric representation based on the minimum of essential parameters. We find that optimized states of two qubits are X-shaped and host pairs of identical populations. A geometric L-measure of entanglement is introduced as the distance between the points in S that represent entangled states and the closest point that defines separable states.
arXiv Detail & Related papers (2024-07-26T02:56:04Z)
- Differentiable Optimization of Similarity Scores Between Models and Brains [1.5391321019692434]
Similarity measures such as linear regression, Centered Kernel Alignment (CKA), Normalized Bures Similarity (NBS), and angular Procrustes distance are often used to quantify this similarity. Here, we introduce a novel tool to investigate what drives high similarity scores and what constitutes a "good" score. Surprisingly, we find that high similarity scores do not guarantee encoding task-relevant information in a manner consistent with neural data.
arXiv Detail & Related papers (2024-07-09T17:31:47Z)
- Unifying (Quantum) Statistical and Parametrized (Quantum) Algorithms [65.268245109828]
We take inspiration from Kearns' SQ oracle and Valiant's weak evaluation oracle.
We introduce an extensive yet intuitive framework that yields unconditional lower bounds for learning from evaluation queries.
arXiv Detail & Related papers (2023-10-26T18:23:21Z)
- Dataset Condensation with Latent Space Knowledge Factorization and Sharing [73.31614936678571]
We introduce a novel approach for solving dataset condensation problem by exploiting the regularity in a given dataset.
Instead of condensing the dataset directly in the original input space, we assume a generative process of the dataset with a set of learnable codes.
We experimentally show that our method achieves new state-of-the-art records by significant margins on various benchmark datasets.
arXiv Detail & Related papers (2022-08-21T18:14:08Z)
- Disentangled Representation Learning for Text-Video Retrieval [51.861423831566626]
Cross-modality interaction is a critical component in Text-Video Retrieval (TVR).
We study the interaction paradigm in depth, where we find that its computation can be split into two terms.
We propose a disentangled framework to capture a sequential and hierarchical representation.
arXiv Detail & Related papers (2022-03-14T13:55:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.