Linking Robustness and Generalization: A k* Distribution Analysis of Concept Clustering in Latent Space for Vision Models
- URL: http://arxiv.org/abs/2408.09065v1
- Date: Sat, 17 Aug 2024 01:43:51 GMT
- Title: Linking Robustness and Generalization: A k* Distribution Analysis of Concept Clustering in Latent Space for Vision Models
- Authors: Shashank Kotyan, Pin-Yu Chen, Danilo Vasconcellos Vargas,
- Abstract summary: This article uses the k* Distribution, a local neighborhood analysis method, to examine the learned latent space at the level of individual concepts.
We introduce skewness-based true and approximate metrics for interpreting individual concepts to assess the overall quality of vision models' latent space.
- Score: 56.89974470863207
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most evaluations of vision models use indirect methods to assess latent space quality. These methods often involve adding extra layers to project the latent space into a new one. This projection makes it difficult to analyze and compare the original latent space. This article uses the k* Distribution, a local neighborhood analysis method, to examine the learned latent space at the level of individual concepts, which can be extended to examine the entire latent space. We introduce skewness-based true and approximate metrics for interpreting individual concepts to assess the overall quality of vision models' latent space. Our findings indicate that current vision models frequently fracture the distributions of individual concepts within the latent space. Nevertheless, as these models improve in generalization across multiple datasets, the degree of fracturing diminishes. A similar trend is observed in robust vision models, where increased robustness correlates with reduced fracturing. Ultimately, this approach enables a direct interpretation and comparison of the latent spaces of different vision models and reveals a relationship between a model's generalizability and robustness. Results show that as a model becomes more general and robust, it tends to learn features that result in better clustering of concepts. Project Website is available online at https://shashankkotyan.github.io/k-Distribution/
Related papers
- Comparing Fairness of Generative Mobility Models [3.699135947901772]
This work examines the fairness of generative mobility models, addressing the often overlooked dimension of equity in model performance across geographic regions.
Predictive models built on crowd flow data are instrumental in understanding urban structures and movement patterns.
We propose a novel framework for assessing fairness by measuring utility and equity of generated traces.
arXiv Detail & Related papers (2024-11-07T06:01:12Z) - Unsupervised Model Diagnosis [49.36194740479798]
This paper proposes Unsupervised Model Diagnosis (UMO) to produce semantic counterfactual explanations without any user guidance.
Our approach identifies and visualizes changes in semantics, and then matches these changes to attributes from wide-ranging text sources.
arXiv Detail & Related papers (2024-10-08T17:59:03Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z) - Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - Style-Hallucinated Dual Consistency Learning: A Unified Framework for
Visual Domain Generalization [113.03189252044773]
We propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle domain shift in various visual tasks.
Our versatile SHADE can significantly enhance the generalization in various visual recognition tasks, including image classification, semantic segmentation and object detection.
arXiv Detail & Related papers (2022-12-18T11:42:51Z) - On the Transformation of Latent Space in Fine-Tuned NLP Models [21.364053591693175]
We study the evolution of latent space in fine-tuned NLP models.
We discover latent concepts in the representational space using hierarchical clustering.
We compare pre-trained and fine-tuned models across three models and three downstream tasks.
arXiv Detail & Related papers (2022-10-23T10:59:19Z) - Diversity vs. Recognizability: Human-like generalization in one-shot
generative models [5.964436882344729]
We propose a new framework to evaluate one-shot generative models along two axes: sample recognizability vs. diversity.
We first show that GAN-like and VAE-like models fall on opposite ends of the diversity-recognizability space.
In contrast, disentanglement transports the model along a parabolic curve that could be used to maximize recognizability.
arXiv Detail & Related papers (2022-05-20T13:17:08Z) - Unsupervised Learning of Global Factors in Deep Generative Models [6.362733059568703]
We present a novel deep generative model based on non i.i.d. variational autoencoders.
We show that the model performs domain alignment to find correlations and interpolate between different databases.
We also study the ability of the global space to discriminate between groups of observations with non-trivial underlying structures.
arXiv Detail & Related papers (2020-12-15T11:55:31Z) - Agglomerative Neural Networks for Multi-view Clustering [109.55325971050154]
We propose the agglomerative analysis to approximate the optimal consensus view.
We present Agglomerative Neural Network (ANN) based on Constrained Laplacian Rank to cluster multi-view data directly.
Our evaluations against several state-of-the-art multi-view clustering approaches on four popular datasets show the promising view-consensus analysis ability of ANN.
arXiv Detail & Related papers (2020-05-12T05:39:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.