Revealing the Underlying Patterns: Investigating Dataset Similarity,
Performance, and Generalization
- URL: http://arxiv.org/abs/2308.03580v3
- Date: Fri, 29 Dec 2023 15:48:41 GMT
- Title: Revealing the Underlying Patterns: Investigating Dataset Similarity,
Performance, and Generalization
- Authors: Akshit Achara, Ram Krishna Pandey
- Abstract summary: Supervised deep learning models require a significant amount of labeled data to achieve acceptable performance on a specific task.
We establish image-image, dataset-dataset, and image-dataset distances to gain insights into the model's behavior.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised deep learning models require a significant amount of
labeled data to achieve acceptable performance on a specific task. However, when
tested on unseen data, the models may not perform well. Therefore, the models
need to be trained with additional, varied labeled data to improve
generalization. In this work, our goal is to understand these models, their
performance, and their generalization. We establish image-image,
dataset-dataset, and image-dataset distances to gain insights into a model's
behavior. Our proposed distance metric, when combined with model performance,
can help in selecting an appropriate model/architecture from a pool of candidate
architectures. We show that the generalization of these models can be improved
by adding only a small number of unseen images (say 1, 3, or 7) to the training
set. Our proposed approach reduces training and annotation costs while providing
an estimate of model performance on unseen data in dynamic environments.
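The abstract describes image-image, image-dataset, and dataset-dataset distances without specifying the metric here. A minimal sketch of how such a hierarchy of distances could be built from image embeddings, assuming cosine distance in a feature space (the paper's actual metric may differ):

```python
import numpy as np

def image_image_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine distance between two image embedding vectors."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return float(1.0 - a @ b)

def image_dataset_distance(img: np.ndarray, dataset: np.ndarray) -> float:
    """Distance from one image to a dataset: mean distance to its embeddings."""
    return float(np.mean([image_image_distance(img, d) for d in dataset]))

def dataset_dataset_distance(ds_a: np.ndarray, ds_b: np.ndarray) -> float:
    """Symmetrized mean of image-to-dataset distances in both directions."""
    a_to_b = np.mean([image_dataset_distance(x, ds_b) for x in ds_a])
    b_to_a = np.mean([image_dataset_distance(x, ds_a) for x in ds_b])
    return float((a_to_b + b_to_a) / 2.0)
```

With distances like these in hand, a candidate architecture could be scored by combining its validation performance with its training set's distance to the target data, as the abstract suggests.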
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Model Selection with Model Zoo via Graph Learning [45.30615308692713]
We introduce TransferGraph, a novel framework that reformulates model selection as a graph learning problem.
We demonstrate TransferGraph's effectiveness in capturing essential model-dataset relationships, yielding up to a 32% improvement in correlation between predicted performance and the actual fine-tuning results compared to the state-of-the-art methods.
arXiv Detail & Related papers (2024-04-05T09:50:00Z)
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z)
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models and image models.
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- A Simple and Efficient Baseline for Data Attribution on Images [107.12337511216228]
Current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions.
In this work, we focus on a minimalist baseline, utilizing the feature space of a backbone pretrained via self-supervised learning to perform data attribution.
Our method is model-agnostic and scales easily to large datasets.
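The baseline described above attributes predictions using the feature space of a self-supervised backbone. A hypothetical sketch of that idea, assuming attribution by nearest neighbors under cosine similarity (the paper's exact procedure is not given here):

```python
import numpy as np

def attribute_by_features(query_feat: np.ndarray,
                          train_feats: np.ndarray,
                          k: int = 3):
    """Attribute a prediction to the k training images whose pretrained
    (e.g. self-supervised) features are most similar to the query's feature.
    Returns the top-k training indices and their cosine similarities."""
    # Normalize so the dot product equals cosine similarity.
    q = query_feat / np.linalg.norm(query_feat)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = t @ q
    top = np.argsort(-sims)[:k]
    return top, sims[top]
```

Because only a feature extractor is needed, such an approach is model-agnostic and avoids training the large model ensembles the summary mentions.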
arXiv Detail & Related papers (2023-11-03T17:29:46Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Delving Deeper into Data Scaling in Masked Image Modeling [145.36501330782357]
We conduct an empirical study on the scaling capability of masked image modeling (MIM) methods for visual recognition.
Specifically, we utilize the web-collected Coyo-700M dataset.
Our goal is to investigate how the performance changes on downstream tasks when scaling with different sizes of data and models.
arXiv Detail & Related papers (2023-05-24T15:33:46Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
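Merging models "in their parameter space" can be illustrated with a simple (weighted) average of corresponding parameter tensors; this uniform-averaging sketch is a stand-in for the paper's actual merging scheme, which may weight parameters differently:

```python
import numpy as np

def merge_state_dicts(state_dicts, weights=None):
    """Merge fine-tuned models of identical architecture by averaging each
    parameter tensor. `state_dicts` maps parameter names to arrays; all
    dicts must share the same keys and shapes."""
    n = len(state_dicts)
    if weights is None:
        weights = [1.0 / n] * n  # uniform average by default
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged
```

No training data is touched, which is what makes the fusion "dataless": only the checkpoints themselves are combined.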
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Localized Latent Updates for Fine-Tuning Vision-Language Models [15.285292154680246]
In this work we suggest a lightweight adapter that only updates the model's predictions close to seen data points.
We demonstrate the effectiveness and speed of this relatively simple approach in the context of few-shot learning, where our results both on classes seen and unseen during training are comparable with or improve on the state of the art.
arXiv Detail & Related papers (2022-12-13T13:15:20Z)
- Comprehensive and Efficient Data Labeling via Adaptive Model Scheduling [25.525371500391568]
In certain applications, such as image retrieval platforms and photo album management apps, it is often required to execute a collection of models to obtain sufficient labels.
We propose an Adaptive Model Scheduling framework, consisting of 1) a deep reinforcement learning-based approach to predict the value of nontrivial models by mining semantic relationships among diverse models, and 2) two algorithms to adaptively schedule the model execution order under a deadline constraint or joint deadline-memory constraints, respectively.
Our design could save around 53% execution time without loss of any valuable labels.
arXiv Detail & Related papers (2020-02-08T03:54:39Z)
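The scheduling problem above can be illustrated with a simple greedy stand-in for the framework's learned scheduler, assuming each model has an estimated label value and runtime (the actual framework uses reinforcement learning rather than this heuristic):

```python
def schedule_models(models, deadline):
    """Greedy deadline scheduler: run models in decreasing value-per-second
    order, skipping any model that would overrun the deadline.
    `models` is a list of (name, est_value, est_runtime_s) tuples."""
    order = sorted(models, key=lambda m: m[1] / m[2], reverse=True)
    chosen, elapsed = [], 0.0
    for name, value, runtime in order:
        if elapsed + runtime <= deadline:
            chosen.append(name)
            elapsed += runtime
    return chosen
```

A learned policy can beat such a heuristic by exploiting semantic relationships among models, e.g. skipping a model whose labels are largely implied by models already executed.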
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.