Data-Centric Debugging: mitigating model failures via targeted data collection
- URL: http://arxiv.org/abs/2211.09859v1
- Date: Thu, 17 Nov 2022 19:44:02 GMT
- Title: Data-Centric Debugging: mitigating model failures via targeted data collection
- Authors: Sahil Singla, Atoosa Malemir Chegini, Mazda Moayeri, Soheil Feizi
- Abstract summary: Deep neural networks can be unreliable in the real world when the training set does not adequately cover all the settings where they are deployed.
We propose a general methodology for model debugging that can systematically improve model performance on $\mathcal{E}$ while maintaining its performance on the original test set.
- Score: 4.599792546344752
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks can be unreliable in the real world when the training
set does not adequately cover all the settings where they are deployed.
Focusing on image classification, we consider the setting where we have an
error distribution $\mathcal{E}$ representing a deployment scenario where the
model fails. We have access to a small set of samples $\mathcal{E}_{sample}$
from $\mathcal{E}$ and it can be expensive to obtain additional samples. In the
traditional model development framework, mitigating failures of the model in
$\mathcal{E}$ can be challenging and is often done in an ad hoc manner. In this
paper, we propose a general methodology for model debugging that can
systematically improve model performance on $\mathcal{E}$ while maintaining its
performance on the original test set. Our key assumption is that we have access
to a large pool of weakly (noisily) labeled data $\mathcal{F}$. However,
naively adding $\mathcal{F}$ to the training would hurt model performance due
to the large extent of label noise. Our Data-Centric Debugging (DCD) framework
carefully creates a debug-train set by selecting images from $\mathcal{F}$ that
are perceptually similar to the images in $\mathcal{E}_{sample}$. To do this,
we use the $\ell_2$ distance in the feature space (penultimate layer
activations) of various models including ResNet, Robust ResNet and DINO where
we observe DINO ViTs are significantly better at discovering similar images
compared to ResNets. Compared to LPIPS, we find that our method reduces compute
and storage requirements by 99.58\%. Compared to the baselines that maintain
model performance on the test set, we achieve significantly (+9.45\%) improved
results on the debug-heldout sets.
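The selection step described in the abstract amounts to a nearest-neighbor search in feature space. Below is a minimal sketch (the `select_debug_train` helper and plain arrays are illustrative assumptions; in the paper the features would be penultimate-layer activations of a model such as a DINO ViT). Each row of `pool_feats` is a feature vector for one weakly labeled image in $\mathcal{F}$, and each row of `error_feats` corresponds to one image in $\mathcal{E}_{sample}$:

```python
import numpy as np

def select_debug_train(pool_feats, error_feats, k):
    """For each error-sample feature vector, find the k pool images
    closest in l2 distance and return the union of their indices."""
    # Pairwise squared l2 distances, shape (n_error, n_pool).
    diffs = error_feats[:, None, :] - pool_feats[None, :, :]
    dists = (diffs ** 2).sum(axis=-1)
    # Indices of the k nearest pool images per error sample.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # The union of selected indices forms the debug-train set.
    return np.unique(nearest)
```

For example, with three pool vectors and one error vector near the first two, `select_debug_train(pool, err, k=2)` would return the indices of those two neighbors. In practice one would batch the distance computation (or use an approximate nearest-neighbor index) rather than materialize the full pairwise matrix.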
Related papers
- Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis [55.561961365113554]
3D Gaussian Splatting (3DGS) has demonstrated remarkable effectiveness for novel view synthesis (NVS)
However, the 3DGS model tends to overfit when trained with sparse posed views, limiting its generalization ability to novel views.
We present a Self-Ensembling Gaussian Splatting (SE-GS) approach to alleviate the overfitting problem.
Our approach improves NVS quality with few-shot training views, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2024-10-31T18:43:48Z) - Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z) - $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs [62.565573316667276]
We develop an objective that encodes how a sample relates to others.
We train vision models based on similarities in class or text caption descriptions.
Our objective appears to work particularly well in lower-data regimes, with gains over CLIP of $16.8\%$ on ImageNet and $18.1\%$ on ImageNet Real.
arXiv Detail & Related papers (2024-07-25T15:38:16Z) - Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors [58.661454334877256]
Drug-Target binding Affinity (DTA) prediction is essential for drug discovery.
Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal.
We propose $k$NN-DTA, a non-parametric, embedding-based retrieval method applied to a pre-trained DTA prediction model.
arXiv Detail & Related papers (2024-07-21T15:49:05Z) - Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation [12.62718910894575]
Point cloud segmentation (PCS) plays an essential role in robot perception and navigation tasks.
To efficiently understand large-scale outdoor point clouds, their range image representation is commonly adopted.
However, undesirable missing values in the range images damage the shapes and patterns of objects.
This problem creates difficulty for the models in learning coherent and complete geometric information from the objects.
arXiv Detail & Related papers (2024-05-16T15:13:42Z) - ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object [78.58860252442045]
We introduce generative models as a data source for hard images that benchmark deep models' robustness.
We are able to generate images with more diversified backgrounds, textures, and materials than any prior work, where we term this benchmark as ImageNet-D.
Our work suggests that diffusion models can be an effective source to test vision models.
arXiv Detail & Related papers (2024-03-27T17:23:39Z) - Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training.
This paper gives an affirmative answer by employing the most recent diffusion model, which has higher efficiency.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z) - Dep-$L_0$: Improving $L_0$-based Network Sparsification via Dependency Modeling [6.081082481356211]
Training deep neural networks with an $L_0$ regularization is one of the prominent approaches for network pruning or sparsification.
We show that this method performs inconsistently on large-scale learning tasks, such as ResNet50 on ImageNet.
We propose a dependency modeling of binary gates, which can be modeled effectively as a multi-layer perceptron.
arXiv Detail & Related papers (2021-06-30T19:33:35Z) - Webly Supervised Image Classification with Self-Contained Confidence [36.87209906372911]
This paper focuses on webly supervised learning (WSL), where datasets are built by crawling samples from the Internet and directly using search queries as web labels.
We introduce Self-Contained Confidence (SCC) by adapting model uncertainty to the WSL setting, and use it to balance $\mathcal{L}_s$ and $\mathcal{L}_w$ sample-wise.
The proposed WSL framework has achieved state-of-the-art results on two large-scale WSL datasets, WebVision-1000 and Food101-N.
arXiv Detail & Related papers (2020-08-27T02:49:51Z) - Inner Ensemble Networks: Average Ensemble as an Effective Regularizer [20.33062212014075]
Inner Ensemble Networks (IENs) reduce the variance within the neural network itself without an increase in the model complexity.
IENs utilize ensemble parameters during the training phase to reduce the network variance.
arXiv Detail & Related papers (2020-06-15T11:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.