Exploiting the Potential of Datasets: A Data-Centric Approach for Model
Robustness
- URL: http://arxiv.org/abs/2203.05323v1
- Date: Thu, 10 Mar 2022 12:16:32 GMT
- Title: Exploiting the Potential of Datasets: A Data-Centric Approach for Model
Robustness
- Authors: Yiqi Zhong, Lei Wu, Xianming Liu, Junjun Jiang
- Abstract summary: We propose a novel algorithm for dataset enhancement that works well for many existing deep neural networks.
In the data-centric robust learning competition hosted by Alibaba Group and Tsinghua University, our algorithm came third out of more than 3000 competitors.
- Score: 48.70325679650579
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robustness of deep neural networks (DNNs) to malicious perturbations is a hot
topic in trustworthy AI. Existing techniques obtain robust models given fixed
datasets, either by modifying model structures, or by optimizing the process of
inference or training. While significant improvements have been made, the
possibility of constructing a high-quality dataset for model robustness remains
unexplored. Following the data-centric AI campaign launched by Andrew Ng, we
propose a novel algorithm for dataset enhancement that works well for many
existing DNN models to improve robustness. Transferable adversarial examples
and 14 kinds of common corruptions are included in our optimized dataset. In
the data-centric robust learning competition hosted by Alibaba Group and
Tsinghua University, our algorithm placed third out of more than 3000 competitors
in the first stage and fourth in the second. Our code is
available at \url{https://github.com/hncszyq/tianchi_challenge}.
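The enhancement recipe above (transferable adversarial examples plus common corruptions, labels unchanged) can be sketched in miniature. This is an illustrative toy only, not the authors' released pipeline: the model is a hypothetical logistic classifier, a one-step FGSM-style perturbation stands in for the transferable adversarial examples, and Gaussian noise stands in for the 14 corruption types.

```python
import numpy as np

def fgsm_perturb(x, w, y, eps=0.1):
    """One-step sign-gradient perturbation against a toy logistic model
    (a stand-in for the transferable adversarial examples)."""
    p = 1.0 / (1.0 + np.exp(-x @ w))      # predicted probability
    grad = (p - y) * w                    # dLoss/dx for logistic loss
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

def gaussian_corrupt(x, sigma=0.05, rng=None):
    """One member of the common-corruption families (Gaussian noise)."""
    rng = rng or np.random.default_rng(0)
    return np.clip(x + rng.normal(0.0, sigma, size=x.shape), 0.0, 1.0)

def enhance_dataset(X, y, w, adv_frac=0.5, eps=0.1):
    """Replace a fraction of samples with adversarial versions and
    corrupt the rest; labels are kept unchanged."""
    X_out = X.copy()
    n_adv = int(len(X) * adv_frac)
    for i in range(len(X)):
        if i < n_adv:
            X_out[i] = fgsm_perturb(X[i], w, y[i], eps)
        else:
            X_out[i] = gaussian_corrupt(X[i])
    return X_out, y
```

The enhanced dataset keeps the original shape and label assignment, so any existing training loop can consume it unchanged, which is the point of the data-centric approach.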
Related papers
- RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection [11.265512559447986]
We introduce RU-AI, a new large-scale multimodal dataset for detecting machine-generated content in text, image, and voice.
Our dataset is constructed from three large publicly available datasets: Flickr8K, COCO, and Places205.
Our proposed unified model, which incorporates a multimodal embedding module with a multilayer perceptron network, can effectively determine the origin of the data.
arXiv Detail & Related papers (2024-06-07T12:58:14Z)
- 3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data.
We learn a set of vectors that deform the objects in an adversarial fashion.
We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
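The sample-independent adversarial vectors described above can be illustrated, under strong simplifying assumptions, as a single shared perturbation learned by gradient ascent against a toy logistic model. The paper deforms 3D objects; this is only a low-dimensional analogue with hypothetical names.

```python
import numpy as np

def learn_shared_perturbation(X, y, w, steps=10, lr=0.5, eps=0.1):
    """Gradient-ascend a single perturbation applied to every sample
    (a toy analogue of sample-independent adversarial vectors)."""
    delta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X + delta) @ w))
        # Gradient of the mean logistic loss with respect to delta.
        grad = ((p - y)[:, None] * w[None, :]).mean(axis=0)
        # Ascent step, then bound the perturbation magnitude.
        delta = np.clip(delta + lr * grad, -eps, eps)
    return delta
```

Because the logistic loss is convex in the shared perturbation, each clipped ascent step cannot decrease the training loss, so the learned vector is guaranteed to be at least as adversarial as no perturbation.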
arXiv Detail & Related papers (2023-08-29T17:58:55Z)
- Pseudo-Trilateral Adversarial Training for Domain Adaptive Traversability Prediction [8.145900996884993]
Traversability prediction is a fundamental perception capability for autonomous navigation.
We propose a novel perception model that adopts a coarse-to-fine alignment (CALI) to perform unsupervised domain adaptation (UDA).
We demonstrate the superiority of our proposed models over multiple baselines in several challenging domain adaptation setups.
arXiv Detail & Related papers (2023-06-26T00:39:32Z)
- CILIATE: Towards Fairer Class-based Incremental Learning by Dataset and Training Refinement [20.591583747291892]
We show that CIL suffers both dataset and algorithm bias problems.
We propose a novel framework, CILIATE, that fixes both dataset and algorithm bias in CIL.
CILIATE improves the fairness of CIL by 17.03%, 22.46%, and 31.79% compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-04-09T12:10:39Z)
- Towards Robust Dataset Learning [90.2590325441068]
We propose a principled, tri-level optimization to formulate the robust dataset learning problem.
Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
arXiv Detail & Related papers (2022-11-19T17:06:10Z)
- A Tale of Two Cities: Data and Configuration Variances in Robust Deep Learning [27.498927971861068]
Deep neural networks (DNNs) are widely used in many industries such as image recognition, supply chain, medical diagnosis, and autonomous driving.
Prior work has shown that the high accuracy of a DNN model does not imply high robustness, because the input data and external environment are constantly changing.
arXiv Detail & Related papers (2022-11-18T03:32:53Z)
- Can we achieve robustness from data alone? [0.7366405857677227]
Adversarial training and its variants have come to be the prevailing methods to achieve adversarially robust classification using neural networks.
We devise a meta-learning method for robust classification that optimizes the dataset prior to its deployment in a principled way.
Experiments on MNIST and CIFAR-10 demonstrate that the datasets we produce enjoy very high robustness against PGD attacks.
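The PGD attack used for that evaluation is the standard L-infinity projected gradient method; a minimal sketch against a toy logistic model (illustrative only, not the paper's evaluation code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_attack(x, y, w, eps=0.1, alpha=0.02, steps=20):
    """L-infinity PGD: repeated sign-gradient ascent steps on the loss,
    each followed by projection back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        grad = (sigmoid(x_adv @ w) - y) * w       # dLoss/dx for logistic loss
        x_adv = x_adv + alpha * np.sign(grad)     # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the ball
    return x_adv
```

Robustness against PGD is then measured as accuracy on `pgd_attack`-perturbed inputs rather than on clean ones.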
arXiv Detail & Related papers (2022-07-24T12:14:48Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- 2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework [57.847010327319964]
We propose a data-efficient framework that can train the model from scratch on small datasets.
Specifically, by introducing a 3D central difference convolution operation, we propose a novel C3D neural network-based two-stream framework.
We show that our method can achieve a promising result even without a model pre-trained on large-scale datasets.
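The central difference convolution named above combines a vanilla convolution with a term that subtracts the patch center weighted by the kernel sum. A minimal 2D NumPy sketch of that idea (the paper uses a 3D variant inside a C3D network; `theta` trades off the two terms):

```python
import numpy as np

def central_difference_conv2d(x, w, theta=0.7):
    """Central difference convolution, 2D illustration:
    out = vanilla conv - theta * x(center) * sum(kernel weights)."""
    kh, kw = w.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    w_sum = w.sum()
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw]
            vanilla = (patch * w).sum()              # standard convolution term
            center = x[i + kh // 2, j + kw // 2]     # patch center value
            out[i, j] = vanilla - theta * center * w_sum
    return out
```

With `theta=0` this reduces to an ordinary convolution; larger `theta` emphasizes local intensity differences, which is what makes the operator useful on small datasets.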
arXiv Detail & Related papers (2020-08-10T09:50:28Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
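A generic prediction-diversity term illustrates the idea above: the negative mean pairwise squared distance between ensemble members' predictions, so that minimizing it pushes members apart. This is only a simplified stand-in, not the paper's exact adversarial information-bottleneck loss.

```python
import numpy as np

def diversity_loss(probs):
    """Negative mean pairwise squared L2 distance between member
    predictions; lower values mean more diverse ensemble outputs."""
    m = len(probs)
    total, pairs = 0.0, 0
    for i in range(m):
        for j in range(i + 1, m):
            total += float(np.sum((probs[i] - probs[j]) ** 2))
            pairs += 1
    return -total / pairs
```

In training, such a term is typically added to the task loss with a small weight, so accuracy is preserved while the members are discouraged from collapsing onto identical predictions.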
arXiv Detail & Related papers (2020-03-10T03:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.