Robustar: Interactive Toolbox Supporting Precise Data Annotation for
Robust Vision Learning
- URL: http://arxiv.org/abs/2207.08944v1
- Date: Mon, 18 Jul 2022 21:12:28 GMT
- Title: Robustar: Interactive Toolbox Supporting Precise Data Annotation for
Robust Vision Learning
- Authors: Chonghan Chen, Haohan Wang, Leyang Hu, Yuhao Zhang, Shuguang Lyu,
Jingcheng Wu, Xinnuo Li, Linjing Sun, Eric P. Xing
- Abstract summary: We introduce the initial release of our software Robustar.
It aims to improve the robustness of vision classification machine learning models through a data-driven perspective.
- Score: 53.900911121695536
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce the initial release of our software Robustar, which aims to
improve the robustness of vision classification machine learning models through
a data-driven perspective. Building upon the recent understanding that a
model's lack of robustness stems from its tendency to learn spurious features,
we aim to solve this problem at its root, on the data side, by removing the
spurious features from the data before training. In particular, we introduce
software that helps users better prepare data for training image
classification models by letting them annotate spurious features at the pixel
level of images. To facilitate this process,
our software also leverages recent advances to help identify potential images
and pixels worthy of attention and to continue the training with newly
annotated data. Our software is hosted at the GitHub Repository
https://github.com/HaohanWang/Robustar.
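The abstract describes annotating spurious features at the pixel level and removing them before training. A minimal sketch of that idea, assuming the annotations arrive as a per-image binary mask (the function name and mask convention are illustrative, not Robustar's actual API):

```python
import numpy as np

def mask_spurious_pixels(image: np.ndarray, spurious_mask: np.ndarray,
                         fill_value: float = 0.0) -> np.ndarray:
    """Fill pixels flagged as spurious before the image is used for training.

    image:         H x W x C float array
    spurious_mask: H x W boolean array, True where an annotator marked
                   a pixel as belonging to a spurious feature
    """
    cleaned = image.copy()
    cleaned[spurious_mask] = fill_value
    return cleaned

# Toy example: a 4x4 single-channel "image" whose top-left 2x2 patch
# was annotated as spurious
image = np.ones((4, 4, 1), dtype=np.float32)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True

cleaned = mask_spurious_pixels(image, mask)
```

In practice a tool like this would also need to decide what to put in the removed region (a constant fill, blur, or inpainting); zeroing is the simplest stand-in.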
Related papers
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension [99.9389737339175]
We introduce Self-Training on Image Comprehension (STIC), which emphasizes a self-training approach specifically for image comprehension.
First, the model self-constructs a preference for image descriptions using unlabeled images.
To further self-improve reasoning on the extracted visual information, we let the model reuse a small portion of existing instruction-tuning data.
arXiv Detail & Related papers (2024-05-30T05:53:49Z)
- DINOv2: Learning Robust Visual Features without Supervision [75.42921276202522]
This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources.
Most of the technical contributions aim at accelerating and stabilizing the training at scale.
In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature.
arXiv Detail & Related papers (2023-04-14T15:12:19Z)
- Applied Federated Learning: Architectural Design for Robust and Efficient Learning in Privacy Aware Settings [0.8454446648908585]
The classical machine learning paradigm requires the aggregation of user data in a central location.
Centralization of data poses risks, including a heightened risk of internal and external security incidents.
Federated learning with differential privacy is designed to avoid the server-side centralization pitfall.
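The summary above combines federated learning with differential privacy. A minimal numpy sketch of the usual recipe (clip each client update, average, add calibrated Gaussian noise); this is an illustration of the general technique, not the paper's actual architecture:

```python
import numpy as np

def dp_federated_average(client_updates, clip_norm=1.0, noise_std=0.0,
                         rng=None):
    """Differentially private federated averaging, sketched.

    Each client's model update is norm-clipped to bound its influence,
    the clipped updates are averaged, and Gaussian noise scaled to the
    clipping bound is added to the mean before the server applies it.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        scale = min(1.0, clip_norm / (norm + 1e-12))
        clipped.append(u * scale)
    mean = np.mean(clipped, axis=0)
    return mean + rng.normal(0.0, noise_std, size=mean.shape)

# Three clients; noise disabled so the result is deterministic
updates = [np.array([3.0, 4.0]),   # norm 5 -> clipped down to norm 1
           np.array([0.6, 0.8]),   # norm 1 -> unchanged
           np.array([0.0, 0.0])]
avg = dp_federated_average(updates, clip_norm=1.0, noise_std=0.0)
```

With a nonzero `noise_std` calibrated to `clip_norm` and the number of clients, the released average satisfies a differential-privacy guarantee; choosing that calibration is the substantive part the sketch omits.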
arXiv Detail & Related papers (2022-06-02T00:30:04Z)
- Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision [38.22842778742829]
Discriminative self-supervised learning allows training models on any random group of internet images.
We train models on billions of random images without any data pre-processing or prior assumptions about what we want the model to learn.
We extensively study and validate our model performance on over 50 benchmarks, including fairness, robustness to distribution shift, geographical diversity, fine-grained recognition, image copy detection, and many image classification datasets.
arXiv Detail & Related papers (2022-02-16T22:26:47Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn the effective salient object detection model based on the manual annotation on a few training images only.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- Automated Cleanup of the ImageNet Dataset by Model Consensus, Explainability and Confident Learning [0.0]
ImageNet was the backbone of various convolutional neural networks (CNNs) trained on ILSVRC12.
This paper describes automated applications based on model consensus, explainability and confident learning to correct labeling mistakes.
The ImageNet-Clean improves the model performance by 2-2.4% for SqueezeNet and EfficientNet-B0 models.
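The summary above mentions correcting labeling mistakes via model consensus. A toy sketch of one consensus rule (flag a sample when a strict majority of models agrees on a label different from the dataset's); this is a generic illustration, not the paper's pipeline:

```python
from collections import Counter

def flag_label_issues(given_labels, model_predictions):
    """Flag samples where the consensus of several models disagrees
    with the dataset label.

    model_predictions: list of per-model prediction lists, one
    prediction per sample, all aligned with given_labels.
    Returns (index, given_label, consensus_label) tuples.
    """
    flagged = []
    for i, label in enumerate(given_labels):
        votes = Counter(preds[i] for preds in model_predictions)
        consensus, count = votes.most_common(1)[0]
        # Only flag when a strict majority backs a different label
        if consensus != label and count > len(model_predictions) / 2:
            flagged.append((i, label, consensus))
    return flagged

labels = ["cat", "dog", "cat"]
preds = [["cat", "cat", "cat"],
         ["cat", "cat", "dog"],
         ["cat", "cat", "cat"]]
issues = flag_label_issues(labels, preds)
```

Confident learning, also cited in the summary, refines this by weighting votes with each model's estimated per-class noise rates rather than raw majority counts.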
arXiv Detail & Related papers (2021-03-30T13:16:35Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- Cross-Model Image Annotation Platform with Active Learning [0.0]
This work presents an End-to-End pipeline tool for object annotation and recognition.
We have developed a modular image annotation platform which seamlessly incorporates assisted image annotation, active learning and model training and evaluation.
The highest accuracy achieved is 74%.
arXiv Detail & Related papers (2020-08-06T01:42:25Z) - Saliency-driven Class Impressions for Feature Visualization of Deep
Neural Networks [55.11806035788036]
It is advantageous to visualize the features considered to be essential for classification.
Existing visualization methods develop high confidence images consisting of both background and foreground features.
In this work, we propose a saliency-driven approach to visualize discriminative features that are considered most important for a given task.
arXiv Detail & Related papers (2020-07-31T06:11:06Z) - Unsupervised machine learning via transfer learning and k-means
clustering to classify materials image data [0.0]
This paper demonstrates how to construct, use, and evaluate a high performance unsupervised machine learning system for classifying images.
We use the VGG16 convolutional neural network pre-trained on the ImageNet dataset of natural images to extract feature representations for each micrograph.
The approach achieves $99.4\% \pm 0.16\%$ accuracy, and the resulting model can be used to classify new images without retraining.
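The recipe summarized above (extract features with a pretrained network, then cluster with k-means) can be sketched in a few lines of numpy, with random blobs standing in for the VGG16 embeddings the paper actually uses:

```python
import numpy as np

def kmeans(features, k, iters=20, init_idx=None, seed=0):
    """Plain Lloyd's k-means on feature vectors.

    init_idx optionally fixes the initial centers to specific samples;
    otherwise they are drawn at random.
    """
    rng = np.random.default_rng(seed)
    if init_idx is None:
        init_idx = rng.choice(len(features), k, replace=False)
    centers = features[np.asarray(init_idx)].copy()
    for _ in range(iters):
        # Assign each vector to its nearest center
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        assign = dists.argmin(axis=1)
        # Recompute each center as the mean of its cluster
        for j in range(k):
            if np.any(assign == j):
                centers[j] = features[assign == j].mean(axis=0)
    return assign, centers

# Two well-separated synthetic "feature" blobs standing in for embeddings
rng = np.random.default_rng(1)
blob_a = rng.normal(0.0, 0.1, size=(20, 8))
blob_b = rng.normal(5.0, 0.1, size=(20, 8))
features = np.vstack([blob_a, blob_b])
assign, _ = kmeans(features, k=2, init_idx=[0, 20])
```

Swapping the synthetic blobs for real pretrained-CNN features is what turns this toy into the paper's pipeline; the clustering step itself is unchanged.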
arXiv Detail & Related papers (2020-07-16T14:36:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.