Inspector Gadget: A Data Programming-based Labeling System for
Industrial Images
- URL: http://arxiv.org/abs/2004.03264v3
- Date: Fri, 21 Aug 2020 04:12:15 GMT
- Authors: Geon Heo, Yuji Roh, Seonghyeon Hwang, Dayun Lee, Steven Euijong Whang
- Abstract summary: Inspector Gadget is an image labeling system that combines crowdsourcing, data augmentation, and data programming to produce weak labels at scale for image classification.
We perform experiments on real industrial image datasets and show that Inspector Gadget obtains better performance than other weak-labeling techniques.
- Score: 9.087890731629097
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As machine learning for images becomes democratized in the Software 2.0 era,
one of the serious bottlenecks is securing enough labeled data for training.
This problem is especially critical in a manufacturing setting where smart
factories rely on machine learning for product quality control by analyzing
industrial images. Such images are typically large and may only need to be
partially analyzed where only a small portion is problematic (e.g., identifying
defects on a surface). Since manually labeling these images is expensive, weak
supervision is an attractive alternative where the idea is to generate weak
labels that are not perfect, but can be produced at scale. Data programming is
a recent paradigm in this category that uses human knowledge in the form of
labeling functions and combines them into a generative model. Data programming
has been successful in applications based on text or structured data and can
also be applied to images, usually when one can find a way to convert them into
structured data. In this work, we expand the horizon of data programming by
directly applying it to images without this conversion, which is a common
scenario for industrial applications. We propose Inspector Gadget, an image
labeling system that combines crowdsourcing, data augmentation, and data
programming to produce weak labels at scale for image classification. We
perform experiments on real industrial image datasets and show that Inspector
Gadget obtains better performance than other weak-labeling techniques: Snuba,
GOGGLES, and self-learning baselines using convolutional neural networks (CNNs)
without pre-training.
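As a rough illustration of the data programming idea described in the abstract, the sketch below applies a few labeling functions directly to toy grayscale images and combines their votes. It is not Inspector Gadget's implementation: the function names, thresholds, and toy images are hypothetical, and a plain majority vote stands in for the generative model that data programming would normally fit.

```python
from typing import Callable, List, Sequence

# A toy stand-in for an industrial image: grayscale pixels in [0, 1].
Image = List[List[float]]

DEFECT, OK, ABSTAIN = 1, -1, 0


def lf_dark_spot(img: Image) -> int:
    """Vote DEFECT if any pixel is very dark (hypothetical threshold)."""
    return DEFECT if min(min(row) for row in img) < 0.2 else ABSTAIN


def lf_uniform_surface(img: Image) -> int:
    """Vote OK if the pixel range is narrow (hypothetical threshold)."""
    pixels = [p for row in img for p in row]
    return OK if max(pixels) - min(pixels) < 0.1 else ABSTAIN


def lf_low_mean(img: Image) -> int:
    """Vote DEFECT if the image is dark on average, otherwise OK."""
    pixels = [p for row in img for p in row]
    return DEFECT if sum(pixels) / len(pixels) < 0.5 else OK


def weak_label(img: Image, lfs: Sequence[Callable[[Image], int]]) -> int:
    """Combine labeling-function votes by unweighted majority.

    Assumption: majority vote replaces the generative model here,
    so ties and all-abstain cases return ABSTAIN.
    """
    total = sum(lf(img) for lf in lfs)
    if total > 0:
        return DEFECT
    if total < 0:
        return OK
    return ABSTAIN


LFS = [lf_dark_spot, lf_uniform_surface, lf_low_mean]

clean = [[0.80, 0.82], [0.81, 0.80]]      # narrow range, bright
scratched = [[0.40, 0.10], [0.50, 0.40]]  # one dark pixel, dark overall
ambiguous = [[0.80, 0.10], [0.80, 0.80]]  # one dark pixel, bright overall

labels = [weak_label(img, LFS) for img in (clean, scratched, ambiguous)]
```

The third image shows why the combination step matters: the labeling functions disagree, so the simple vote abstains, whereas a learned generative model would weight each function by its estimated accuracy.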
Related papers
- Deep Image Composition Meets Image Forgery [0.0]
Image forgery has been studied for many years.
Deep learning models require large amounts of labeled data for training.
We use state of the art image composition deep learning models to generate spliced images close to the quality of real-life manipulations.
arXiv Detail & Related papers (2024-04-03T17:54:37Z)
- Towards Pragmatic Semantic Image Synthesis for Urban Scenes [4.36080478413575]
We present a new task: given a dataset with synthetic images and labels and a dataset with unlabeled real images, our goal is to learn a model that can generate images with the content of the input mask and the appearance of real images.
We leverage the synthetic image as a guide to the content of the generated image by penalizing the difference between their high-level features on a patch level.
In contrast to previous works which employ one discriminator that overfits the target domain semantic distribution, we employ a discriminator for the whole image and multiscale discriminators on the image patches.
arXiv Detail & Related papers (2023-05-16T18:01:12Z)
- Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmable weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL), which creates losses directly from heuristics without going through the intermediate step of a label.
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z)
- Robustar: Interactive Toolbox Supporting Precise Data Annotation for Robust Vision Learning [53.900911121695536]
We introduce the initial release of our software Robustar.
It aims to improve the robustness of vision classification machine learning models through a data-driven perspective.
arXiv Detail & Related papers (2022-07-18T21:12:28Z)
- Facilitated machine learning for image-based fruit quality assessment in developing countries [68.8204255655161]
Automated image classification is a common task for supervised machine learning in food science.
We propose an alternative method based on pre-trained vision transformers (ViTs).
It can be easily implemented with limited resources on a standard device.
arXiv Detail & Related papers (2022-07-10T19:52:20Z)
- Alternative Data Augmentation for Industrial Monitoring using Adversarial Learning [0.0]
This study examines an industrial application of data synthesis using generative adversarial networks.
We apply two different methods to create binary labels: a problem-tailored trigonometric function and a WGAN model.
The labels are translated into color images using pix2pix and used to train a U-Net.
arXiv Detail & Related papers (2022-05-09T12:16:38Z)
- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k-image subset of ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)
- Multi-label Zero-shot Classification by Learning to Transfer from External Knowledge [36.04579549557464]
Multi-label zero-shot classification aims to predict multiple unseen class labels for an input image.
This paper introduces a novel multi-label zero-shot classification framework by learning to transfer from external knowledge.
arXiv Detail & Related papers (2020-07-30T17:26:46Z)
- From ImageNet to Image Classification: Contextualizing Progress on Benchmarks [99.19183528305598]
We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset.
Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for.
arXiv Detail & Related papers (2020-05-22T17:39:16Z)
- Generative Adversarial Data Programming [32.2164057862111]
We show how distant supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time.
This framework is extended to different setups, including self-supervised labeled image generation, zero-shot text to labeled image generation, transfer learning, and multi-task learning.
arXiv Detail & Related papers (2020-04-30T07:06:44Z)