Extracting Human Attention through Crowdsourced Patch Labeling
- URL: http://arxiv.org/abs/2403.15013v1
- Date: Fri, 22 Mar 2024 07:57:27 GMT
- Title: Extracting Human Attention through Crowdsourced Patch Labeling
- Authors: Minsuk Chang, Seokhyeon Park, Hyeon Jeon, Aeri Cho, Soohyun Lee, Jinwook Seo
- Abstract summary: In image classification, a significant problem arises from bias in the datasets.
One approach to mitigate such biases is to direct the model's attention toward the target object's location.
We propose a novel patch-labeling method that integrates AI assistance with crowdsourcing to capture human attention from images.
- Score: 18.947126675569667
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In image classification, a significant problem arises from bias in the datasets. When a dataset contains only specific types of images, the classifier begins to rely on shortcuts - simplistic and erroneous rules for decision-making. This leads to high performance on the training dataset but inferior results on new, varied images, as the classifier's generalization capability is reduced. For example, if the images labeled as mustache consist solely of male figures, the model may inadvertently learn to classify images by gender rather than by the presence of a mustache. One approach to mitigate such biases is to direct the model's attention toward the target object's location, usually marked using bounding boxes or polygons for annotation. However, collecting such annotations requires substantial time and human effort. Therefore, we propose a novel patch-labeling method that integrates AI assistance with crowdsourcing to capture human attention from images, which can be a viable solution for mitigating bias. Our method consists of two steps. First, we extract the approximate location of a target using a pre-trained saliency detection model supplemented by human verification for accuracy. Then, we determine the human-attentive area in the image by iteratively dividing the image into smaller patches and employing crowdsourcing to ascertain whether each patch can be classified as the target object. We demonstrated the effectiveness of our method in mitigating bias through improved classification accuracy and the refined focus of the model. Also, crowdsourced experiments validate that our method collects human annotations up to 3.4 times faster than annotating object locations with polygons, significantly reducing the need for human resources. We conclude the paper by discussing the advantages of our method in a crowdsourcing context, mainly focusing on aspects of human errors and accessibility.
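The abstract's second step (iteratively dividing the image into smaller patches and asking crowd workers whether each patch shows the target) can be sketched as a simple refinement loop. This is an illustrative reconstruction only, not the authors' implementation: the function names, the quadrant-splitting scheme, and the `is_target` oracle standing in for a crowd worker's yes/no vote are all assumptions.

```python
def split(patch):
    """Split a patch (x, y, w, h) into four quadrants."""
    x, y, w, h = patch
    hw, hh = w // 2, h // 2
    return [
        (x, y, hw, hh),
        (x + hw, y, w - hw, hh),
        (x, y + hh, hw, h - hh),
        (x + hw, y + hh, w - hw, h - hh),
    ]


def extract_attention(initial_patch, is_target, min_size=8):
    """Iteratively refine the human-attentive area of an image.

    initial_patch: coarse target location, e.g. from a pre-trained
                   saliency model verified by a human (step 1).
    is_target:     callable simulating a crowd worker's judgment of
                   whether a patch can be classified as the target.
    Returns the smallest patches still judged to show the target.
    """
    frontier, attended = [initial_patch], []
    while frontier:
        patch = frontier.pop()
        if not is_target(patch):
            continue  # crowd says the target is not visible here
        if patch[2] <= min_size or patch[3] <= min_size:
            attended.append(patch)  # fine enough: keep as attention region
        else:
            frontier.extend(split(patch))  # subdivide and ask again
    return attended
```

In a real deployment, `is_target` would aggregate votes from multiple crowd workers per patch rather than consult a single oracle; the loop structure stays the same.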
Related papers
- DiffUHaul: A Training-Free Method for Object Dragging in Images [78.93531472479202]
We propose a training-free method, dubbed DiffUHaul, for the object dragging task.
We first apply attention masking in each denoising step to make the generation more disentangled across different objects.
In the early denoising steps, we interpolate the attention features between source and target images to smoothly fuse new layouts with the original appearance.
arXiv Detail & Related papers (2024-06-03T17:59:53Z)
- Mitigating Bias Using Model-Agnostic Data Attribution [2.9868610316099335]
Mitigating bias in machine learning models is a critical endeavor for ensuring fairness and equity.
We propose a novel approach to address bias by leveraging pixel image attributions to identify and regularize regions of images containing bias attributes.
arXiv Detail & Related papers (2024-05-08T13:00:56Z)
- Multilevel Saliency-Guided Self-Supervised Learning for Image Anomaly Detection [15.212031255539022]
Anomaly detection (AD) is a fundamental task in computer vision.
We propose CutSwap, which leverages saliency guidance to incorporate semantic cues for augmentation.
CutSwap achieves state-of-the-art AD performance on two mainstream AD benchmark datasets.
arXiv Detail & Related papers (2023-11-30T08:03:53Z)
- Addressing Weak Decision Boundaries in Image Classification by Leveraging Web Search and Generative Models [14.732229124148596]
One major issue among many is that machine learning models do not perform equally well for underrepresented groups.
We propose an approach that leverages the power of web search and generative models to alleviate some of the shortcomings of discriminative models.
Although we showcase our method on vulnerable populations in this study, the proposed technique is extendable to a wide range of problems and domains.
arXiv Detail & Related papers (2023-10-30T20:04:50Z)
- Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717]
We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL).
We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features.
Experiments on PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2023-02-25T04:48:11Z)
- An Experience-based Direct Generation approach to Automatic Image Cropping [0.0]
We propose a novel method to crop images directly without explicitly modeling image aesthetics.
Our model is trained on a large dataset of images cropped by experienced editors.
We show that our strategy is competitive with or performs better than existing methods in two related tasks.
arXiv Detail & Related papers (2022-12-30T06:25:27Z)
- Spuriosity Rankings: Sorting Data to Measure and Mitigate Biases [62.54519787811138]
We present a simple but effective method to measure and mitigate model biases caused by reliance on spurious cues.
We rank images within their classes based on spuriosity, proxied via deep neural features of an interpretable network.
Our results suggest that model bias due to spurious feature reliance is influenced far more by what the model is trained on than how it is trained.
arXiv Detail & Related papers (2022-12-05T23:15:43Z)
- Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets [90.61266099147053]
We investigate efficient annotation strategies for collecting multi-class classification labels for a large collection of images.
We propose modifications and best practices aimed at minimizing human labeling effort.
Simulated experiments on a 125k image subset of the ImageNet100 show that it can be annotated to 80% top-1 accuracy with 0.35 annotations per image on average.
arXiv Detail & Related papers (2021-04-26T16:29:32Z)
- Learning to Detect Important People in Unlabelled Images for Semi-supervised Important People Detection [85.91577271918783]
We propose learning important people detection on partially annotated images.
Our approach iteratively learns to assign pseudo-labels to individuals in un-annotated images.
We have collected two large-scale datasets for evaluation.
arXiv Detail & Related papers (2020-04-16T10:09:37Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.