Causal Scene BERT: Improving object detection by searching for
challenging groups of data
- URL: http://arxiv.org/abs/2202.03651v1
- Date: Tue, 8 Feb 2022 05:14:16 GMT
- Authors: Cinjon Resnick, Or Litany, Amlan Kar, Karsten Kreis, James Lucas,
Kyunghyun Cho, Sanja Fidler
- Abstract summary: Computer vision applications rely on learning-based perception modules parameterized with neural networks for tasks like object detection.
These modules frequently have low expected error overall but high error on atypical groups of data due to biases inherent in the training process.
Our main contribution is a pseudo-automatic method to discover such groups in foresight by performing causal interventions on simulated scenes.
- Score: 125.40669814080047
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern computer vision applications rely on learning-based perception modules
parameterized with neural networks for tasks like object detection. These
modules frequently have low expected error overall but high error on atypical
groups of data due to biases inherent in the training process. In building
autonomous vehicles (AV), this problem is an especially important challenge
because their perception modules are crucial to the overall system performance.
After identifying failures in AV, a human team will comb through the associated
data to group perception failures that share common causes. More data from
these groups is then collected and annotated before retraining the model to fix
the issue. In other words, error groups are found and addressed in hindsight.
Our main contribution is a pseudo-automatic method to discover such groups in
foresight by performing causal interventions on simulated scenes. To keep our
interventions on the data manifold, we utilize masked language models. We
verify that the prioritized groups found via intervention are challenging for
the object detector and show that retraining with data collected from these
groups helps inordinately compared to adding more IID data. We also plan to
release software to run interventions in simulated scenes, which we hope will
benefit the causality community.
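The abstract describes searching for challenging data groups by masking part of a simulated scene's description and resampling it with a masked language model, so the intervened scene stays plausible while the detector's error reveals which interventions hurt. As a minimal sketch of that loop, the toy Python below stands in for the paper's components: `VOCAB`, `mlm_resample`, and `toy_error` are all hypothetical placeholders (a real system would use a trained masked language model over scene tokens and an actual object detector), and only the search structure reflects the idea.

```python
import random

# Toy vocabulary of scene attribute tokens; in the paper a masked language
# model over simulated-scene descriptions plays this role (placeholder here).
VOCAB = ["sedan", "truck", "cyclist", "pedestrian", "fog", "rain", "clear"]

def mlm_resample(scene, idx, rng):
    """Mask the token at `idx` and resample a plausible replacement.

    A real system would query a trained masked language model so the
    intervened scene stays on the data manifold; here we draw uniformly
    from the remaining vocabulary as a stand-in.
    """
    candidates = [tok for tok in VOCAB if tok != scene[idx]]
    new_scene = list(scene)
    new_scene[idx] = rng.choice(candidates)
    return new_scene

def causal_intervention_search(scene, detector_error, rng, n_trials=20):
    """Rank single-token interventions by how much they raise detector error."""
    base = detector_error(scene)
    scored = []
    for _ in range(n_trials):
        idx = rng.randrange(len(scene))
        variant = mlm_resample(scene, idx, rng)
        delta = detector_error(variant) - base  # error increase vs. original
        scored.append((delta, variant))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:3]  # most challenging intervened scenes

# Hypothetical error proxy: the detector struggles in bad weather.
def toy_error(scene):
    return 0.1 + 0.4 * sum(tok in ("fog", "rain") for tok in scene)

rng = random.Random(0)
scene = ["sedan", "pedestrian", "clear"]
top = causal_intervention_search(scene, toy_error, rng)
```

The scenes returned in `top` correspond to the "prioritized groups" of the abstract: data matching them would then be collected and used for targeted retraining rather than adding more IID samples.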
Related papers
- Revisiting Multi-Granularity Representation via Group Contrastive Learning for Unsupervised Vehicle Re-identification [2.4822156881137367]
We propose an unsupervised vehicle ReID framework (MGR-GCL).
It integrates a multi-granularity CNN representation for learning discriminative transferable features.
It generates pseudo labels for the target dataset, facilitating the domain adaptation process.
arXiv Detail & Related papers (2024-10-29T02:24:36Z)
- A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap [50.079224604394]
We present a novel model-agnostic framework called Context-Enhanced Feature Alignment (CEFA).
CEFA consists of a feature alignment module and a context enhancement module.
Our method can serve as a plug-and-play module to improve the detection performance of HOI models on rare categories.
arXiv Detail & Related papers (2024-07-31T08:42:48Z)
- Adaptive Testing of Computer Vision Models [22.213542525825144]
We introduce AdaVision, an interactive process for testing vision models which helps users identify and fix coherent failure modes.
We demonstrate the usefulness and generality of AdaVision in user studies, where users find major bugs in state-of-the-art classification, object detection, and image captioning models.
arXiv Detail & Related papers (2022-12-06T05:52:31Z)
- Unpaired Referring Expression Grounding via Bidirectional Cross-Modal Matching [53.27673119360868]
Referring expression grounding is an important and challenging task in computer vision.
We propose a novel bidirectional cross-modal matching (BiCM) framework to address these challenges.
Our framework outperforms previous works by 6.55% and 9.94% on two popular grounding datasets.
arXiv Detail & Related papers (2022-01-18T01:13:19Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and the target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- Examining and Combating Spurious Features under Distribution Shift [94.31956965507085]
We define and analyze robust and spurious representations using the information-theoretic concept of minimal sufficient statistics.
We prove that even when there is only bias of the input distribution, models can still pick up spurious features from their training data.
Inspired by our analysis, we demonstrate that group DRO can fail when groups do not directly account for various spurious correlations.
arXiv Detail & Related papers (2021-06-14T05:39:09Z)
- Unsupervised Domain Adaptation of Object Detectors: A Survey [87.08473838767235]
Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications.
Learning highly accurate models relies on the availability of datasets with a large number of annotated images.
Consequently, model performance drops drastically when models are evaluated on label-scarce datasets containing visually distinct images.
arXiv Detail & Related papers (2021-05-27T23:34:06Z)
- Learning data association without data association: An EM approach to neural assignment prediction [12.970250708769708]
This paper introduces an expectation maximisation approach to train neural models for data association.
It does not require labelling information to train a model for object recognition.
Importantly, networks trained using the proposed approach can be re-used in downstream tracking applications.
arXiv Detail & Related papers (2021-05-02T01:11:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.