Generating Features with Increased Crop-related Diversity for Few-Shot
Object Detection
- URL: http://arxiv.org/abs/2304.05096v1
- Date: Tue, 11 Apr 2023 09:47:21 GMT
- Title: Generating Features with Increased Crop-related Diversity for Few-Shot
Object Detection
- Authors: Jingyi Xu, Hieu Le, Dimitris Samaras
- Abstract summary: Two-stage object detectors generate object proposals and classify them to detect objects in images.
Proposals often do not contain the objects perfectly but overlap with them in many possible ways.
We propose a novel variational autoencoder based data generation model, which is capable of generating data with increased crop-related diversity.
- Score: 35.652092907690694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Two-stage object detectors generate object proposals and classify them to
detect objects in images. These proposals often do not contain the objects
perfectly but overlap with them in many possible ways, exhibiting great
variability in the difficulty levels of the proposals. Training a robust
classifier against this crop-related variability requires abundant training
data, which is not available in few-shot settings. To mitigate this issue, we
propose a novel variational autoencoder (VAE) based data generation model,
which is capable of generating data with increased crop-related diversity. The
main idea is to transform the latent space such that latent codes with different
norms represent different crop-related variations. This allows us to generate
features with increased crop-related diversity in difficulty levels by simply
varying the latent norm. In particular, each latent code is rescaled such that
its norm linearly correlates with the IoU score of the input crop w.r.t. the
ground-truth box. Here the IoU score is a proxy that represents the difficulty
level of the crop. We train this VAE model on base classes conditioned on the
semantic code of each class and then use the trained model to generate features
for novel classes. In our experiments, our generated features consistently
improve state-of-the-art few-shot object detection methods on the PASCAL VOC
and MS COCO datasets.
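The rescaling mechanism described above can be summarized in a short sketch. The following is a minimal, hypothetical PyTorch illustration (not the authors' code) of a conditional VAE whose latent code is rescaled so that its norm linearly tracks the IoU of the input crop with its ground-truth box; at generation time, sweeping the target IoU varies the difficulty of the synthesized features. The layer sizes, the norm range (`norm_min`, `norm_max`), and all module names are illustrative assumptions, and the standard VAE reconstruction and KL training losses are omitted.
```python
import torch
import torch.nn as nn


class IoUNormVAE(nn.Module):
    """Sketch of a conditional VAE whose latent norm encodes crop difficulty (IoU)."""

    def __init__(self, feat_dim=1024, latent_dim=128, sem_dim=300,
                 norm_min=1.0, norm_max=10.0):
        super().__init__()
        # Encoder and decoder are conditioned on a class semantic code.
        self.encoder = nn.Sequential(nn.Linear(feat_dim + sem_dim, 512), nn.ReLU())
        self.fc_mu = nn.Linear(512, latent_dim)
        self.fc_logvar = nn.Linear(512, latent_dim)
        self.decoder = nn.Sequential(nn.Linear(latent_dim + sem_dim, 512), nn.ReLU(),
                                     nn.Linear(512, feat_dim))
        self.norm_min, self.norm_max = norm_min, norm_max  # assumed norm range

    def target_norm(self, iou):
        # Map IoU in [0, 1] linearly onto the chosen norm range.
        # iou: tensor of shape (B,) with values in [0, 1].
        return self.norm_min + iou * (self.norm_max - self.norm_min)

    def forward(self, feat, sem, iou):
        h = self.encoder(torch.cat([feat, sem], dim=-1))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # Rescale z so its norm linearly correlates with the crop's IoU.
        z = z / z.norm(dim=-1, keepdim=True) * self.target_norm(iou).unsqueeze(-1)
        recon = self.decoder(torch.cat([z, sem], dim=-1))
        return recon, mu, logvar  # train with reconstruction + KL losses (not shown)

    @torch.no_grad()
    def generate(self, sem, iou):
        # Sample a random latent direction, then set its norm from the desired
        # IoU to control the difficulty level of the generated feature.
        z = torch.randn(sem.size(0), self.fc_mu.out_features, device=sem.device)
        z = z / z.norm(dim=-1, keepdim=True) * self.target_norm(iou).unsqueeze(-1)
        return self.decoder(torch.cat([z, sem], dim=-1))
```
Under these assumptions, features for a novel class would be obtained by passing its semantic code and a sweep of IoU values (e.g., from 0.5 to 1.0) to `generate`, yielding crop-related variations from hard to easy for training the classifier.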
Related papers
- Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models [49.439311430360284]
We introduce a novel data synthesis method inspired by contrastive learning and image difference captioning.
Our key idea involves challenging the model to discern both matching and distinct elements.
We leverage this generated dataset to fine-tune state-of-the-art (SOTA) MLLMs.
arXiv Detail & Related papers (2024-08-08T17:10:16Z)
- CamDiff: Camouflage Image Augmentation via Diffusion Model [83.35960536063857]
CamDiff is a novel approach that leverages a latent diffusion model to synthesize salient objects in camouflaged scenes.
Our approach enables flexible editing and efficient large-scale dataset generation at a low cost.
arXiv Detail & Related papers (2023-04-11T19:37:47Z)
- MixTeacher: Mining Promising Labels with Mixed Scale Teacher for Semi-Supervised Object Detection [22.047246997864143]
Scale variation across object instances remains a key challenge in object detection.
We propose a novel framework that addresses the scale variation problem by introducing a mixed scale teacher.
Our experiments on MS COCO and PASCAL VOC benchmarks under various semi-supervised settings demonstrate that our method achieves new state-of-the-art performance.
arXiv Detail & Related papers (2023-03-16T03:37:54Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We reasonably estimate intra-class variations for every class and generate adaptive synthetic samples to support hard sample mining.
Our method outperforms state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize both seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Balancing Constraints and Submodularity in Data Subset Selection [43.03720397062461]
We show that one can achieve similar accuracy to traditional deep-learning models, while using less training data.
We propose a novel diversity-driven objective function, with balancing constraints on class labels and decision boundaries enforced using matroids.
arXiv Detail & Related papers (2021-04-26T19:22:27Z)
- Multi-scale Interactive Network for Salient Object Detection [91.43066633305662]
We propose the aggregate interaction modules to integrate the features from adjacent levels.
To obtain more efficient multi-scale features, the self-interaction modules are embedded in each decoder unit.
Experimental results on five benchmark datasets demonstrate that the proposed method without any post-processing performs favorably against 23 state-of-the-art approaches.
arXiv Detail & Related papers (2020-07-17T15:41:37Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large-scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision levels.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.