Inverse Image Frequency for Long-tailed Image Recognition
- URL: http://arxiv.org/abs/2209.04861v2
- Date: Sat, 7 Oct 2023 12:15:00 GMT
- Title: Inverse Image Frequency for Long-tailed Image Recognition
- Authors: Konstantinos Panagiotis Alexandridis and Shan Luo and Anh Nguyen and Jiankang Deng and Stefanos Zafeiriou
- Abstract summary: We propose a novel de-biasing method named Inverse Image Frequency (IIF).
IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network.
Our experiments show that IIF surpasses the state of the art on many long-tailed benchmarks.
- Score: 59.40098825416675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The long-tailed distribution is a common phenomenon in the real world.
Extracted large scale image datasets inevitably demonstrate the long-tailed
property and models trained with imbalanced data can obtain high performance
for the over-represented categories, but struggle for the under-represented
categories, leading to biased predictions and performance degradation. To
address this challenge, we propose a novel de-biasing method named Inverse
Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation
of the logits in the classification layer of a convolutional neural network.
Our method achieves stronger performance than similar works and it is
especially useful for downstream tasks such as long-tailed instance
segmentation as it produces fewer false positive detections. Our extensive
experiments show that IIF surpasses the state of the art on many long-tailed
benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8%
top-1 accuracy with ResNet50 on ImageNet-LT and 26.2% segmentation AP with
MaskRCNN on LVIS. Code available at https://github.com/kostas1515/iif
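The idea of a multiplicative adjustment of the logits can be illustrated with a small sketch. It is not the authors' implementation: the weight form -log(n_c / N), chosen by analogy with inverse document frequency, is an assumption, and the class counts, logits, and function names below are hypothetical. The abstract does not state the exact formula; the linked repository is the reference.

```python
import math


def iif_weights(class_counts):
    """Assumed weight per class: -log(n_c / N). Rarer classes get larger weights."""
    total = sum(class_counts)
    return [-math.log(count / total) for count in class_counts]


def adjust_logits(logits, weights):
    """Multiplicative adjustment: scale each class logit by its class weight."""
    return [z * w for z, w in zip(logits, weights)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    norm = sum(exps)
    return [e / norm for e in exps]


# Hypothetical long-tailed training counts: head, medium, and tail class.
counts = [900, 90, 10]
weights = iif_weights(counts)

# Hypothetical raw logits in which the head class narrowly beats the tail class.
logits = [2.0, 1.0, 1.9]

plain = softmax(logits)
adjusted = softmax(adjust_logits(logits, weights))

print("predicted class without adjustment:", plain.index(max(plain)))
print("predicted class with adjustment:", adjusted.index(max(adjusted)))
```

With these made-up numbers the head class wins before the adjustment and the tail class wins after it. Because the adjustment is a scaling, a negative logit of a rare class is pushed further down rather than up, which is one way a multiplicative margin differs from an additive one.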
Related papers
- Misalignment-Robust Frequency Distribution Loss for Image Transformation [51.0462138717502]
This paper aims to address a common challenge in deep learning-based image transformation methods, such as image enhancement and super-resolution.
We introduce a novel and simple Frequency Distribution Loss (FDL) for computing distribution distance within the frequency domain.
Our method is empirically proven effective as a training constraint due to the thoughtful utilization of global information in the frequency domain.
arXiv Detail & Related papers (2024-02-28T09:27:41Z)
- Target-aware Bi-Transformer for Few-shot Segmentation [4.3753381458828695]
Few-shot semantic segmentation (FSS) aims to use limited labeled support images to identify the segmentation of new classes of objects.
In this paper, we propose the Target-aware Bi-Transformer Network (TBTNet) to treat support images and the query image equivalently.
A vigorous Target-aware Transformer Layer (TTL) is also designed to distill correlations and force the model to focus on foreground information.
arXiv Detail & Related papers (2023-09-18T05:28:51Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- ImageNet-Hard: The Hardest Images Remaining from a Study of the Power of Zoom and Spatial Biases in Image Classification [9.779748872936912]
We show that proper framing of the input image can lead to the correct classification of 98.91% of ImageNet images.
We propose a test-time augmentation (TTA) technique that improves classification accuracy by forcing models to explicitly perform zoom-in operations.
arXiv Detail & Related papers (2023-04-11T23:55:50Z)
- Improving GANs for Long-Tailed Data through Group Spectral Regularization [51.58250647277375]
We propose a novel group Spectral Regularizer (gSR) that prevents spectral explosion, alleviating mode collapse.
We find that gSR effectively combines with existing augmentation and regularization techniques, leading to state-of-the-art image generation performance on long-tailed data.
arXiv Detail & Related papers (2022-08-21T17:51:05Z)
- Revisiting Global Statistics Aggregation for Improving Image Restoration [8.803962179239385]
The Test-time Local Statistics Converter (TLSC) significantly improves an image restorer's performance.
By extending SE with TLSC to state-of-the-art models, MPRNet improves by 0.65 dB in PSNR on the GoPro dataset, reaching 33.31 dB and exceeding the previous best result by 0.6 dB.
arXiv Detail & Related papers (2021-12-08T12:52:14Z)
- VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition [61.75391989107558]
We present a visual-linguistic long-tailed recognition framework, termed VL-LTR.
Our method can learn visual representation from images and corresponding linguistic representation from noisy class-level text descriptions.
Notably, our method achieves 77.2% overall accuracy on ImageNet-LT, which significantly outperforms the previous best method by over 17 points.
arXiv Detail & Related papers (2021-11-26T16:24:03Z)
- Scalable Visual Transformers with Hierarchical Pooling [61.05787583247392]
We propose a Hierarchical Visual Transformer (HVT) which progressively pools visual tokens to shrink the sequence length.
It brings a great benefit by scaling dimensions of depth/width/resolution/patch size without introducing extra computational complexity.
Our HVT outperforms the competitive baselines on ImageNet and CIFAR-100 datasets.
arXiv Detail & Related papers (2021-03-19T03:55:58Z)
- Image Segmentation Using Hybrid Representations [2.414172101538764]
We introduce an end-to-end U-Net based network called DU-Net for medical image segmentation.
SC are translation invariant and Lipschitz continuous to deformations, which helps DU-Net outperform other conventional CNN counterparts.
The proposed method shows remarkable improvement over the basic U-Net with performance competitive to state-of-the-art methods.
arXiv Detail & Related papers (2020-04-15T13:07:35Z)