FocalCount: Towards Class-Count Imbalance in Class-Agnostic Counting
- URL: http://arxiv.org/abs/2502.10677v1
- Date: Sat, 15 Feb 2025 05:24:43 GMT
- Title: FocalCount: Towards Class-Count Imbalance in Class-Agnostic Counting
- Authors: Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Xian Zhong, Shengfeng He
- Abstract summary: In class-agnostic object counting, the goal is to estimate the total number of object instances in an image without distinguishing between specific categories.
Our approach significantly improves the model's ability to distinguish between specific classes and general counts.
- Score: 34.25988613929425
- Abstract: In class-agnostic object counting, the goal is to estimate the total number of object instances in an image without distinguishing between specific categories. Existing methods often predict this count without considering class-specific outputs, leading to inaccuracies when such outputs are required. These inaccuracies stem from two key challenges: 1) the prevalence of single-category images in datasets, which leads models to generalize specific categories as representative of all objects, and 2) the use of mean squared error loss during training, which applies uniform penalization. This uniform penalty disregards errors in less frequent categories, particularly when these errors contribute minimally to the overall loss. To address these issues, we propose {FocalCount}, a novel approach that leverages diverse feature attributes to estimate the number of object categories in an image. This estimate serves as a weighted factor to correct class-count imbalances. Additionally, we introduce {Focal-MSE}, a new loss function that integrates binary cross-entropy to generate stronger error gradients, enhancing the model's sensitivity to errors in underrepresented categories. Our approach significantly improves the model's ability to distinguish between specific classes and general counts, demonstrating superior performance and scalability in both few-shot and zero-shot scenarios across three object counting datasets. The code will be released soon.
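The abstract does not spell out Focal-MSE's exact form, but its description (MSE augmented with a binary cross-entropy term to generate stronger error gradients) suggests a sketch along the following lines. The `focal_mse` function and the presence-mask construction are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def focal_mse(pred, gt, eps=1e-8):
    """Hypothetical Focal-MSE sketch: pixel-wise MSE on the density map
    plus a BCE term on a binarized object-presence mask, so that errors
    in sparsely populated regions produce stronger gradients."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    mse = np.mean((pred - gt) ** 2)
    # Presence mask: a pixel "contains" objects where GT density is positive.
    mask = (gt > 0).astype(float)
    # Read the prediction as a presence probability, clipped for log stability.
    p = np.clip(pred, eps, 1.0 - eps)
    bce = -np.mean(mask * np.log(p) + (1.0 - mask) * np.log(1.0 - p))
    return mse + bce
```

Because the BCE term penalizes misplaced mass even where the squared error is tiny, a sketch like this raises the gradient contribution of underrepresented regions relative to plain MSE.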
Related papers
- Understanding the Detrimental Class-level Effects of Data Augmentation [63.1733767714073]
Achieving optimal average accuracy can come at the cost of hurting individual class accuracy by as much as 20% on ImageNet.
We present a framework for understanding how DA interacts with class-level learning dynamics.
We show that simple class-conditional augmentation strategies improve performance on the negatively affected classes.
arXiv Detail & Related papers (2023-12-07T18:37:43Z)
- Deep Imbalanced Regression via Hierarchical Classification Adjustment [50.19438850112964]
Regression tasks in computer vision are often formulated into classification by quantizing the target space into classes.
The majority of training samples lie in a head range of target values, while a minority of samples span a usually larger tail range.
We propose to construct hierarchical classifiers for solving imbalanced regression tasks.
Our novel hierarchical classification adjustment (HCA) for imbalanced regression shows superior results on three diverse tasks.
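The quantize-then-classify idea above can be illustrated with a minimal sketch (a hypothetical two-level binning, not the paper's HCA adjustment itself): each regression target is assigned a coarse class over wide value ranges and a fine class within its coarse range.

```python
import numpy as np

def hierarchical_bins(y, coarse=4, fine=4, lo=0.0, hi=100.0):
    """Illustrative hierarchy: a coarse bin over wide target ranges,
    then a fine bin within each coarse range. Head-range samples spread
    over many fine bins while tail-range samples still get coarse labels."""
    y = np.asarray(y, dtype=float)
    width = (hi - lo) / coarse
    c = np.clip(((y - lo) / width).astype(int), 0, coarse - 1)
    f = np.clip(((y - lo - c * width) / (width / fine)).astype(int), 0, fine - 1)
    return c, f
```

A hierarchical classifier would then predict the coarse label first and refine it with the fine label, rather than regressing the raw value directly.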
arXiv Detail & Related papers (2023-10-26T04:54:39Z)
- No One Left Behind: Improving the Worst Categories in Long-Tailed Learning [29.89394406438639]
We argue that under such an evaluation setting, some categories are inevitably sacrificed.
We propose a simple plug-in method that is applicable to a wide range of methods.
arXiv Detail & Related papers (2023-03-07T03:24:54Z)
- Generalization Bounds for Few-Shot Transfer Learning with Pretrained Classifiers [26.844410679685424]
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.
We show that the few-shot error of the learned feature map on new classes is small in case of class-feature-variability collapse.
arXiv Detail & Related papers (2022-12-23T18:46:05Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- On Clustering Categories of Categorical Predictors in Generalized Linear Models [0.0]
We propose a method to reduce the complexity of Generalized Linear Models in the presence of categorical predictors.
The traditional one-hot encoding, where each category is represented by a dummy variable, can be wasteful, difficult to interpret, and prone to overfitting.
This paper addresses these challenges by finding a reduced representation of the categorical predictors by clustering their categories.
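One way to picture category clustering (a hypothetical sketch for illustration, not the paper's algorithm) is to group the levels of a categorical predictor by their mean response, so that levels with similar effects share one dummy variable instead of one each:

```python
import numpy as np

def cluster_categories(cats, y, n_clusters=2):
    """Toy clustering of categorical levels: sort levels by their mean
    response and split them into contiguous groups, shrinking the
    one-hot encoding from len(levels) to n_clusters dummies."""
    cats, y = np.asarray(cats), np.asarray(y, dtype=float)
    levels = np.unique(cats)
    means = np.array([y[cats == c].mean() for c in levels])
    order = np.argsort(means)
    groups = np.array_split(order, n_clusters)
    # Map each original level to the id of its group.
    return {levels[idx]: g_id for g_id, g in enumerate(groups) for idx in g}
```

Levels with nearly identical effects on the response end up in the same group, which is the interpretability and parsimony gain the summary describes.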
arXiv Detail & Related papers (2021-10-19T15:36:35Z)
- Redesigning the classification layer by randomizing the class representation vectors [12.953517767147998]
We analyze how simple design choices for the classification layer affect the learning dynamics.
We show that the standard cross-entropy training implicitly captures visual similarities between different classes.
We propose to draw the class vectors randomly and set them as fixed during training, thus invalidating the visual similarities encoded in these vectors.
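The fixed-random-class-vector design can be sketched in a few lines (a minimal NumPy illustration; the dimensions and the unit-norm choice are assumptions): the class vectors are drawn once and never updated, so only the feature extractor learns, and no inter-class visual similarity can be encoded in the head.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, feat_dim = 10, 64

# Draw the class representation vectors once and freeze them:
# the classification head is never trained.
W = rng.standard_normal((num_classes, feat_dim))
W /= np.linalg.norm(W, axis=1, keepdims=True)

def logits(features):
    # Only the feature extractor producing `features` would be trained,
    # pulling each sample's feature toward its class's fixed vector.
    return features @ W.T
```

Training then reduces to fitting features to these fixed random targets, which is precisely what invalidates any similarity structure the head could otherwise learn.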
arXiv Detail & Related papers (2020-11-16T13:45:23Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Closing the Generalization Gap in One-Shot Object Detection [92.82028853413516]
We show that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories.
Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories.
arXiv Detail & Related papers (2020-11-09T09:31:17Z)
- Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives.
A ranked range is a consecutive sequence of sorted values of a set of real numbers.
We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
arXiv Detail & Related papers (2020-10-05T01:58:32Z)
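The ranked-range construction above is concrete enough to sketch directly. `sorr` below follows the definition in the summary, summing the (k+1)-th through m-th largest values; an AoRR-style aggregate loss would simply average such a range of per-sample losses (the function name and interface are ours, not the paper's):

```python
import numpy as np

def sorr(values, k, m):
    """Sum of ranked range: the sum of the (k+1)-th through m-th largest
    values. k = 0, m = n gives the ordinary sum; k = 0, m = 1 gives the
    maximum; k > 0 discards the top-k values as potential outliers."""
    s = np.sort(np.asarray(values, dtype=float))[::-1]  # sort descending
    return float(s[k:m].sum())

losses = [5.0, 1.0, 3.0, 2.0, 4.0]
print(sorr(losses, 0, 5))  # 15.0 (ordinary sum)
print(sorr(losses, 1, 4))  # 9.0  (drop the largest, keep ranks 2-4: 4+3+2)
```

Choosing k and m thus interpolates between average-style and max-style objectives while ignoring the most extreme losses.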
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.