No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
- URL: http://arxiv.org/abs/2509.00760v1
- Date: Sun, 31 Aug 2025 09:23:15 GMT
- Title: No More Sibling Rivalry: Debiasing Human-Object Interaction Detection
- Authors: Bin Yang, Yulin Zhang, Hong-Yu Zhou, Sibei Yang
- Abstract summary: This study identifies a critical issue, the "Toxic Siblings" bias, which hinders the interaction decoder's learning. This bias arises from high confusion among sibling triplets/categories, where increased similarity paradoxically reduces precision. We propose two novel debiasing learning objectives, "contrastive-then-calibration" and "merge-then-split".
- Score: 47.554732714656296
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Detection transformers have been applied to human-object interaction (HOI) detection, enhancing the localization and recognition of human-action-object triplets in images. Despite remarkable progress, this study identifies a critical issue, the "Toxic Siblings" bias, which hinders the interaction decoder's learning: numerous similar yet distinct HOI triplets interfere with, and even compete against, each other on both the input side and the output side of the interaction decoder. This bias arises from high confusion among sibling triplets/categories, where increased similarity paradoxically reduces precision, as one's gain comes at the expense of its toxic sibling's decline. To address this, we propose two novel debiasing learning objectives, "contrastive-then-calibration" and "merge-then-split", targeting the input and output perspectives, respectively. The former samples sibling-like incorrect HOI triplets and reconstructs them into correct ones, guided by strong positional priors. The latter first learns shared features among sibling categories to distinguish them from other groups, then explicitly refines intra-group differentiation to preserve uniqueness. Experiments show that we significantly outperform both the baseline (+9.18% mAP on HICO-Det) and the state-of-the-art (+3.59% mAP) across various settings.
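The "merge-then-split" idea in the abstract (first separate sibling groups from other groups, then separate siblings within a group) can be illustrated with a small two-level classification loss. This is only a sketch under stated assumptions, not the paper's objective: the abstract does not give the loss formulas, so the function names, the `groups` mapping from category to sibling group, and the weights `w_merge` and `w_split` are all hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def merge_then_split_loss(logits, target, groups, w_merge=1.0, w_split=1.0):
    """Illustrative two-level loss (an assumption, not the paper's formula).

    logits : per-category scores for one interaction query
    target : index of the ground-truth category
    groups : groups[i] is the hypothetical sibling-group id of category i
    """
    log_p = log_softmax(logits)
    siblings = [i for i, g in enumerate(groups) if g == groups[target]]

    # Merge step: treat all siblings as one class and score the whole group.
    m = max(log_p[i] for i in siblings)
    log_p_group = m + math.log(sum(math.exp(log_p[i] - m) for i in siblings))
    merge_loss = -log_p_group

    # Split step: score the target against its siblings only.
    split_loss = -(log_p[target] - log_p_group)

    total = w_merge * merge_loss + w_split * split_loss
    return total, merge_loss, split_loss


# Hypothetical example: categories 0 and 1 are siblings, 2 and 3 are siblings.
example_logits = [2.0, 1.5, -1.0, 0.0]
example_groups = [0, 0, 1, 1]
total, merge_loss, split_loss = merge_then_split_loss(
    example_logits, target=0, groups=example_groups
)
```

With both weights set to 1, the two terms add up to the ordinary cross-entropy on the target category, so in this sketch any difference from a flat classifier comes from weighting or scheduling the two terms differently. How the paper actually forms the groups and combines the stages is not stated in the abstract.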
Related papers
- Mitigate One, Skew Another? Tackling Intersectional Biases in Text-to-Image Models [73.20190633746442]
We introduce BiasConnect, a novel tool for analyzing and quantifying bias interactions in text-to-image models. We propose InterMit, an intersectional bias mitigation algorithm guided by user-defined target distributions and priority weights.
arXiv Detail & Related papers (2025-05-22T20:56:38Z) - Mitigating Spurious Negative Pairs for Robust Industrial Anomaly Detection [9.93548802132951]
The robustness of existing detection methods against adversarial attacks remains a challenge, compromising their reliability in real-world applications such as autonomous driving. We propose a pseudo-anomaly group derived from normal group samples as an ideal objective function for adversarial training in AD. We show that spurious negative pairs compromise the conventional contrastive loss to achieve robust AD.
arXiv Detail & Related papers (2023-10-24T07:22:17Z) - I$^2$MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation [147.2183428328396]
We introduce a general Inter- and Intra-modal Mutual Distillation (I$^2$MD) framework.
In I$^2$MD, we first re-formulate the cross-modal interaction as a Cross-modal Mutual Distillation (CMD) process.
To alleviate the interference of similar samples and exploit their underlying contexts, we further design the Intra-modal Mutual Distillation (IMD) strategy.
arXiv Detail & Related papers (2023-10-24T07:22:17Z) - Human-Object Interaction Detection via Disentangled Transformer [63.46358684341105]
We present Disentangled Transformer, where both encoder and decoder are disentangled to facilitate learning of two sub-tasks.
Our method outperforms prior work on two public HOI benchmarks by a sizeable margin.
arXiv Detail & Related papers (2022-04-20T08:15:04Z) - Analyzing and Mitigating Interference in Neural Architecture Search [96.60805562853153]
We investigate the interference issue by sampling different child models and calculating the gradient similarity of shared operators.
Inspired by these two observations, we propose two approaches to mitigate the interference.
Our searched architecture outperforms RoBERTa$_{\rm base}$ by 1.1 and 0.6 scores and ELECTRA$_{\rm base}$ by 1.6 and 1.1 scores on the dev and test set of GLUE benchmark.
arXiv Detail & Related papers (2021-08-29T11:07:46Z) - ICE: Inter-instance Contrastive Encoding for Unsupervised Person Re-identification [7.766663319126491]
Unsupervised person re-identification (ReID) aims at learning discriminative identity features without annotations.
We propose Inter-instance Contrastive Encoding (ICE) that leverages inter-instance pairwise similarity scores to boost previous class-level contrastive ReID methods.
Experiments on several large-scale person ReID datasets validate the effectiveness of our proposed unsupervised method ICE.
arXiv Detail & Related papers (2021-03-30T14:05:09Z) - First Target and Opinion then Polarity: Enhancing Target-opinion Correlation for Aspect Sentiment Triplet Extraction [45.82241446769157]
Aspect Sentiment Triplet Extraction (ASTE) aims to extract triplets from a sentence, including target entities, associated sentiment polarities, and opinion spans which rationalize the polarities.
Existing methods are short on building correlation between target-opinion pairs, and neglect the mutual interference among different sentiment triplets.
We propose a novel two-stage method which enhances the correlation between targets and opinions through sequence tagging.
arXiv Detail & Related papers (2021-02-17T03:28:17Z)