Learning of Visual Relations: The Devil is in the Tails
- URL: http://arxiv.org/abs/2108.09668v1
- Date: Sun, 22 Aug 2021 08:59:35 GMT
- Title: Learning of Visual Relations: The Devil is in the Tails
- Authors: Alakh Desai, Tz-Ying Wu, Subarna Tripathi, Nuno Vasconcelos
- Abstract summary: Visual relation learning is a long-tailed problem, due to the combinatorial nature of joint reasoning about groups of objects.
In this paper, we explore an alternative hypothesis, denoted the Devil is in the Tails.
Under this hypothesis, better performance is achieved by keeping the model simple but improving its ability to cope with long-tailed distributions.
- Score: 59.737494875502215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Significant effort has been recently devoted to modeling visual relations.
This has mostly addressed the design of architectures, typically by adding
parameters and increasing model complexity. However, visual relation learning
is a long-tailed problem, due to the combinatorial nature of joint reasoning
about groups of objects. Increasing model complexity is, in general, ill-suited
for long-tailed problems due to their tendency to overfit. In this paper, we
explore an alternative hypothesis, denoted the Devil is in the Tails. Under
this hypothesis, better performance is achieved by keeping the model simple but
improving its ability to cope with long-tailed distributions. To test this
hypothesis, we devise a new approach for training visual relationship models,
which is inspired by state-of-the-art long-tailed recognition literature. This
is based on an iterative decoupled training scheme, denoted Decoupled Training
for Devil in the Tails (DT2). DT2 employs a novel sampling approach,
Alternating Class-Balanced Sampling (ACBS), to capture the interplay between
the long-tailed entity and predicate distributions of visual relations. Results
show that, with an extremely simple architecture, DT2-ACBS significantly
outperforms much more complex state-of-the-art methods on scene graph
generation tasks. This suggests that the development of sophisticated models
must be considered in tandem with the long-tailed nature of the problem.
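The abstract's key ingredient is class-balanced sampling, which ACBS alternates between the entity and predicate label spaces. A minimal sketch of the class-balanced building block is below; the function names and structure are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

def class_balanced_sampler(labels, rng):
    """Yield sample indices so every class is drawn with equal probability,
    regardless of how many samples it has (tail classes get upweighted)."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    classes = list(by_class)
    while True:
        cls = rng.choice(classes)          # pick a class uniformly
        yield rng.choice(by_class[cls])    # then a sample within that class

# Toy long-tailed label set: 90% head, 10% tail.
labels = ["head"] * 90 + ["tail"] * 10
gen = class_balanced_sampler(labels, random.Random(0))
draws = [labels[next(gen)] for _ in range(1000)]
# Under class-balanced sampling, "tail" is drawn ~50% of the time,
# far above its 10% frequency in the data.
```

In the ACBS setting described above, one would presumably alternate this sampler over entity labels and over predicate labels across training phases; the decoupled scheme (DT2) then retrains parts of the model under each regime.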
Related papers
- Data-efficient Large Vision Models through Sequential Autoregression [58.26179273091461]
We develop an efficient, autoregression-based vision model on a limited dataset.
We demonstrate how this model achieves proficiency in a spectrum of visual tasks spanning both high-level and low-level semantic understanding.
Our empirical evaluations underscore the model's agility in adapting to various tasks, heralding a significant reduction in the parameter footprint.
arXiv Detail & Related papers (2024-02-07T13:41:53Z) - RelVAE: Generative Pretraining for few-shot Visual Relationship Detection [2.2230760534775915]
We present the first pretraining method for few-shot predicate classification that does not require any annotated relations.
We construct few-shot training splits and show quantitative experiments on VG200 and VRD datasets.
arXiv Detail & Related papers (2023-11-27T19:08:08Z) - Orthogonal Uncertainty Representation of Data Manifold for Robust Long-Tailed Learning [52.021899899683675]
In scenarios with long-tailed distributions, the model's ability to identify tail classes is limited due to the under-representation of tail samples.
We propose an Orthogonal Uncertainty Representation (OUR) of feature embedding and an end-to-end training strategy to improve model robustness under long-tailed distributions.
arXiv Detail & Related papers (2023-10-16T05:50:34Z) - Alleviating the Effect of Data Imbalance on Adversarial Training [26.36714114672729]
We study adversarial training on datasets that obey the long-tailed distribution.
We propose a new adversarial training framework -- Re-balancing Adversarial Training (REAT)
arXiv Detail & Related papers (2023-07-14T07:01:48Z) - Improving Tail-Class Representation with Centroid Contrastive Learning [145.73991900239017]
We propose interpolative centroid contrastive learning (ICCL) to improve long-tailed representation learning.
ICCL interpolates two images, one from a class-agnostic sampler and one from a class-aware sampler, and trains the model such that the interpolated representation can be used to retrieve the centroids of both source classes.
Our result shows a significant accuracy gain of 2.8% on the iNaturalist 2018 dataset with a real-world long-tailed distribution.
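The ICCL summary above describes pairing a class-agnostic sampler (uniform over samples, hence head-dominated) with a class-aware sampler (uniform over classes, hence tail-friendly) and blending the two draws mixup-style. A toy sketch of that sampling step follows; all names and the Beta-distributed coefficient are assumptions for illustration, not the paper's implementation.

```python
import random

def class_aware_pick(by_class, rng):
    """Uniform over classes, then uniform within the chosen class."""
    cls = rng.choice(list(by_class))
    return cls, rng.choice(by_class[cls])

def iccl_pair(features, labels, by_class, rng, alpha=0.5):
    """Draw one class-agnostic and one class-aware sample and interpolate."""
    i = rng.randrange(len(features))            # class-agnostic draw
    cls_b, j = class_aware_pick(by_class, rng)  # class-aware draw
    lam = rng.betavariate(alpha, alpha)         # mixup-style coefficient
    mixed = [lam * a + (1 - lam) * b
             for a, b in zip(features[i], features[j])]
    return mixed, (labels[i], cls_b), lam

# Toy data: 2-D features, long-tailed labels.
features = [[float(k), float(k)] for k in range(100)]
labels = ["head"] * 90 + ["tail"] * 10
by_class = {"head": list(range(90)), "tail": list(range(90, 100))}
mixed, (lab_a, lab_b), lam = iccl_pair(features, labels, by_class,
                                       random.Random(0))
```

The interpolated feature would then feed a contrastive loss that pulls it toward the centroids of both `lab_a` and `lab_b`, which is how the summary says tail-class representations get enriched.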
arXiv Detail & Related papers (2021-10-19T15:24:48Z) - Exploring Task Difficulty for Few-Shot Relation Extraction [22.585574542329677]
Few-shot relation extraction (FSRE) focuses on recognizing novel relations by learning with merely a handful of annotated instances.
We introduce a novel approach based on contrastive learning that learns better representations by exploiting relation label information.
arXiv Detail & Related papers (2021-09-12T09:40:33Z) - A Multi-Level Attention Model for Evidence-Based Fact Checking [58.95413968110558]
We present a simple model that can be trained on sequence structures.
Results on a large-scale dataset for Fact Extraction and VERification show that our model outperforms the graph-based approaches.
arXiv Detail & Related papers (2021-06-02T05:40:12Z) - RH-Net: Improving Neural Relation Extraction via Reinforcement Learning and Hierarchical Relational Searching [2.1828601975620257]
We propose a novel framework named RH-Net, which utilizes reinforcement learning and a hierarchical relational searching module to improve relation extraction.
We then propose the hierarchical relational searching module to share the semantics from correlative instances between data-rich and data-poor classes.
arXiv Detail & Related papers (2020-10-27T12:50:27Z) - Convolutional Tensor-Train LSTM for Spatio-temporal Learning [116.24172387469994]
We propose a higher-order LSTM model that can efficiently learn long-term correlations in the video sequence.
This is accomplished through a novel tensor train module that performs prediction by combining convolutional features across time.
Our results achieve state-of-the-art performance in a wide range of applications and datasets.
arXiv Detail & Related papers (2020-02-21T05:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.