D3: Data Diversity Design for Systematic Generalization in Visual
Question Answering
- URL: http://arxiv.org/abs/2309.08798v1
- Date: Fri, 15 Sep 2023 22:45:02 GMT
- Title: D3: Data Diversity Design for Systematic Generalization in Visual
Question Answering
- Authors: Amir Rahimi, Vanessa D'Amario, Moyuru Yamada, Kentaro Takemoto,
Tomotake Sasaki, Xavier Boix
- Abstract summary: We show that the diversity of simple tasks plays a key role in achieving systematic generalization.
This implies that it may not be essential to gather a large and varied number of complex tasks, which could be costly to obtain.
- Score: 6.392972407599867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Systematic generalization is a crucial aspect of intelligence, which refers
to the ability to generalize to novel tasks by combining known subtasks and
concepts. One critical factor that has been shown to influence systematic
generalization is the diversity of training data. However, diversity can be
defined in various ways, as data have many factors of variation. A more
granular understanding of how different aspects of data diversity affect
systematic generalization is lacking. We present new evidence in the problem of
Visual Question Answering (VQA) that reveals that the diversity of simple tasks
(i.e. tasks formed by a few subtasks and concepts) plays a key role in
achieving systematic generalization. This implies that it may not be essential
to gather a large and varied number of complex tasks, which could be costly to
obtain. We demonstrate that this result is independent of the similarity
between the training and testing data and applies to well-known families of
neural network architectures for VQA (i.e. monolithic architectures and neural
module networks). Additionally, we observe that neural module networks leverage
all forms of data diversity we evaluated, while monolithic architectures
require more extensive amounts of data to do so. These findings provide a first
step towards understanding the interactions between data diversity design,
neural network architectures, and systematic generalization capabilities.
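The abstract's notion of "simple tasks formed by a few subtasks and concepts" can be made concrete with a toy enumeration. This is a hypothetical sketch (the subtask names and the diversity measure are illustrative assumptions, not the paper's actual task definitions): diversity is counted as the number of distinct subtask combinations present in training.

```python
from itertools import combinations

# Hypothetical VQA subtasks; the paper's actual subtask inventory may differ.
SUBTASKS = ["filter_color", "filter_shape", "count", "compare", "exist"]

def make_tasks(max_len):
    """Enumerate tasks as all subtask combinations of size 1..max_len."""
    return [c for r in range(1, max_len + 1)
            for c in combinations(SUBTASKS, r)]

# Simple tasks compose 1-2 subtasks; complex tasks compose 3.
simple_tasks = make_tasks(2)
complex_tasks = [c for c in make_tasks(3) if len(c) == 3]

# One crude diversity proxy: the count of distinct combinations seen.
print(len(simple_tasks))   # 15 distinct simple tasks
print(len(complex_tasks))  # 10 distinct complex tasks
```

Under this toy measure, the paper's finding would suggest that covering many of the 15 simple combinations matters more for systematic generalization than exhaustively collecting the costlier complex ones.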
Related papers
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z)
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Attribute Diversity Determines the Systematicity Gap in VQA [7.433031036510163]
We study the systematicity gap in visual question answering.
We find that increased quantity of training data does not reduce the systematicity gap.
In all, our experiments suggest that the more distinct attribute type combinations are seen during training, the more systematic we can expect the resulting model to be.
arXiv Detail & Related papers (2023-11-15T04:50:30Z)
- Leveraging sparse and shared feature activations for disentangled representation learning [112.22699167017471]
We propose to leverage knowledge extracted from a diversified set of supervised tasks to learn a common disentangled representation.
We validate our approach on six real world distribution shift benchmarks, and different data modalities.
arXiv Detail & Related papers (2023-04-17T01:33:24Z)
- Synergistic information supports modality integration and flexible learning in neural networks solving multiple tasks [107.8565143456161]
We investigate the information processing strategies adopted by simple artificial neural networks performing a variety of cognitive tasks.
Results show that synergy increases as neural networks learn multiple diverse tasks.
Randomly turning off neurons during training through dropout increases network redundancy, corresponding to an increase in robustness.
arXiv Detail & Related papers (2022-10-06T15:36:27Z)
- Regressing Relative Fine-Grained Change for Sub-Groups in Unreliable Heterogeneous Data Through Deep Multi-Task Metric Learning [0.5999777817331317]
We investigate how techniques in multi-task metric learning can be applied for the regression of fine-grained change in real data.
The techniques investigated are specifically tailored for handling heterogeneous data sources.
arXiv Detail & Related papers (2022-08-11T12:57:11Z)
- Diversity and Generalization in Neural Network Ensembles [0.0]
We combine and expand previously published results in a theoretically sound framework that describes the relationship between diversity and ensemble performance.
We provide sound answers to the following questions: how to measure diversity, how diversity relates to the generalization error of an ensemble, and how diversity is promoted by neural network ensemble algorithms.
arXiv Detail & Related papers (2021-10-26T15:41:10Z)
- SYGMA: System for Generalizable Modular Question Answering Over Knowledge Bases [57.89642289610301]
We present SYGMA, a modular approach facilitating generalizability across multiple knowledge bases and multiple reasoning types.
We demonstrate the effectiveness of our system by evaluating on datasets belonging to two distinct knowledge bases, DBpedia and Wikidata.
arXiv Detail & Related papers (2021-09-28T01:57:56Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics [131.93113552146195]
We present a new dataset, Handwritten arithmetic with INTegers (HINT), to examine machines' capability of learning generalizable concepts.
In HINT, machines are tasked with learning how concepts are perceived from raw signals such as images.
We undertake extensive experiments with various sequence-to-sequence models, including RNNs, Transformers, and GPT-3.
arXiv Detail & Related papers (2021-03-02T01:32:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.