Latent Compositional Representations Improve Systematic Generalization
in Grounded Question Answering
- URL: http://arxiv.org/abs/2007.00266v3
- Date: Tue, 10 Nov 2020 06:22:21 GMT
- Title: Latent Compositional Representations Improve Systematic Generalization
in Grounded Question Answering
- Authors: Ben Bogin, Sanjay Subramanian, Matt Gardner, Jonathan Berant
- Abstract summary: State-of-the-art models in grounded question answering often do not explicitly perform decomposition.
We propose a model that computes a representation and denotation for all question spans in a bottom-up, compositional manner.
Our model induces latent trees, driven by end-to-end (the answer) only.
- Score: 46.87501300706542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Answering questions that involve multi-step reasoning requires decomposing
them and using the answers of intermediate steps to reach the final answer.
However, state-of-the-art models in grounded question answering often do not
explicitly perform decomposition, leading to difficulties in generalization to
out-of-distribution examples. In this work, we propose a model that computes a
representation and denotation for all question spans in a bottom-up,
compositional manner using a CKY-style parser. Our model induces latent trees,
driven by end-to-end (the answer) supervision only. We show that this inductive
bias towards tree structures dramatically improves systematic generalization to
out-of-distribution examples, compared to strong baselines on an arithmetic
expressions benchmark as well as on CLOSURE, a dataset that focuses on
systematic generalization for grounded question answering. On this challenging
dataset, our model reaches an accuracy of 96.1%, significantly higher than
prior models that almost perfectly solve the task on a random, in-distribution
split.
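To make the bottom-up, CKY-style idea concrete, below is a minimal sketch (plain Python, not the authors' code) that runs an exhaustive CKY chart over a toy arithmetic question. In the actual model, each span carries a learned vector representation together with a grounded denotation, and the choice among splits is soft and trained end-to-end from the final answer only; here denotations are plain numbers and partial functions, every candidate parse is kept, and all names are hypothetical. The point is to show how several latent trees can cover the same question and why answer-only supervision suffices to prefer one of them.

```python
# A minimal sketch of CKY-style bottom-up span composition on a toy arithmetic
# question. NOT the paper's implementation: real spans carry learned vector
# representations and grounded denotations, and split choices are soft and
# trained end-to-end from the answer alone. Names here are illustrative only.

from typing import Dict, List, Tuple

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

# A chart-cell candidate: (kind, payload, bracketed_tree)
#   kind "num"     -> payload is a float
#   kind "op"      -> payload is the operator symbol
#   kind "partial" -> payload is a 1-argument function, e.g. "+ 3" == lambda x: x + 3
Cand = Tuple[str, object, str]

def leaf(tok: str) -> Cand:
    """Denotation of a length-1 span: an operator token or a number."""
    if tok in OPS:
        return ("op", tok, tok)
    return ("num", float(tok), tok)

def compose(left: Cand, right: Cand) -> List[Cand]:
    """Binary composition of two adjacent spans (a discrete stand-in for the
    paper's learned composition of span representations/denotations)."""
    lk, lv, lt = left
    rk, rv, rt = right
    tree = f"({lt} {rt})"
    if lk == "op" and rk == "num":       # "+" , "3"  ->  partial function "+3"
        return [("partial", (lambda x, o=lv, v=rv: OPS[o](x, v)), tree)]
    if lk == "num" and rk == "partial":  # "2" , "+3" ->  5.0
        return [("num", rv(lv), tree)]
    return []                            # spans that do not compose

def cky(tokens: List[str]) -> Dict[Tuple[int, int], List[Cand]]:
    """Fill the chart bottom-up: all candidates for every span [i, j)."""
    n = len(tokens)
    chart: Dict[Tuple[int, int], List[Cand]] = {}
    for i, tok in enumerate(tokens):
        chart[(i, i + 1)] = [leaf(tok)]
    for length in range(2, n + 1):           # increasing span length
        for i in range(0, n - length + 1):
            j = i + length
            cands: List[Cand] = []
            for k in range(i + 1, j):         # every split point
                for lc in chart[(i, k)]:
                    for rc in chart[(k, j)]:
                        cands.extend(compose(lc, rc))
            chart[(i, j)] = cands
    return chart

if __name__ == "__main__":
    tokens = "2 + 3 * 4".split()
    for kind, value, tree in cky(tokens)[(0, len(tokens))]:
        print(tree, "->", value)
```

Running the sketch on "2 + 3 * 4" yields two full-span candidates, (2 (+ (3 (* 4)))) -> 14.0 and ((2 (+ 3)) (* 4)) -> 20.0: the tree structure is latent and changes the denotation, so supervision from the final answer alone is enough to push a learned parser toward the tree whose denotation matches the gold answer.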
Related papers
- Towards an Understanding of Stepwise Inference in Transformers: A
Synthetic Graph Navigation Model [19.826983068662106]
We propose to study autoregressive Transformer models on a synthetic task that embodies the multi-step nature of problems where stepwise inference is generally most useful.
Despite its simplicity, we find that we can empirically reproduce and analyze several phenomena observed at scale.
arXiv Detail & Related papers (2024-02-12T16:25:47Z) - Detection-based Intermediate Supervision for Visual Question Answering [13.96848991623376]
We propose a generative detection framework to facilitate multiple grounding supervisions via sequence generation.
Our proposed DIS offers more comprehensive and accurate intermediate supervisions, thereby boosting answer prediction performance.
Extensive experiments demonstrate the superiority of our proposed DIS, showcasing both improved accuracy and state-of-the-art reasoning consistency.
arXiv Detail & Related papers (2023-12-26T11:45:22Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Grounded Graph Decoding Improves Compositional Generalization in
Question Answering [68.72605660152101]
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures.
We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism.
Our model significantly outperforms state-of-the-art baselines on the Compositional Freebase Questions (CFQ) dataset, a challenging benchmark for compositional generalization in question answering.
arXiv Detail & Related papers (2021-11-05T17:50:14Z) - A cautionary tale on fitting decision trees to data from additive
models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z) - Discrete Reasoning Templates for Natural Language Understanding [79.07883990966077]
We present an approach that reasons about complex questions by decomposing them to simpler subquestions.
We derive the final answer according to instructions in a predefined reasoning template.
We show that our approach is competitive with the state-of-the-art while being interpretable and requiring little supervision.
arXiv Detail & Related papers (2021-04-05T18:56:56Z) - Paired Examples as Indirect Supervision in Latent Decision Models [109.76417071249945]
We introduce a way to leverage paired examples that provide stronger cues for learning latent decisions.
We apply our method to improve compositional question answering using neural module networks on the DROP dataset.
arXiv Detail & Related papers (2021-04-05T03:58:30Z) - Extrapolatable Relational Reasoning With Comparators in Low-Dimensional
Manifolds [7.769102711230249]
We propose a neuroscience-inspired module with a built-in inductive bias that can be readily amalgamated with current neural network architectures.
We show that neural nets with this inductive bias achieve considerably better o.o.d generalisation performance for a range of relational reasoning tasks.
arXiv Detail & Related papers (2020-06-15T19:09:13Z) - Robust Question Answering Through Sub-part Alignment [53.94003466761305]
We model question answering as an alignment problem.
We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets.
arXiv Detail & Related papers (2020-04-30T09:10:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.