Paired Examples as Indirect Supervision in Latent Decision Models
- URL: http://arxiv.org/abs/2104.01759v1
- Date: Mon, 5 Apr 2021 03:58:30 GMT
- Title: Paired Examples as Indirect Supervision in Latent Decision Models
- Authors: Nitish Gupta, Sameer Singh, Matt Gardner, Dan Roth
- Abstract summary: We introduce a way to leverage paired examples that provide stronger cues for learning latent decisions.
We apply our method to improve compositional question answering using neural module networks on the DROP dataset.
- Score: 109.76417071249945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compositional, structured models are appealing because they explicitly
decompose problems and provide interpretable intermediate outputs that give
confidence that the model is not simply latching onto data artifacts. Learning
these models is challenging, however, because end-task supervision only
provides a weak indirect signal on what values the latent decisions should
take. This often results in the model failing to learn to perform the
intermediate tasks correctly. In this work, we introduce a way to leverage
paired examples that provide stronger cues for learning latent decisions. When
two related training examples share internal substructure, we add an additional
training objective to encourage consistency between their latent decisions.
Such an objective does not require external supervision for the values of the
latent output, or even the end task, yet provides an additional training signal
to that provided by individual training examples themselves. We apply our
method to improve compositional question answering using neural module networks
on the DROP dataset. We explore three ways to acquire paired questions in DROP:
(a) discovering naturally occurring paired examples within the dataset, (b)
constructing paired examples using templates, and (c) generating paired
examples using a question generation model. We empirically demonstrate that our
proposed approach improves both in- and out-of-distribution generalization and
leads to correct latent decision predictions.
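A minimal sketch of the kind of consistency objective the abstract describes, assuming each latent decision is a discrete probability distribution (for example, a module's attention over passage tokens). The abstract does not give the form of the objective, so the symmetric KL divergence, the function names, and the `weight` parameter below are illustrative assumptions, not the authors' formulation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same support."""
    # eps guards against log(0) when q assigns zero mass where p does not.
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def consistency_loss(latent_a, latent_b):
    """Symmetric KL between the latent decisions of two paired examples.

    Assumption: the shared substructure of both examples yields distributions
    over the same support, so they can be compared position by position.
    """
    return 0.5 * (
        kl_divergence(latent_a, latent_b) + kl_divergence(latent_b, latent_a)
    )


def total_loss(task_loss_a, task_loss_b, latent_a, latent_b, weight=1.0):
    """End-task losses of both examples plus the weighted consistency term."""
    return task_loss_a + task_loss_b + weight * consistency_loss(latent_a, latent_b)


# Hypothetical latent decisions for a pair of questions that share a sub-step.
agree = total_loss(0.7, 0.4, [0.8, 0.1, 0.1], [0.8, 0.1, 0.1])
disagree = total_loss(0.7, 0.4, [0.8, 0.1, 0.1], [0.1, 0.1, 0.8])
print(agree < disagree)
```

The consistency term needs no labels for the latent values themselves: it is zero when the paired examples agree on the shared decision and grows as they diverge, which is the sense in which pairing supplies a signal beyond the individual end-task losses.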
Related papers
- Ensemble Modeling for Multimodal Visual Action Recognition [50.38638300332429]
We propose an ensemble modeling approach for multimodal action recognition.
We independently train individual modality models using a variant of focal loss tailored to handle the long-tailed distribution of the MECCANO [21] dataset.
arXiv Detail & Related papers (2023-08-10T08:43:20Z)
- Consistent Explanations in the Face of Model Indeterminacy via Ensembling [12.661530681518899]
This work addresses the challenge of providing consistent explanations for predictive models in the presence of model indeterminacy.
We introduce ensemble methods to enhance the consistency of the explanations provided in these scenarios.
Our findings highlight the importance of considering model indeterminacy when interpreting explanations.
arXiv Detail & Related papers (2023-06-09T18:45:43Z)
- Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring a scale of models' reliance on any identified spurious feature.
We assess robustness to a large set of known and newly found prediction biases for various pre-trained models and debiasing methods in Question Answering (QA).
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experimental results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- Influence Tuning: Demoting Spurious Correlations via Instance Attribution and Instance-Driven Updates [26.527311287924995]
We show that in a controlled setup, influence tuning can help deconfound the model from spurious patterns in data.
arXiv Detail & Related papers (2021-10-07T06:59:46Z)
- Explaining and Improving Model Behavior with k Nearest Neighbor Representations [107.24850861390196]
We propose using k nearest neighbor representations to identify training examples responsible for a model's predictions.
We show that kNN representations are effective at uncovering learned spurious associations.
Our results indicate that the kNN approach makes the finetuned model more robust to adversarial inputs.
arXiv Detail & Related papers (2020-10-18T16:55:25Z)
- Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance [70.31427277842239]
We introduce a novel debiasing method called confidence regularization.
It discourages models from exploiting biases while enabling them to receive enough incentive to learn from all the training examples.
We evaluate our method on three NLU tasks and show that, in contrast to its predecessors, it improves the performance on out-of-distribution datasets.
arXiv Detail & Related papers (2020-05-01T11:22:55Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.