An Empirical Investigation of Contextualized Number Prediction
- URL: http://arxiv.org/abs/2011.07961v1
- Date: Tue, 20 Oct 2020 23:12:23 GMT
- Title: An Empirical Investigation of Contextualized Number Prediction
- Authors: Daniel Spokoyny, Taylor Berg-Kirkpatrick
- Abstract summary: We consider two tasks: (1) masked number prediction, predicting a missing numerical value within a sentence, and (2) numerical anomaly detection, detecting an errorful numeric value within a sentence.
We introduce a suite of output distribution parameterizations that incorporate latent variables to add expressivity and better fit the natural distribution of numeric values in running text.
We evaluate these models on two numeric datasets in the financial and scientific domains.
- Score: 34.56914472173953
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We conduct a large-scale empirical investigation of contextualized
number prediction in running text. Specifically, we consider two tasks: (1)
masked number prediction, predicting a missing numerical value within a
sentence, and (2) numerical anomaly detection, detecting an errorful numeric
value within a sentence. We experiment with novel combinations of contextual
encoders and output distributions over the real number line. In particular,
we introduce a suite of output distribution parameterizations that incorporate
latent variables to add expressivity and better fit the natural distribution
of numeric values in running text, and combine them with both recurrent and
transformer-based encoder architectures. We evaluate these models on two
numeric datasets in the financial and scientific domains. Our findings show
that output distributions that incorporate discrete latent variables and allow
for multiple modes outperform simple flow-based counterparts on all datasets,
yielding more accurate numerical prediction and anomaly detection. We also
show that our models effectively utilize textual context and benefit from
general-purpose unsupervised pretraining.
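To make the central modeling idea concrete, here is a minimal sketch, assuming PyTorch, of a contextual encoding feeding a mixture-density output head in which a discrete latent variable (the mixture component) yields a multi-modal density over the real line. The class name and hyperparameters are illustrative, not the authors' code.

```python
# Minimal sketch (illustrative, not the authors' implementation): a
# mixture-of-Gaussians output head over the real line, conditioned on a
# contextual encoding of the sentence with the number masked. The
# discrete latent variable is the mixture-component index.
import torch
import torch.nn as nn
import torch.distributions as D

class MixtureNumberHead(nn.Module):
    def __init__(self, hidden_dim: int, n_components: int = 10):
        super().__init__()
        # One projection yields mixture logits, means, and log-scales.
        self.proj = nn.Linear(hidden_dim, 3 * n_components)

    def forward(self, h: torch.Tensor) -> D.MixtureSameFamily:
        # h: (batch, hidden_dim) encoding at the masked-number position.
        logits, means, log_scales = self.proj(h).chunk(3, dim=-1)
        mixture = D.Categorical(logits=logits)          # discrete latent z
        components = D.Normal(means, log_scales.exp())  # p(y | z)
        return D.MixtureSameFamily(mixture, components)

# Training minimizes negative log-likelihood of the observed value
# (in practice one would likely model log-magnitudes):
head = MixtureNumberHead(hidden_dim=768)
h = torch.randn(4, 768)                      # stand-in for encoder output
y = torch.log(torch.tensor([3.2, 100.0, 0.5, 7.0]))
nll = -head(h).log_prob(y).mean()
```

The same density supports both tasks: masked number prediction reads off a mode or expectation of p(y), while anomaly detection can threshold log p(y) at the observed value.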
Related papers
- Semiparametric conformal prediction [79.6147286161434]
Risk-sensitive applications require well-calibrated prediction sets over multiple, potentially correlated target variables.
We treat the conformity scores as random vectors and construct the prediction set accounting for their joint correlation structure.
We report the desired coverage and competitive efficiency on a range of real-world regression problems.
arXiv Detail & Related papers (2024-11-04T14:29:02Z)
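For readers unfamiliar with the baseline that the entry above generalizes, here is a minimal sketch, assuming NumPy, of standard split conformal prediction for a single scalar target; the paper's contribution is the harder multivariate case, where scores are treated as correlated random vectors. Function and argument names are illustrative.

```python
# Minimal sketch of split conformal prediction for one scalar target;
# the paper above generalizes this to vectors of correlated scores.
import numpy as np

def split_conformal_interval(cal_residuals, test_preds, alpha=0.1):
    # cal_residuals: |y - y_hat| on a held-out calibration set.
    n = len(cal_residuals)
    # Finite-sample-corrected quantile of the calibration scores.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(cal_residuals, level, method="higher")
    # Symmetric interval with marginal (1 - alpha) coverage.
    return test_preds - q, test_preds + q
```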
- Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader model suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z)
- xVal: A Continuous Number Encoding for Large Language Models [42.19323262199993]
We propose xVal, a numerical encoding scheme that represents any real number using just a single token.
We empirically evaluate our proposal on a number of synthetic and real-world datasets.
arXiv Detail & Related papers (2023-10-04T17:26:16Z)
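Since the xVal entry above describes a concrete encoding, a minimal sketch, assuming PyTorch, of the core idea may help: every number is replaced by a single [NUM] token whose embedding is multiplicatively scaled by the (normalized) numeric value. Class and argument names are illustrative.

```python
# Minimal sketch of the xVal idea: each number in the text becomes one
# [NUM] token, and that token's embedding is multiplied by the numeric
# value, so magnitude lives in the embedding, not the vocabulary.
# Illustrative only; see the paper for the full scheme.
import torch
import torch.nn as nn

class XValEmbedding(nn.Module):
    def __init__(self, vocab_size: int, dim: int, num_token_id: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.num_token_id = num_token_id

    def forward(self, token_ids: torch.Tensor, values: torch.Tensor):
        # token_ids: (batch, seq); values: (batch, seq), set to 1.0 at
        # non-number positions and to the normalized value at [NUM] ones.
        emb = self.embed(token_ids)
        scale = torch.where(token_ids == self.num_token_id,
                            values, torch.ones_like(values))
        return emb * scale.unsqueeze(-1)
```

Because the value enters continuously, nearby numbers receive nearby representations, unlike digit- or subword-based encodings.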
- Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input.
We first explore the summarization models' robustness against perturbations, including word-level synonym substitution and noise.
We propose SummAttacker, an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z)
- Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality [84.94877848357896]
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models.
We analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias and the tendency to memorize whole examples.
We show substantial empirical improvements using standard sequence-to-sequence models on two widely-used compositionality datasets.
arXiv Detail & Related papers (2022-11-28T17:36:41Z)
- Two-stage Modeling for Prediction with Confidence [0.0]
It is difficult to guarantee the performance of neural networks under distribution shift.
We propose a novel two-stage model for the potential distribution shift problem.
We show that our model offers reliable predictions for the vast majority of datasets.
arXiv Detail & Related papers (2022-09-19T08:48:07Z)
- Bayesian Topic Regression for Causal Inference [3.9082355007261427]
Causal inference using observational text data is becoming increasingly popular in many research areas.
This paper presents the Bayesian Topic Regression model that uses both text and numerical information to model an outcome variable.
arXiv Detail & Related papers (2021-09-11T16:40:43Z)
- Significance tests of feature relevance for a blackbox learner [6.72450543613463]
We derive two consistent tests for the feature relevance of a blackbox learner.
The first evaluates a loss difference with perturbation on an inference sample.
The second splits the inference sample into two but does not require data perturbation.
arXiv Detail & Related papers (2021-03-02T00:59:19Z)
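To make the first test's "loss difference with perturbation" concrete, here is a minimal sketch, assuming NumPy and a scikit-learn-style model, of the underlying quantity; the paper's contribution is a consistent significance test built on top of such differences, which this sketch does not implement.

```python
# Minimal sketch of the quantity behind the first test: the increase in
# held-out loss after perturbing one feature of a black-box model.
# Names here are illustrative.
import numpy as np

def perturbation_loss_difference(model, X, y, j, loss, seed=0):
    rng = np.random.default_rng(seed)
    base = loss(y, model.predict(X))        # unperturbed held-out loss
    Xp = X.copy()
    # Shuffle feature j across rows: breaks its link to y while
    # preserving its marginal distribution.
    Xp[:, j] = rng.permutation(Xp[:, j])
    return loss(y, model.predict(Xp)) - base   # > 0 suggests relevance
```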
- Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models [110.82452096672182]
We propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data.
We also introduce a novel metric for ambiguous problems, which is better suited to account for uncertainties.
arXiv Detail & Related papers (2020-03-10T09:15:42Z)
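As background for the MHP entry above, here is a minimal sketch, assuming PyTorch, of the classic relaxed winner-takes-all objective that Multiple Hypothesis Prediction models train with; the paper extends this idea to sequential data, which this sketch does not cover. Epsilon and the head count are illustrative choices.

```python
# Minimal sketch of the Multiple Hypothesis Prediction objective: M heads
# each emit a prediction, and a relaxed winner-takes-all loss pushes the
# closest head toward the target while lightly updating the others.
import torch

def mhp_loss(preds: torch.Tensor, target: torch.Tensor, eps: float = 0.05):
    # preds: (batch, M, dim) hypotheses; target: (batch, dim).
    errs = ((preds - target.unsqueeze(1)) ** 2).mean(dim=-1)  # (batch, M)
    M = errs.size(1)
    # Soft assignment: the best hypothesis gets weight 1 - eps, the rest
    # share eps, so gradients reach every head (avoids dead hypotheses).
    weights = torch.full_like(errs, eps / (M - 1))
    best = errs.argmin(dim=1, keepdim=True)
    weights.scatter_(1, best, 1.0 - eps)
    return (weights * errs).sum(dim=1).mean()
```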
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity-inducing adversarial loss for learning latent variables and thereby obtain the diversity in output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)