Action Recognition and State Change Prediction in a Recipe Understanding
Task Using a Lightweight Neural Network Model
- URL: http://arxiv.org/abs/2001.08665v1
- Date: Thu, 23 Jan 2020 17:04:00 GMT
- Title: Action Recognition and State Change Prediction in a Recipe Understanding
Task Using a Lightweight Neural Network Model
- Authors: Qing Wan, Yoonsuck Choe
- Abstract summary: In this paper, we propose a simplified neural network model that separates action recognition and state change prediction.
This coupling allows the two tasks to indirectly influence each other during learning.
- Score: 8.49031088470346
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consider a natural language sentence describing a specific step in a food
recipe. In such instructions, recognizing actions (such as press, bake, etc.)
and the resulting changes in the state of the ingredients (shape molded,
custard cooked, temperature hot, etc.) is a challenging task. One way to cope
with this challenge is to explicitly model a simulator module that applies
actions to entities and predicts the resulting outcome (Bosselut et al. 2018).
However, such a model can be unnecessarily complex. In this paper, we propose a
simplified neural network model that separates action recognition and state
change prediction, while coupling the two through a novel loss function. This
coupling allows the two tasks to indirectly influence each other during
learning. Our model, although
simpler, achieves higher state change prediction performance (67% average
accuracy for ours vs. 55% in (Bosselut et al. 2018)) and takes fewer samples to
train (10K ours vs. 65K+ by (Bosselut et al. 2018)).
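The abstract's central idea, separate action and state-change heads coupled only through the loss, can be sketched in a minimal form. The head shapes, the coupling term, and the weight `lam` below are assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

# A shared sentence embedding feeds two independent linear heads.
dim, n_actions, n_states = 16, 5, 4
W_action = rng.normal(size=(dim, n_actions))
W_state = rng.normal(size=(dim, n_states))

x = rng.normal(size=dim)          # embedding of one recipe step
action_label, state_label = 2, 1  # gold action / state-change labels

p_action = softmax(x @ W_action)
p_state = softmax(x @ W_state)

# Assumed coupling term: penalize disagreement in head confidence so
# each task's gradient indirectly shapes the other head.
coupling = (p_action.max() - p_state.max()) ** 2

lam = 0.5  # coupling weight (hypothetical hyperparameter)
loss = (cross_entropy(p_action, action_label)
        + cross_entropy(p_state, state_label)
        + lam * coupling)
```

Because the heads share no parameters, minimizing the summed loss is the only channel through which the two predictions interact, which matches the paper's claim of indirect influence.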
Related papers
- Language models scale reliably with over-training and on downstream tasks [121.69867718185125]
Scaling laws are useful guides for derisking expensive training runs.
However, there remain gaps between current studies and how language models are trained.
In contrast, scaling laws mostly predict loss, while models are usually compared on downstream task performance.
arXiv Detail & Related papers (2024-03-13T13:54:00Z)
- Could Giant Pretrained Image Models Extract Universal Representations? [94.97056702288317]
We present a study of frozen pretrained models when applied to diverse and representative computer vision tasks.
Our work answers the questions of what pretraining task fits best with this frozen setting, how to make the frozen setting more flexible to various downstream tasks, and the effect of larger model sizes.
arXiv Detail & Related papers (2022-11-03T17:57:10Z)
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z)
- Contextual Dropout: An Efficient Sample-Dependent Dropout Module [60.63525456640462]
Dropout has been demonstrated as a simple and effective module to regularize the training process of deep neural networks.
We propose contextual dropout with an efficient structural design as a simple and scalable sample-dependent dropout module.
Our experimental results show that the proposed method outperforms baseline methods in terms of both accuracy and quality of uncertainty estimation.
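Sample-dependent dropout, as summarized above, can be illustrated with a minimal sketch: the keep probability of each unit is predicted from the input itself rather than fixed globally. The tiny linear predictor `w_drop` is an assumption for illustration, not the paper's actual module.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
dim = 8

x = rng.normal(size=dim)              # one input sample
w_drop = rng.normal(size=(dim, dim))  # hypothetical keep-probability predictor

keep_prob = sigmoid(x @ w_drop)       # per-unit keep probability, depends on x
mask = rng.random(dim) < keep_prob    # Bernoulli sample per unit
y = np.where(mask, x / keep_prob, 0.0)  # inverted-dropout rescaling
```

Standard dropout uses one fixed keep probability for all samples; here a different `keep_prob` vector is produced for every input, which is the "sample-dependent" property the summary refers to.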
arXiv Detail & Related papers (2021-03-06T19:30:32Z)
- Cooking Object's State Identification Without Using Pretrained Model [0.0]
In this paper, we propose a CNN and train it from scratch.
The model is trained and tested on the dataset from the cooking state recognition challenge.
Our model achieves 65.8% accuracy on the unseen test dataset.
arXiv Detail & Related papers (2021-03-03T10:33:27Z)
- On the Reproducibility of Neural Network Predictions [52.47827424679645]
We study the problem of churn, identify factors that cause it, and propose two simple means of mitigating it.
We first demonstrate that churn is indeed an issue, even for standard image classification tasks.
We propose using minimum entropy regularizers to increase prediction confidences.
We present empirical results showing the effectiveness of both techniques in reducing churn while improving the accuracy of the underlying model.
arXiv Detail & Related papers (2021-02-05T18:51:01Z)
- How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking [70.92463223410225]
DiffMask learns to mask-out subsets of the input while maintaining differentiability.
The decision to include or disregard an input token is made with a simple model based on intermediate hidden layers.
This lets us not only plot attribution heatmaps but also analyze how decisions are formed across network layers.
arXiv Detail & Related papers (2020-04-30T17:36:14Z)
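The differentiable-masking idea in the last entry can be sketched as a soft per-token gate computed from hidden states, so gradients flow through the keep/drop decision. The plain sigmoid-with-temperature gate below is a deliberate simplification of the paper's stochastic gates, and all names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_tokens, dim = 6, 8

hidden = rng.normal(size=(n_tokens, dim))  # intermediate hidden states
w = rng.normal(size=dim)                   # simple gate predictor (assumed)

temperature = 0.5
gates = sigmoid(hidden @ w / temperature)  # one soft keep/drop value per token

embeddings = rng.normal(size=(n_tokens, dim))
masked = gates[:, None] * embeddings       # softly masked input
```

Because every step is differentiable, the gate values can be visualized directly as attribution heatmaps, and computing them from different intermediate layers shows where in the network each decision forms.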
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.