Related papers: OpenPI-C: A Better Benchmark and Stronger Baseline for Open-Vocabulary State Tracking

URL: http://arxiv.org/abs/2306.00887v2
Date: Tue, 20 Jun 2023 19:47:20 GMT
Title: OpenPI-C: A Better Benchmark and Stronger Baseline for Open-Vocabulary State Tracking
Authors: Xueqing Wu, Sha Li, Heng Ji
Abstract summary: OpenPI is the only dataset annotated for open-vocabulary state tracking. We categorize 3 types of problems on the procedure level, step level and state change level respectively. For the evaluation metric, we propose a cluster-based metric to fix the original metric's preference for repetition.
Score: 55.62705574507595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Open-vocabulary state tracking is a more practical version of state tracking that aims to track state changes of entities throughout a process without restricting the state space and entity space. OpenPI is to date the only dataset annotated for open-vocabulary state tracking. However, we identify issues with the dataset quality and evaluation metric. For the dataset, we categorize 3 types of problems on the procedure level, step level and state change level respectively, and build a clean dataset OpenPI-C using multiple rounds of human judgment. For the evaluation metric, we propose a cluster-based metric to fix the original metric's preference for repetition. Model-wise, we enhance the seq2seq generation baseline by reinstating two key properties for state tracking: temporal dependency and entity awareness. The state of the world after an action is inherently dependent on the previous state. We model this dependency through a dynamic memory bank and allow the model to attend to the memory slots during decoding. On the other hand, the state of the world is naturally a union of the states of involved entities. Since the entities are unknown in the open-vocabulary setting, we propose a two-stage model that refines the state change prediction conditioned on entities predicted from the first stage. Empirical results show the effectiveness of our proposed model especially on the cluster-based metric. The code and data are released at https://github.com/shirley-wu/openpi-c
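
To illustrate why a cluster-based metric can remove a preference for repetition, here is a minimal sketch in Python. It is not the metric defined in the paper, whose exact formulation is not given in this abstract. The token-level Jaccard similarity, the 0.6 threshold, the greedy clustering, and the names `jaccard`, `cluster`, and `cluster_f1` are all assumptions made for illustration. The idea shown is only that near-duplicate predictions are grouped first and each group is scored once.

```python
def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity between two strings (assumed similarity measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def cluster(items, threshold=0.6):
    """Greedy clustering: each item joins the first cluster whose
    representative (first member) is similar enough, else starts a new one."""
    clusters = []
    for item in items:
        for members in clusters:
            if jaccard(item, members[0]) >= threshold:
                members.append(item)
                break
        else:
            clusters.append([item])
    return clusters


def cluster_f1(predictions, references, threshold=0.6):
    """F1 computed over clusters, so repeated predictions count only once."""
    pred_clusters = cluster(predictions, threshold)
    ref_clusters = cluster(references, threshold)

    def hits(source, target):
        # Number of source clusters with at least one similar item in target.
        return sum(
            any(jaccard(s, t) >= threshold for s in sc for tc in target for t in tc)
            for sc in source
        )

    precision = hits(pred_clusters, ref_clusters) / len(pred_clusters) if pred_clusters else 0.0
    recall = hits(ref_clusters, pred_clusters) / len(ref_clusters) if ref_clusters else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical state changes written in an "attribute of entity was X before and Y afterwards" style.
ref1 = "location of knife was in drawer before and on counter afterwards"
ref2 = "cleanness of pan was clean before and dirty afterwards"

print(round(cluster_f1([ref1], [ref1, ref2]), 3))              # 0.667
print(round(cluster_f1([ref1, ref1, ref1], [ref1, ref2]), 3))  # 0.667
```

In this toy setup, repeating a correct prediction three times scores the same as giving it once, because all three copies fall into a single cluster. A metric that scores every generated string independently would not have this property.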

Related papers

OpenworldAUC: Towards Unified Evaluation and Optimization for Open-world Prompt Tuning [86.20909814421748]
Real-world scenarios require models to handle inputs without prior domain knowledge. We propose OpenworldAUC, a metric that assesses detection and classification through pairwise instance comparisons. Experiments on 15 benchmarks in open-world scenarios show OpenworldAUC achieves SOTA performance on OpenworldAUC and other metrics.
arXiv Detail & Related papers (2025-05-08T12:31:40Z)
Multistep Inverse Is Not All You Need [87.62730694973696]
In real-world control settings, the observation space is often unnecessarily high-dimensional and subject to time-correlated noise. It is therefore desirable to learn an encoder to map the observation space to a simpler space of control-relevant variables. We propose a new algorithm, ACDF, which combines multistep-inverse prediction with a latent forward model.
arXiv Detail & Related papers (2024-03-18T16:36:01Z)
OpenPI2.0: An Improved Dataset for Entity Tracking in Texts [36.84433853139042]
An earlier dataset, OpenPI, provided crowdsourced annotations of entity state changes in text. We present an improved dataset, OpenPI2.0, where entities and attributes are fully canonicalized and additional entity salience annotations are added. We show that using state changes of salient entities as a chain-of-thought prompt, downstream performance is improved on tasks such as question answering and classical planning.
arXiv Detail & Related papers (2023-05-24T00:57:35Z)
Going beyond research datasets: Novel intent discovery in the industry setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform. We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision. We also devise the best method to utilize the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks, which we call Conv.
arXiv Detail & Related papers (2023-05-09T14:21:29Z)
Understand the Dynamic World: An End-to-End Knowledge Informed Framework for Open Domain Entity State Tracking [15.421012879083463]
Open domain entity state tracking aims to predict reasonable state changes of entities (i.e., [attribute] of [entity] was [before_state] and [after_state] afterwards) given the action descriptions. It's challenging as the model needs to predict an arbitrary number of entity state changes caused by the action while most of the entities are implicitly relevant to the actions and their attributes as well as states are from open vocabularies. We propose a novel end-to-end Knowledge Informed framework for open domain Entity State Tracking, namely KIEST, which explicitly retrieves the relevant entities and attributes from
arXiv Detail & Related papers (2023-04-26T22:45:30Z)
Coalescing Global and Local Information for Procedural Text Understanding [70.10291759879887]
A complete procedural understanding solution should combine three core aspects: local and global views of the inputs, and global view of outputs. In this paper, we propose Coalescing Global and Local Information (CGLI), a new model that builds entity and time representations. Experiments on a popular procedural text understanding dataset show that our model achieves state-of-the-art results.
arXiv Detail & Related papers (2022-08-26T19:16:32Z)
Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. Instead of aligning this imagined state with a real state returned by the environment, VCR applies a $Q$-value head on both states and obtains two distributions of action values. It has been demonstrated that our methods achieve new state-of-the-art performance for search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
Topological Experience Replay [22.84244156916668]
Deep Q-learning methods update Q-values using state transitions sampled from the experience replay buffer. We organize the agent's experience into a graph that explicitly tracks the dependency between Q-values of states. We empirically show that our method is substantially more data-efficient than several baselines on a diverse range of goal-reaching tasks.
arXiv Detail & Related papers (2022-03-29T18:28:20Z)
State estimation with limited sensors -- A deep learning based approach [0.0]
We propose a novel deep learning based state estimation framework that learns from sequential data. We illustrate that utilizing sequential data allows for state recovery from only one or two sensors.
arXiv Detail & Related papers (2021-01-27T16:14:59Z)
A New Bandit Setting Balancing Information from State Evolution and Corrupted Context [52.67844649650687]
We propose a new sequential decision-making setting combining key aspects of two established online learning problems with bandit feedback. The optimal action to play at any given moment is contingent on an underlying changing state which is not directly observable by the agent. We present an algorithm that uses a referee to dynamically combine the policies of a contextual bandit and a multi-armed bandit.
arXiv Detail & Related papers (2020-11-16T14:35:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.