Unsupervised Task Graph Generation from Instructional Video Transcripts
- URL: http://arxiv.org/abs/2302.09173v2
- Date: Tue, 2 May 2023 19:46:14 GMT
- Title: Unsupervised Task Graph Generation from Instructional Video Transcripts
- Authors: Lajanugen Logeswaran, Sungryull Sohn, Yunseok Jang, Moontae Lee, Honglak Lee
- Abstract summary: We consider a setting where text transcripts of instructional videos performing a real-world activity are provided.
The goal is to identify the key steps relevant to the task as well as the dependency relationship between these key steps.
We propose a novel task graph generation approach that combines the reasoning capabilities of instruction-tuned language models along with clustering and ranking components.
- Score: 53.54435048879365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work explores the problem of generating task graphs of real-world
activities. Different from prior formulations, we consider a setting where text
transcripts of instructional videos performing a real-world activity (e.g.,
making coffee) are provided and the goal is to identify the key steps relevant
to the task as well as the dependency relationship between these key steps. We
propose a novel task graph generation approach that combines the reasoning
capabilities of instruction-tuned language models along with clustering and
ranking components to generate accurate task graphs in a completely
unsupervised manner. We show that the proposed approach generates more accurate
task graphs compared to a supervised learning approach on tasks from the ProceL
and CrossTask datasets.
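To make the notion of a "dependency relationship between key steps" concrete, here is a minimal sketch. It is not the authors' method, which the abstract describes only as combining instruction-tuned language models with clustering and ranking components. The sketch assumes the key steps have already been extracted from each transcript and normalized to shared names. The function `precedence_edges`, the `min_support` parameter, and the sample data are all hypothetical, introduced for illustration only. An edge (a, b) is kept when step a occurs before step b in a sufficient fraction of the transcripts that contain both.

```python
from collections import defaultdict
from itertools import combinations


def precedence_edges(step_sequences, min_support=1.0):
    """Toy heuristic (an assumption, not the paper's algorithm).

    Returns the set of edges (a, b) such that step `a` first occurs before
    step `b` in at least `min_support` of the sequences containing both.
    """
    before = defaultdict(int)    # (a, b) -> sequences where a precedes b
    together = defaultdict(int)  # sorted pair -> sequences containing both

    for seq in step_sequences:
        # Record the first position of each step in this sequence.
        first_pos = {}
        for i, step in enumerate(seq):
            first_pos.setdefault(step, i)
        for a, b in combinations(sorted(first_pos), 2):
            together[(a, b)] += 1
            if first_pos[a] < first_pos[b]:
                before[(a, b)] += 1
            else:
                before[(b, a)] += 1

    edges = set()
    for (a, b), n in together.items():
        if before[(a, b)] / n >= min_support:
            edges.add((a, b))
        if before[(b, a)] / n >= min_support:
            edges.add((b, a))
    return edges


# Hypothetical, hand-written step sequences for a "making coffee" task.
transcripts = [
    ["boil water", "grind beans", "pour water"],
    ["grind beans", "boil water", "pour water"],
    ["boil water", "pour water"],
]
edges = precedence_edges(transcripts)
```

With this sample data, "boil water" and "grind beans" appear in both orders, so no edge is kept between them, while both consistently precede "pour water". The result may contain cycles on inconsistent data, so a real pipeline would need further checks.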
Related papers
- Replay-and-Forget-Free Graph Class-Incremental Learning: A Task Profiling and Prompting Approach [28.194940062243003]
Class-incremental learning (CIL) aims to continually learn a sequence of tasks, with each task consisting of a set of unique classes.
The key characteristic of CIL lies in the absence of task identifiers (IDs) during inference.
We show theoretically that accurate task ID prediction on graph data can be achieved by a Laplacian smoothing-based graph task profiling approach.
arXiv Detail & Related papers (2024-10-14T09:54:20Z)
- Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos [13.99137623722021]
Procedural activities are sequences of key-steps aimed at achieving specific goals.
Task graphs have emerged as a human-understandable representation of procedural activities.
arXiv Detail & Related papers (2024-06-03T16:11:39Z)
- Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs).
In this paper, we explore graph learning-based methods for task planning, a direction that is orthogonal to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z)
- Exploring Correlations of Self-Supervised Tasks for Graphs [6.977921096191354]
This paper aims to provide a fresh understanding of graph self-supervised learning based on task correlations.
We evaluate the performance of the representations trained by one specific task on other tasks and define correlation values to quantify task correlations.
We propose Graph Task Correlation Modeling (GraphTCM) to illustrate the task correlations and utilize it to enhance graph self-supervised training.
arXiv Detail & Related papers (2024-05-07T12:02:23Z)
- GIMLET: A Unified Graph-Text Model for Instruction-Based Molecule Zero-Shot Learning [71.89623260998934]
This study investigates the feasibility of employing natural language instructions to accomplish molecule-related tasks in a zero-shot setting.
Existing molecule-text models perform poorly in this setting due to inadequate treatment of instructions and limited capacity for graphs.
We propose GIMLET, which unifies language models for both graph and text data.
arXiv Detail & Related papers (2023-05-28T18:27:59Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Sequential Manipulation Planning on Scene Graph [90.28117916077073]
We devise a 3D scene graph representation, contact graph+ (cg+), for efficient sequential task planning.
Goal configurations, naturally specified on contact graphs, can be produced by a genetic algorithm with an optimization method.
A task plan is then succinctly initialized by computing the Graph Editing Distance (GED) between the initial contact graphs and the goal configurations, which generates graph edit operations corresponding to possible robot actions.
arXiv Detail & Related papers (2022-07-10T02:01:33Z)
- Fast Inference and Transfer of Compositional Task Structures for Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z)
- FAITH: Few-Shot Graph Classification with Hierarchical Task Graphs [39.576675425158754]
Few-shot graph classification aims at predicting classes for graphs, given limited labeled graphs for each class.
We propose a novel few-shot learning framework FAITH that captures task correlations via constructing a hierarchical task graph.
Experiments on four prevalent few-shot graph classification datasets demonstrate the superiority of FAITH over other state-of-the-art baselines.
arXiv Detail & Related papers (2022-05-05T04:28:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.