Related papers: Game of LLMs: Discovering Structural Constructs in Activities using Large Language Models

URL: http://arxiv.org/abs/2406.13777v1
Date: Wed, 19 Jun 2024 19:02:44 GMT
Title: Game of LLMs: Discovering Structural Constructs in Activities using Large Language Models
Authors: Shruthi K. Hiremath, Thomas Ploetz
Abstract summary: We focus on identifying underlying building blocks--structural constructs--with the use of large language models. We propose the development of an activity recognition procedure that uses these building blocks to model activities.
Score: 0.11029371407785957
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Human Activity Recognition is a time-series analysis problem. A popular analysis procedure used by the community assumes an optimal window length to design recognition pipelines. However, in the scenario of smart homes, where activities are of varying duration and frequency, the assumption of a constant-sized window does not hold. Additionally, previous works have shown these activities to be made up of building blocks. We focus on identifying these underlying building blocks--structural constructs--with the use of large language models. Identifying these constructs can be especially beneficial in recognizing short-duration and infrequent activities. We also propose the development of an activity recognition procedure that uses these building blocks to model activities, thus helping the downstream task of activity monitoring in smart homes.
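
The window-length issue raised in the abstract can be illustrated with a small sketch. It is not the authors' pipeline: the toy label stream, the activity names, the window and step sizes, and the majority-vote labelling rule are all assumptions chosen for illustration. It shows how a short activity can vanish when every window has the same fixed length.

```python
from collections import Counter


def fixed_windows(labels, window, step):
    """Split a per-timestep label sequence into fixed-length windows."""
    return [labels[i:i + window]
            for i in range(0, len(labels) - window + 1, step)]


def majority_label(window_labels):
    """Label a window by its most frequent per-timestep label."""
    return Counter(window_labels).most_common(1)[0][0]


# Hypothetical toy stream: two long activities around one brief activity.
stream = ["cook"] * 10 + ["take_medicine"] * 2 + ["sleep"] * 12

# Assumed settings: windows of 8 timesteps, advanced 4 timesteps at a time.
windows = fixed_windows(stream, window=8, step=4)
window_labels = [majority_label(w) for w in windows]

print(window_labels)
# The brief activity never wins the majority vote in any window.
print("take_medicine" in window_labels)
```

With these assumed settings, every window that overlaps the two-step activity is dominated by a longer neighbour, so the short activity receives no window label at all.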

Related papers

WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks [85.95607119635102]
Large language models (LLMs) can mimic human-like intelligence. WorkArena++ is designed to evaluate the planning, problem-solving, logical/arithmetic reasoning, retrieval, and contextual understanding abilities of web agents.
arXiv Detail & Related papers (2024-07-07T07:15:49Z)
Maintenance Required: Updating and Extending Bootstrapped Human Activity Recognition Systems for Smart Homes [0.11029371407785957]
Off-the-shelf HAR systems are effective in limited capacity for an individual home. Previous work has successfully targeted the initial phase. We build on bootstrapped HAR systems and introduce an effective updating and extension procedure.
arXiv Detail & Related papers (2024-06-20T16:08:40Z)
Latent Properties of Lifelong Learning Systems [59.50307752165016]
We introduce an algorithm-agnostic explainable surrogate-modeling approach to estimate latent properties of lifelong learning algorithms. We validate the approach for estimating these properties via experiments on synthetic data.
arXiv Detail & Related papers (2022-07-28T20:58:13Z)
Robust Object Detection via Instance-Level Temporal Cycle Confusion [89.1027433760578]
We study the effectiveness of auxiliary self-supervised tasks to improve the out-of-distribution generalization of object detectors. Inspired by the principle of maximum entropy, we introduce a novel self-supervised task, instance-level temporal cycle confusion (CycConf). For each object, the task is to find the most different object proposals in the adjacent frame in a video and then cycle back to itself for self-supervision.
arXiv Detail & Related papers (2021-04-16T21:35:08Z)
Spatiotemporal Deformable Models for Long-Term Complex Activity Detection [23.880673582575856]
Long-term complex activity recognition can be crucial for autonomous systems such as cars and surgical robots. Most current methods are designed to merely localise short-term action/activities or combinations of actions that only last for a few frames or seconds. Our framework consists of three main building blocks: (i) action detection, (ii) the modelling of the deformable geometry of parts, and (iii) a sparsity mechanism.
arXiv Detail & Related papers (2021-04-16T16:05:34Z)
A Tree-structure Convolutional Neural Network for Temporal Features Exaction on Sensor-based Multi-resident Activity Recognition [4.619245607612873]
We propose an end-to-end Tree-Structure Convolutional neural network based framework for Multi-Resident Activity Recognition (TSC-MRAR). First, we treat each sample as an event and obtain the current event embedding through the previous sensor readings in the sliding window. Then, in order to automatically generate the temporal features, a tree-structure network is designed to derive the temporal dependence of nearby readings.
arXiv Detail & Related papers (2020-11-05T14:31:00Z)
Learning to Abstract and Predict Human Actions [60.85905430007731]
We model the hierarchical structure of human activities in videos and demonstrate the power of such structure in action prediction. We propose Hierarchical-Refresher-Anticipator, a multi-level neural machine that can learn the structure of human activities by observing a partial hierarchy of events and roll out such structure into a future prediction at multiple levels of abstraction.
arXiv Detail & Related papers (2020-08-20T23:57:58Z)
Intra- and Inter-Action Understanding via Temporal Action Parsing [118.32912239230272]
We construct a new dataset developed on sport videos with manual annotations of sub-actions, and conduct a study on temporal action parsing on top of it. Our study shows that a sport activity usually consists of multiple sub-actions and that the awareness of such temporal structures is beneficial to action recognition. We also investigate a number of temporal parsing methods, and thereon devise an improved method that is capable of mining sub-actions from training data without knowing their labels.
arXiv Detail & Related papers (2020-05-20T17:45:18Z)
Enabling Edge Cloud Intelligence for Activity Learning in Smart Home [1.3858051019755284]
We propose a novel activity learning framework based on Edge Cloud architecture. We utilize temporal features for activity recognition and prediction in a single smart home setting.
arXiv Detail & Related papers (2020-05-14T11:43:20Z)
ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.