Text me the data: Generating Ground Pressure Sequence from Textual
Descriptions for HAR
- URL: http://arxiv.org/abs/2402.14427v1
- Date: Thu, 22 Feb 2024 10:14:59 GMT
- Title: Text me the data: Generating Ground Pressure Sequence from Textual
Descriptions for HAR
- Authors: Lala Shakti Swarup Ray, Bo Zhou, Sungho Suh, Lars Krupp, Vitor Fortes
Rey, Paul Lukowicz
- Abstract summary: Text-to-Pressure (T2P) is a framework designed to generate ground pressure sequences from textual descriptions.
We show that combining vector quantization of sensor data with a simple text-conditioned autoregressive strategy allows us to obtain high-quality generated pressure sequences.
- Score: 4.503003860563811
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In human activity recognition (HAR), the availability of substantial ground
truth is necessary for training efficient models. However, acquiring ground
pressure data through physical sensors can itself be cost-prohibitive and
time-consuming. To address this critical need, we introduce Text-to-Pressure
(T2P), a framework designed to generate extensive ground pressure sequences
from textual descriptions of human activities using deep learning techniques.
We show that combining vector quantization of sensor data with a
simple text-conditioned autoregressive strategy allows us to obtain
high-quality generated pressure sequences from textual descriptions with the
help of discrete latent correlation between text and pressure maps. We achieved
comparable performance on the consistency between text and generated motion
with an R squared value of 0.722, Masked R squared value of 0.892, and FID
score of 1.83. Additionally, we trained a HAR model with the synthesized
data and evaluated it on pressure dynamics collected by a real pressure sensor;
it performs on par with a model trained on only real data. Combining both real and
synthesized training data increases the overall macro F1 score by 5.9 percent.
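The vector quantization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the generic nearest-codebook lookup that turns continuous latent vectors into discrete tokens, which a text-conditioned autoregressive model could then predict. The function name `quantize`, the toy codebook, and the latent values are all hypothetical.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    # Squared Euclidean distance between every latent (N, D) and code (K, D).
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)   # shape (N, K)
    indices = np.argmin(dists, axis=1)    # discrete tokens, shape (N,)
    return indices, codebook[indices]

# Hypothetical toy values: 3 codebook entries and 4 latent vectors in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [0.7, 0.6]])

tokens, quantized = quantize(latents, codebook)
print(tokens)  # prints [0 1 2 1]
```

In a trained model the codebook would be learned jointly with an encoder and decoder; here it is fixed by hand purely to show the lookup.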
Related papers
- Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing [71.29488677105127]
Existing scene text recognition (STR) methods struggle to recognize challenging texts, especially for artistic and severely distorted characters.
We propose a contrastive learning-based STR framework by leveraging synthetic and real unlabeled data without any human cost.
Our method achieves SOTA performance (94.7% and 70.9% average accuracy on common benchmarks and Union14M-Benchmark).
arXiv Detail & Related papers (2024-11-23T15:24:47Z)
- Text2Data: Low-Resource Data Generation with Textual Control [104.38011760992637]
Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines.
We propose Text2Data, a novel approach that utilizes unlabeled data to understand the underlying data distribution through an unsupervised diffusion model.
It undergoes controllable finetuning via a novel constraint optimization-based learning objective that ensures controllability and effectively counteracts catastrophic forgetting.
arXiv Detail & Related papers (2024-02-08T03:41:39Z)
- Contrastive Transformer Learning with Proximity Data Generation for
Text-Based Person Search [60.626459715780605]
Given a descriptive text query, text-based person search aims to retrieve the best-matched target person from an image gallery.
Such a cross-modal retrieval task is quite challenging due to the significant modality gap, fine-grained differences, and insufficient annotated data.
In this paper, we propose a simple yet effective dual Transformer model for text-based person search.
arXiv Detail & Related papers (2023-11-15T16:26:49Z)
- Approximating Human-Like Few-shot Learning with GPT-based Compression [55.699707962017975]
We seek to equip generative pre-trained models with human-like learning capabilities that enable data compression during inference.
We present a novel approach that utilizes the Generative Pre-trained Transformer (GPT) to approximate Kolmogorov complexity.
arXiv Detail & Related papers (2023-08-14T05:22:33Z)
- PressureTransferNet: Human Attribute Guided Dynamic Ground Pressure
Profile Transfer using 3D simulated Pressure Maps [7.421780713537146]
PressureTransferNet is an encoder-decoder model taking a source pressure map and a target human attribute vector as inputs.
We use a sensor simulation to create a diverse dataset with various human attributes and pressure profiles.
We visually confirm the fidelity of the synthesized pressure shapes using a physics-based deep learning model and achieve a binary R-square value of 0.79 on areas with ground contact.
arXiv Detail & Related papers (2023-08-01T13:31:25Z)
- On the Possibilities of AI-Generated Text Detection [76.55825911221434]
We argue that as machine-generated text approximates human-like quality, the sample size needed for detection bounds increases.
We test various state-of-the-art text generators, including GPT-2, GPT-3.5-Turbo, Llama, Llama-2-13B-Chat-HF, and Llama-2-70B-Chat-HF, against detectors including RoBERTa-Large/Base-Detector and GPTZero.
arXiv Detail & Related papers (2023-04-10T17:47:39Z)
- PresSim: An End-to-end Framework for Dynamic Ground Pressure Profile
Generation from Monocular Videos Using Physics-based 3D Simulation [8.107762252448195]
Ground pressure exerted by the human body is a valuable source of information for human activity recognition (HAR) in pervasive sensing.
We present a novel end-to-end framework, PresSim, to synthesize sensor data from videos of human activities to reduce such effort significantly.
arXiv Detail & Related papers (2023-02-01T12:02:04Z)
- BeCAPTCHA-Type: Biometric Keystroke Data Generation for Improved Bot
Detection [63.447493500066045]
This work proposes a data driven learning model for the synthesis of keystroke biometric data.
The proposed method is compared with two statistical approaches based on Universal and User-dependent models.
Our experimental framework considers a dataset with 136 million keystroke events from 168 thousand subjects.
arXiv Detail & Related papers (2022-07-27T09:26:15Z)
- How May I Help You? Using Neural Text Simplification to Improve
Downstream NLP Tasks [20.370296294233313]
We evaluate the use of neural TS in two ways: simplifying input texts at prediction time and augmenting data to provide machines with additional information during training.
We demonstrate that the latter scenario provides positive effects on machine performance on two separate datasets.
In particular, the latter use of TS improves the performance of LSTM (1.82-1.98%) and SpanBERT (0.7-1.3%) extractors on TACRED, a complex, large-scale, real-world relation extraction task.
arXiv Detail & Related papers (2021-09-10T01:04:52Z)