ContextLabeler Dataset: physical and virtual sensors data collected from
smartphone usage in-the-wild
- URL: http://arxiv.org/abs/2307.03586v1
- Date: Fri, 7 Jul 2023 13:28:29 GMT
- Title: ContextLabeler Dataset: physical and virtual sensors data collected from
smartphone usage in-the-wild
- Authors: Mattia Giovanni Campana, Franca Delmastro
- Abstract summary: This paper describes a data collection campaign and the resulting dataset derived from smartphone sensors.
The collected dataset represents a useful source of real data to both define and evaluate a broad set of novel context-aware solutions.
- Score: 7.310043452300736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes a data collection campaign and the resulting dataset
derived from smartphone sensors characterizing the daily life activities of 3
volunteers in a period of two weeks. The dataset is released as a collection of
CSV files containing more than 45K data samples, where each sample is composed
of 1332 features related to a heterogeneous set of physical and virtual
sensors, including motion sensors, running applications, devices in proximity,
and weather conditions. Moreover, each data sample is associated with a ground
truth label that describes the user activity and the situation in which she was
involved during the sensing experiment (e.g., working, at restaurant, and doing
sport activity). To avoid introducing any bias during the data collection, we
performed the sensing experiment in-the-wild, that is, by using the volunteers'
devices, and without defining any constraint related to the user's behavior.
For this reason, the collected dataset represents a useful source of real data
to both define and evaluate a broad set of novel context-aware solutions (both
algorithms and protocols) that aim to adapt their behavior according to the
changes in the user's situation in a mobile environment.
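As a rough illustration of working with a release structured as the abstract describes (a collection of CSV files, one labeled sample per row), the following minimal sketch reads every CSV file in a directory and separates the feature columns from the ground-truth label. The directory layout, the column name `label`, and the helper names `load_samples` and `label_distribution` are assumptions made for this example; they are not taken from the dataset documentation, so check the real file headers before use.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path


def load_samples(csv_dir, label_column="label"):
    """Read every CSV file in csv_dir and return (feature_rows, labels).

    `label_column` is a hypothetical column name; the real header may differ.
    """
    feature_rows, labels = [], []
    for csv_path in sorted(Path(csv_dir).glob("*.csv")):
        with csv_path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                # Remove the label from the row so only features remain.
                labels.append(row.pop(label_column))
                feature_rows.append(row)
    return feature_rows, labels


def label_distribution(labels):
    """Count how many samples carry each ground-truth label."""
    return Counter(labels)


if __name__ == "__main__":
    # Tiny synthetic stand-in for the released files (not real data).
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "user_1.csv").write_text(
            "acc_x,acc_y,label\n0.1,0.2,working\n0.3,0.4,at restaurant\n"
        )
        rows, labels = load_samples(tmp)
        print(len(rows), label_distribution(labels))
```

Values are kept as strings here; a real pipeline would convert the numeric features and would likely use a dataframe library for 1332 columns.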
Related papers
- Lightweight Modeling of User Context Combining Physical and Virtual
Sensor Data [15.800978541993706]
We present a framework to collect datasets containing heterogeneous sensing data from personal mobile devices.
We propose a lightweight approach to model the user context able to efficiently perform the entire reasoning process.
We achieve a 10x speed up and a feature reduction of more than 90% while keeping the accuracy loss less than 3%.
arXiv Detail & Related papers (2023-06-28T08:57:01Z)
- MyDigitalFootprint: an extensive context dataset for pervasive computing
applications at the edge [7.310043452300736]
MyDigitalFootprint is a large-scale dataset comprising smartphone sensor data, physical proximity information, and Online Social Networks interactions.
It spans two months of measurements from 31 volunteer users in their natural environment, allowing for unrestricted behavior.
To demonstrate the dataset's effectiveness, we present three context-aware applications utilizing various machine learning tasks.
arXiv Detail & Related papers (2023-06-28T07:59:47Z)
- Reconstructing human activities via coupling mobile phone data with
location-based social networks [20.303827107229445]
We propose a data analysis framework to identify users' activities by coupling mobile phone data with location-based social networks (LBSN) data.
We reconstruct the activity chains of 1,000,000 active mobile phone users and analyze the temporal and spatial characteristics of each activity type.
arXiv Detail & Related papers (2023-06-06T06:37:14Z)
- Going beyond research datasets: Novel intent discovery in the industry
setting [60.90117614762879]
This paper proposes methods to improve the intent discovery pipeline deployed in a large e-commerce platform.
We show the benefit of pre-training language models on in-domain data: both self-supervised and with weak supervision.
We also devise the best method to utilize the conversational structure (i.e., question and answer) of real-life datasets during fine-tuning for clustering tasks, which we call Conv.
arXiv Detail & Related papers (2023-05-09T14:21:29Z)
- Multimodal Dataset from Harsh Sub-Terranean Environment with Aerosol
Particles for Frontier Exploration [55.41644538483948]
This paper introduces a multimodal dataset from the harsh and unstructured underground environment with aerosol particles.
It contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format.
The focus of this paper is not only to capture both temporal and spatial data diversities but also to present the impact of harsh conditions on captured data.
arXiv Detail & Related papers (2023-04-27T20:21:18Z)
- Video-based Pose-Estimation Data as Source for Transfer Learning in
Human Activity Recognition [71.91734471596433]
Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments.
Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data.
This paper proposes using datasets intended for human-pose estimation as a source for transfer learning.
arXiv Detail & Related papers (2022-12-02T18:19:36Z)
- TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual
Environments [84.6017003787244]
This work proposes a synthetic data generation pipeline to address the difficulties and domain-gaps present in simulated datasets.
We show that using annotations and visual cues from existing datasets, we can facilitate automated multi-modal data generation.
arXiv Detail & Related papers (2022-08-16T20:46:08Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation
on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- An Automated Analysis Framework for Trajectory Datasets [0.0]
Trajectory datasets of road users have become more important in recent years for the safety validation of automated vehicles.
Several naturalistic trajectory datasets, each with more than 10,000 tracks, have been released, and others will follow.
Considering this amount of data, it is necessary to be able to compare these datasets in-depth with ease.
arXiv Detail & Related papers (2022-02-12T10:55:53Z)
- DAPPER: Label-Free Performance Estimation after Personalization for
Heterogeneous Mobile Sensing [95.18236298557721]
We present DAPPER (Domain AdaPtation Performance EstimatoR) that estimates the adaptation performance in a target domain with unlabeled target data.
Our evaluation with four real-world sensing datasets compared against six baselines shows that DAPPER outperforms the state-of-the-art baseline by 39.8% in estimation accuracy.
arXiv Detail & Related papers (2021-11-22T08:49:33Z)
- Data Collection and Labeling of Real-Time IoT-Enabled Bio-Signals in
Everyday Settings for Mental Health Improvement [6.7377504888630675]
Real-time physiological data collection and analysis play a central role in modern well-being applications.
This paper builds a system for the real-time collection and analysis of photoplethysmogram, acceleration, gyroscope, and gravity data from a wearable sensor.
arXiv Detail & Related papers (2021-08-02T20:56:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.