Scaling laws in wearable human activity recognition
- URL: http://arxiv.org/abs/2502.03364v1
- Date: Wed, 05 Feb 2025 17:00:08 GMT
- Title: Scaling laws in wearable human activity recognition
- Authors: Tom Hoddes, Alex Bijamov, Saket Joshi, Daniel Roggen, Ali Etemad, Robert Harle, David Racz
- Abstract summary: Scaling laws have the potential to help move towards more principled design by linking model capacity with pre-training data volume. By conducting an exhaustive grid search on both amount of pre-training data and Transformer architectures, we establish the first known scaling laws for HAR. We show that these scaling laws translate to downstream performance improvements on three HAR benchmark datasets.
- Score: 19.49701947129144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many deep architectures and self-supervised pre-training techniques have been proposed for human activity recognition (HAR) from wearable multimodal sensors. Scaling laws have the potential to help move towards more principled design by linking model capacity with pre-training data volume. Yet, scaling laws have not been established for HAR to the same extent as in language and vision. By conducting an exhaustive grid search on both amount of pre-training data and Transformer architectures, we establish the first known scaling laws for HAR. We show that pre-training loss scales with a power law relationship to amount of data and parameter count, and that increasing the number of users in a dataset results in a steeper improvement in performance than increasing data per user, indicating that diversity of pre-training data is important, which contrasts with some previously reported findings in self-supervised HAR. We show that these scaling laws translate to downstream performance improvements on three HAR benchmark datasets of postures, modes of locomotion and activities of daily living: UCI HAR, WISDM Phone and WISDM Watch. Finally, we suggest some previously published works should be revisited in light of these scaling laws with more adequate model capacities.
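The abstract's central quantitative claim is a joint power-law relationship between pre-training loss, parameter count and data volume. Below is a minimal sketch of how such a law can be fit to grid-search results; the functional form L(N, D) = E + A*N^(-alpha) + B*D^(-beta), the parameter names, and the synthetic data are illustrative assumptions, not the paper's exact parameterization or results.

```python
# Minimal sketch: fit a joint power law to (params, data, loss) grid-search results.
# The functional form and the synthetic data are assumptions for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(X, E, A, alpha, B, beta):
    N, D = X  # N: parameter count, D: amount of pre-training data
    return E + A * N ** (-alpha) + B * D ** (-beta)

# Synthetic "grid search": losses generated from an assumed ground-truth law plus noise.
rng = np.random.default_rng(0)
N_grid, D_grid = np.meshgrid([1e5, 1e6, 1e7, 1e8], [1e4, 1e5, 1e6])
N, D = N_grid.ravel(), D_grid.ravel()
loss = scaling_law((N, D), 1.2, 50.0, 0.35, 30.0, 0.30) + rng.normal(0.0, 0.01, N.size)

# Recover the exponents; a larger beta than alpha would mean extra data buys more
# than extra model capacity over this range (and vice versa).
popt, _ = curve_fit(scaling_law, (N, D), loss,
                    p0=[1.0, 40.0, 0.3, 25.0, 0.3], maxfev=20000)
E, A, alpha, B, beta = popt
print(f"alpha (parameters) = {alpha:.3f}, beta (data) = {beta:.3f}")
```

Under the same assumed form, fitting separate exponents with D taken as the number of users versus the amount of data per user would let one compare which axis gives the steeper improvement, in the spirit of the diversity finding described in the abstract.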
Related papers
- Scaling Laws for Data-Efficient Visual Transfer Learning [14.114908296325277]
This paper establishes the first practical framework for data-efficient scaling laws in visual transfer learning.
We propose the distillation boundary theory, revealing a critical turning point in distillation efficiency.
This work redefines scaling laws for data-limited regimes, bridging the knowledge gap between large-scale pretraining and practical downstream adaptation.
arXiv Detail & Related papers (2025-04-17T07:01:01Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
We investigate how model size, training data scale, and inference-time compute jointly influence generative retrieval performance.
Our experiments show that n-gram-based methods demonstrate strong alignment with both training and inference scaling laws.
We find that LLaMA models consistently outperform T5 models, suggesting a particular advantage for larger decoder-only models in generative retrieval.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- Scaling Laws for Differentially Private Language Models [53.14592585413073]
Scaling laws have emerged as important components of large language model (LLM) training as they can predict performance gains through scale. LLMs rely on large, high-quality training datasets, like those sourced from (sometimes sensitive) user data. Training models on this sensitive user data requires careful privacy protections like differential privacy (DP).
arXiv Detail & Related papers (2025-01-31T06:32:46Z)
- Scaling Laws for the Value of Individual Data Points in Machine Learning [55.596413470429475]
We introduce a new perspective by investigating scaling behavior for the value of individual data points.
We provide learning theory to support our scaling law, and we observe empirically that it holds across diverse model classes.
Our work represents a first step towards understanding and utilizing scaling properties for the value of individual data points.
arXiv Detail & Related papers (2024-05-30T20:10:24Z)
- Scaling Laws For Dense Retrieval [22.76001461620846]
We investigate whether the performance of dense retrieval models follows the scaling law as other neural models.
Results indicate that, under our settings, the performance of dense retrieval models follows a precise power-law scaling related to the model size and the number of annotations.
arXiv Detail & Related papers (2024-03-27T15:27:36Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets and tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- Inverse Scaling: When Bigger Isn't Better [80.42834197416444]
Large language models (LMs) show predictable improvements to overall loss with increased scale.
We present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale.
arXiv Detail & Related papers (2023-06-15T20:11:23Z)
- A Solvable Model of Neural Scaling Laws [72.8349503901712]
Large language models with a huge number of parameters, when trained on a near internet-sized number of tokens, have been empirically shown to obey neural scaling laws.
We propose a statistical model -- a joint generative data model and random feature model -- that captures this neural scaling phenomenology.
A key finding is the manner in which the power laws occurring in the statistics of natural datasets are extended by nonlinear random feature maps.
arXiv Detail & Related papers (2022-10-30T15:13:18Z)
- Beyond neural scaling laws: beating power law scaling via data pruning [37.804100045519846]
We show how in theory we can break beyond power law scaling and potentially even reduce it to exponential scaling.
We develop a new simple, cheap and scalable self-supervised pruning metric that demonstrates comparable performance to the best supervised metrics.
Overall, our work suggests that the discovery of good data-pruning metrics may provide a viable path forward to substantially improved neural scaling laws.
arXiv Detail & Related papers (2022-06-29T09:20:47Z)
- Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments [42.793379799720434]
We investigate whether scaling laws can be used to accelerate model development.
We find that scaling laws emerge at finetuning time in some NLP tasks.
For tasks where scaling laws exist, they can be used to predict the performance of larger models.
arXiv Detail & Related papers (2022-02-13T19:13:00Z)
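The final entry's premise, that power laws fitted on small-scale runs can forecast larger models, can be illustrated with a simple single-variable extrapolation. The form L(N) = E + A*N^(-alpha) and all numbers below are hypothetical, not results from any of the papers listed above.

```python
# Minimal sketch: fit a power law to small-model losses and extrapolate to a
# larger parameter count. All numbers are hypothetical, for illustration only.
import numpy as np
from scipy.optimize import curve_fit

def loss_vs_params(N, E, A, alpha):
    return E + A * N ** (-alpha)

small_N = np.array([1e5, 3e5, 1e6, 3e6, 1e7])           # small-scale experiments
small_loss = np.array([2.37, 2.03, 1.78, 1.62, 1.51])   # hypothetical observed losses

popt, _ = curve_fit(loss_vs_params, small_N, small_loss,
                    p0=[1.0, 50.0, 0.3], maxfev=20000)
print(f"predicted loss at 1e8 parameters: {loss_vs_params(1e8, *popt):.3f}")
```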