On-device Self-supervised Learning of Visual Perception Tasks aboard
Hardware-limited Nano-quadrotors
- URL: http://arxiv.org/abs/2403.04071v1
- Date: Wed, 6 Mar 2024 22:04:14 GMT
- Title: On-device Self-supervised Learning of Visual Perception Tasks aboard
Hardware-limited Nano-quadrotors
- Authors: Elia Cereda, Manuele Rusci, Alessandro Giusti, Daniele Palossi
- Abstract summary: Sub-50 g nano-drones are gaining momentum in both academia and industry.
Their most compelling applications rely on onboard deep learning models for perception.
When deployed in unknown environments, these models often underperform due to domain shift.
We propose, for the first time, on-device learning aboard nano-drones, where the first part of the in-field mission is dedicated to self-supervised fine-tuning.
- Score: 53.59319391812798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sub-50 g nano-drones are gaining momentum in both academia and
industry. Their most compelling applications rely on onboard deep learning
models for perception despite severe hardware constraints (i.e.,
sub-100 mW processor). When deployed in unknown environments not
represented in the training data, these models often underperform due to domain
shift. To cope with this fundamental problem, we propose, for the first time,
on-device learning aboard nano-drones, where the first part of the in-field
mission is dedicated to self-supervised fine-tuning of a pre-trained
convolutional neural network (CNN). Leveraging a real-world vision-based
regression task, we thoroughly explore performance-cost trade-offs of the
fine-tuning phase along three axes: i) dataset size (more data
increases the regression performance but requires more memory and longer
computation); ii) methodologies (e.g., fine-tuning all model parameters
vs. only a subset); and iii) self-supervision strategy. Our approach
demonstrates an improvement in mean absolute error of up to 30% compared to the
pre-trained baseline, requiring only 22 s of fine-tuning on an
ultra-low-power GWT GAP9 System-on-Chip. Addressing the domain shift problem
via on-device learning aboard nano-drones not only marks a novel result for
hardware-limited robots but lays the ground for more general advancements for
the entire robotics community.
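The parameter-subset fine-tuning of axis ii) can be sketched in a few lines. The following is an illustrative toy, not the paper's implementation: a frozen random projection stands in for the pre-trained CNN backbone, only a small linear head is updated (mimicking fine-tuning a subset of parameters under a tight memory budget), and the synthetic regression targets stand in for the onboard self-supervision signal. All names and the task itself are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone": a fixed random projection standing in for the
# pre-trained CNN's feature extractor (never updated in-field).
W_backbone = rng.normal(size=(16, 8))

def features(x):
    return np.tanh(x @ W_backbone)

# Trainable head: the only parameters updated on-device, mimicking
# axis ii) (fine-tuning a subset of the model instead of all weights).
w_head = rng.normal(size=(8,)) * 0.1

def predict(x):
    return features(x) @ w_head

# Synthetic stand-in for the self-supervised regression targets; aboard
# the drone these would come from an onboard self-supervision signal.
X = rng.normal(size=(64, 16))
y = np.sin(X[:, 0])

mae_before = np.abs(predict(X) - y).mean()

# Plain SGD on the head only: small memory footprint, short runtime,
# which is what makes a 22 s on-chip fine-tuning budget plausible.
lr = 0.05
for _ in range(200):
    err = predict(X) - y
    w_head -= lr * features(X).T @ err / len(X)

mae_after = np.abs(predict(X) - y).mean()
```

Updating only the head keeps both the gradient computation and the optimizer state proportional to the head's size, which is the usual motivation for subset fine-tuning on memory-limited hardware.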
Related papers
- HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation [54.03004125910057]
We show that hierarchical vision-language-action models can be more effective in utilizing off-domain data than standard monolithic VLA models.
We show that, with the hierarchical design, the high-level VLM can transfer across significant domain gaps between the off-domain finetuning data and real-robot testing scenarios.
arXiv Detail & Related papers (2025-02-08T07:50:22Z)
- Generalizable and Fast Surrogates: Model Predictive Control of Articulated Soft Robots using Physics-Informed Neural Networks [4.146337610044239]
We propose physics-informed neural networks (PINNs) for articulated soft robots (ASRs) with a focus on data efficiency.
The amount of expensive real-world training data is reduced to a minimum - one dataset in one system domain.
The prediction speed of an accurate FP model is improved with the PINN by up to a factor of 466 at slightly reduced accuracy.
arXiv Detail & Related papers (2025-02-04T01:16:33Z)
- Training on the Fly: On-device Self-supervised Learning aboard Nano-drones within 20 mW [52.280742520586756]
Miniaturized cyber-physical systems (CPSes) powered by tiny machine learning (TinyML), such as nano-drones, are becoming an increasingly attractive technology.
Simple electronics make these CPSes inexpensive, but strongly limit the computational, memory, and sensing resources available on board.
We present a novel on-device fine-tuning approach that relies only on the limited ultra-low power resources available aboard nano-drones.
arXiv Detail & Related papers (2024-08-06T13:11:36Z)
- High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks [51.23613834703353]
Relative drone-to-drone localization is a fundamental building block for any swarm operations.
We present a vertically integrated system based on a novel vision-based fully convolutional neural network (FCNN)
Our model improves R-squared from 32% to 47% on the horizontal image coordinate and from 18% to 55% on the vertical image coordinate, on a real-world dataset of 30k images.
arXiv Detail & Related papers (2024-02-21T12:34:31Z)
- Adaptive Deep Learning for Efficient Visual Pose Estimation aboard Ultra-low-power Nano-drones [5.382126081742012]
We present a novel adaptive deep learning-based mechanism for the efficient execution of a vision-based human pose estimation task.
On a real-world dataset and the actual nano-drone hardware, our best-performing system shows a 28% latency reduction at the same mean absolute error (MAE), a 3% MAE reduction at iso-latency, and an absolute peak performance 6% better than the state-of-the-art (SoA) model.
arXiv Detail & Related papers (2024-01-26T23:04:26Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- On-Device Domain Generalization [93.79736882489982]
Domain generalization is critical to on-device machine learning applications.
We find that knowledge distillation is a strong candidate for solving the problem.
We propose a simple idea called out-of-distribution knowledge distillation (OKD), which aims to teach the student how the teacher handles (synthetic) out-of-distribution data.
arXiv Detail & Related papers (2022-09-15T17:59:31Z)
- Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space [24.847651341371684]
In applications where data are scarce, transfer learning and data augmentation techniques are commonly used to improve the generalization of deep learning models.
Fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost to run the full network for every augmented input.
We propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space.
arXiv Detail & Related papers (2020-02-12T03:26:33Z)
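The embedding-space augmentation idea above can be sketched with a toy example. This is a hedged illustration under assumed names and a synthetic task, not the cited paper's method: the backbone runs once per raw input, the cached embeddings are jittered with Gaussian noise as an approximate stand-in for raw-input augmentation, and only a logistic head is trained on the augmented embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "transfer" network: embeddings are computed once per raw input,
# so the expensive forward pass is not repeated for every augmentation.
W_embed = rng.normal(size=(32, 10))

def embed(x):
    return np.tanh(x @ W_embed)

X_raw = rng.normal(size=(20, 32))
y = (X_raw[:, 0] > 0).astype(float)

E = embed(X_raw)  # single backbone pass over the dataset

# Approximate augmentation directly in embedding space: jitter the cached
# embeddings instead of re-running the network on augmented raw inputs.
def augment_embeddings(E, n_aug=5, scale=0.05):
    reps = np.repeat(E, n_aug, axis=0)
    return reps + rng.normal(scale=scale, size=reps.shape)

E_aug = augment_embeddings(E)
y_aug = np.repeat(y, 5)

# Train only a logistic-regression head on the augmented embeddings.
w = np.zeros(10)
for _ in range(300):
    p = 1 / (1 + np.exp(-(E_aug @ w)))
    w -= 0.1 * E_aug.T @ (p - y_aug) / len(y_aug)

acc = (((E @ w) > 0).astype(float) == y).mean()
```

The saving comes from amortization: one backbone pass per example, with all augmented variants generated as cheap vector perturbations of the cached embedding.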
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.