On-device Self-supervised Learning of Visual Perception Tasks aboard
Hardware-limited Nano-quadrotors
- URL: http://arxiv.org/abs/2403.04071v1
- Date: Wed, 6 Mar 2024 22:04:14 GMT
- Title: On-device Self-supervised Learning of Visual Perception Tasks aboard
Hardware-limited Nano-quadrotors
- Authors: Elia Cereda, Manuele Rusci, Alessandro Giusti, Daniele Palossi
- Abstract summary: Sub-50 g nano-drones are gaining momentum in both academia and industry.
Their most compelling applications rely on onboard deep learning models for perception.
When deployed in unknown environments, these models often underperform due to domain shift.
We propose, for the first time, on-device learning aboard nano-drones, where the first part of the in-field mission is dedicated to self-supervised fine-tuning.
- Score: 53.59319391812798
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sub-50 g nano-drones are gaining momentum in both academia and
industry. Their most compelling applications rely on onboard deep learning
models for perception despite severe hardware constraints (i.e.,
sub-100 mW processor). When deployed in unknown environments not
represented in the training data, these models often underperform due to domain
shift. To cope with this fundamental problem, we propose, for the first time,
on-device learning aboard nano-drones, where the first part of the in-field
mission is dedicated to self-supervised fine-tuning of a pre-trained
convolutional neural network (CNN). Leveraging a real-world vision-based
regression task, we thoroughly explore performance-cost trade-offs of the
fine-tuning phase along three axes: i) dataset size (more data
increases the regression performance but requires more memory and longer
computation); ii) methodologies (e.g., fine-tuning all model parameters
vs. only a subset); and iii) self-supervision strategy. Our approach
demonstrates an improvement in mean absolute error of up to 30% compared to the
pre-trained baseline, requiring only 22 s of fine-tuning on an
ultra-low-power GWT GAP9 System-on-Chip. Addressing the domain shift problem
via on-device learning aboard nano-drones not only marks a novel result for
hardware-limited robots but lays the ground for more general advancements for
the entire robotics community.
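The parameter-subset fine-tuning of axis ii) can be sketched in a few lines. The following is an illustrative toy, not the paper's implementation: a frozen random projection stands in for the pre-trained CNN backbone, only a small linear head is updated (mimicking fine-tuning a subset of parameters under a tight memory budget), and the synthetic regression targets stand in for the onboard self-supervision signal. All names and the task itself are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone": a fixed random projection standing in for the
# pre-trained CNN's feature extractor (never updated in-field).
W_backbone = rng.normal(size=(16, 8))

def features(x):
    return np.tanh(x @ W_backbone)

# Trainable head: the only parameters updated on-device, mimicking
# axis ii) (fine-tuning a subset of the model instead of all weights).
w_head = rng.normal(size=(8,)) * 0.1

def predict(x):
    return features(x) @ w_head

# Synthetic stand-in for the self-supervised regression targets; aboard
# the drone these would come from an onboard self-supervision signal.
X = rng.normal(size=(64, 16))
y = np.sin(X[:, 0])

mae_before = np.abs(predict(X) - y).mean()

# Plain SGD on the head only: small memory footprint, short runtime,
# which is what makes a 22 s on-chip fine-tuning budget plausible.
lr = 0.05
for _ in range(200):
    err = predict(X) - y
    w_head -= lr * features(X).T @ err / len(X)

mae_after = np.abs(predict(X) - y).mean()
```

Updating only the head keeps both the gradient computation and the optimizer state proportional to the head's size, which is the usual motivation for subset fine-tuning on memory-limited hardware.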
Related papers
- HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation [54.03004125910057]
We show that hierarchical vision-language-action models can be more effective in utilizing off-domain data than standard monolithic VLA models.
We show that, with the hierarchical design, the high-level VLM can transfer across significant domain gaps between the off-domain finetuning data and real-robot testing scenarios.
arXiv Detail & Related papers (2025-02-08T07:50:22Z)
- Generalizable and Fast Surrogates: Model Predictive Control of Articulated Soft Robots using Physics-Informed Neural Networks [4.146337610044239]
We propose physics-informed neural networks (PINNs) for articulated soft robots (ASRs) with a focus on data efficiency.
The amount of expensive real-world training data is reduced to a minimum - one dataset in one system domain.
The prediction speed of an accurate FP model is improved with the PINN by up to a factor of 466 at slightly reduced accuracy.
arXiv Detail & Related papers (2025-02-04T01:16:33Z)
- Training on the Fly: On-device Self-supervised Learning aboard Nano-drones within 20 mW [52.280742520586756]
Miniaturized cyber-physical systems (CPSes) powered by tiny machine learning (TinyML), such as nano-drones, are becoming an increasingly attractive technology.
Simple electronics make these CPSes inexpensive, but strongly limit the computational, memory, and sensing resources available on board.
We present a novel on-device fine-tuning approach that relies only on the limited ultra-low power resources available aboard nano-drones.
arXiv Detail & Related papers (2024-08-06T13:11:36Z)
- High-throughput Visual Nano-drone to Nano-drone Relative Localization using Onboard Fully Convolutional Networks [51.23613834703353]
Relative drone-to-drone localization is a fundamental building block for any swarm operations.
We present a vertically integrated system based on a novel vision-based fully convolutional neural network (FCNN)
Our model improves R-squared from 32% to 47% on the horizontal image coordinate and from 18% to 55% on the vertical image coordinate, on a real-world dataset of 30k images.
arXiv Detail & Related papers (2024-02-21T12:34:31Z)
- Adaptive Deep Learning for Efficient Visual Pose Estimation aboard Ultra-low-power Nano-drones [5.382126081742012]
We present a novel adaptive deep learning-based mechanism for the efficient execution of a vision-based human pose estimation task.
On a real-world dataset and the actual nano-drone hardware, our best-performing system shows a 28% latency reduction at the same mean absolute error (MAE), a 3% MAE reduction at iso-latency, and an absolute peak performance 6% better than the state-of-the-art (SoA) model.
arXiv Detail & Related papers (2024-01-26T23:04:26Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- On-Device Domain Generalization [93.79736882489982]
Domain generalization is critical to on-device machine learning applications.
We find that knowledge distillation is a strong candidate for solving the problem.
We propose a simple idea called out-of-distribution knowledge distillation (OKD), which aims to teach the student how the teacher handles (synthetic) out-of-distribution data.
arXiv Detail & Related papers (2022-09-15T17:59:31Z)
- Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space [24.847651341371684]
In applications where data are scarce, transfer learning and data augmentation techniques are commonly used to improve the generalization of deep learning models.
Fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost to run the full network for every augmented input.
We propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space.
arXiv Detail & Related papers (2020-02-12T03:26:33Z)
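The embedding-space augmentation idea above can be sketched with a toy example. This is a hedged illustration under assumed names and a synthetic task, not the cited paper's method: the backbone runs once per raw input, the cached embeddings are jittered with Gaussian noise as an approximate stand-in for raw-input augmentation, and only a logistic head is trained on the augmented embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "transfer" network: embeddings are computed once per raw input,
# so the expensive forward pass is not repeated for every augmentation.
W_embed = rng.normal(size=(32, 10))

def embed(x):
    return np.tanh(x @ W_embed)

X_raw = rng.normal(size=(20, 32))
y = (X_raw[:, 0] > 0).astype(float)

E = embed(X_raw)  # single backbone pass over the dataset

# Approximate augmentation directly in embedding space: jitter the cached
# embeddings instead of re-running the network on augmented raw inputs.
def augment_embeddings(E, n_aug=5, scale=0.05):
    reps = np.repeat(E, n_aug, axis=0)
    return reps + rng.normal(scale=scale, size=reps.shape)

E_aug = augment_embeddings(E)
y_aug = np.repeat(y, 5)

# Train only a logistic-regression head on the augmented embeddings.
w = np.zeros(10)
for _ in range(300):
    p = 1 / (1 + np.exp(-(E_aug @ w)))
    w -= 0.1 * E_aug.T @ (p - y_aug) / len(y_aug)

acc = (((E @ w) > 0).astype(float) == y).mean()
```

The saving comes from amortization: one backbone pass per example, with all augmented variants generated as cheap vector perturbations of the cached embedding.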
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.