AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation
- URL: http://arxiv.org/abs/2410.12411v1
- Date: Wed, 16 Oct 2024 09:52:38 GMT
- Title: AdaCropFollow: Self-Supervised Online Adaptation for Visual Under-Canopy Navigation
- Authors: Arun N. Sivakumar, Federico Magistri, Mateus V. Gasparino, Jens Behley, Cyrill Stachniss, Girish Chowdhary
- Abstract summary: Under-canopy agricultural robots can enable various applications like precise monitoring, spraying, weeding, and plant manipulation tasks.
We propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling.
This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
- Abstract: Under-canopy agricultural robots can enable various applications like precise monitoring, spraying, weeding, and plant manipulation tasks throughout the growing season. Autonomous navigation under the canopy is challenging due to the degradation in accuracy of RTK-GPS and the large variability in the visual appearance of the scene over time. In prior work, we developed a supervised learning-based perception system with semantic keypoint representation and deployed this in various field conditions. A large number of failures of this system can be attributed to the inability of the perception model to adapt to the domain shift encountered during deployment. In this paper, we propose a self-supervised online adaptation method for adapting the semantic keypoint representation using a visual foundational model, geometric prior, and pseudo labeling. Our preliminary experiments show that with minimal data and fine-tuning of parameters, the keypoint prediction model trained with labels on the source domain can be adapted in a self-supervised manner to various challenging target domains onboard the robot computer using our method. This can enable fully autonomous row-following capability in under-canopy robots across fields and crops without requiring human intervention.
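The core idea of the abstract, generating pseudo-labels by projecting the model's own keypoint predictions onto a geometric prior and discarding predictions that disagree with it, can be sketched as follows. This is a minimal numpy illustration only: the function names, the straight-row-line prior, and the robust Theil-Sen fit are illustrative simplifications, and the paper's actual pipeline additionally relies on features from a visual foundational model.

```python
import numpy as np

def fit_row_line(points):
    """Robust (Theil-Sen) fit of y = a*x + b through 2-D keypoints."""
    x, y = points[:, 0], points[:, 1]
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(len(x)) for j in range(i + 1, len(x))
              if x[j] != x[i]]
    a = np.median(slopes)
    b = np.median(y - a * x)
    return a, b

def make_pseudo_labels(pred_keypoints, max_residual=5.0):
    """Snap keypoint predictions onto the fitted row line and mask outliers.

    Returns (pseudo_labels, keep_mask): predictions far from the geometric
    prior are excluded from self-supervised fine-tuning rather than trusted.
    """
    a, b = fit_row_line(pred_keypoints)
    x = pred_keypoints[:, 0]
    snapped = np.stack([x, a * x + b], axis=1)
    keep = np.abs(pred_keypoints[:, 1] - snapped[:, 1]) <= max_residual
    return snapped, keep

# Noisy keypoint predictions roughly along one crop row; the last is an outlier.
pred = np.array([[0.0, 0.2], [10.0, 10.1], [20.0, 19.8],
                 [30.0, 30.3], [40.0, 75.0]])
pseudo, keep = make_pseudo_labels(pred)
print(keep)  # [ True  True  True  True False]
```

The kept, snapped points would then serve as regression targets for fine-tuning the keypoint model on unlabeled target-domain frames.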
Related papers
- SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation [82.61572106180705]
This paper presents a unified approach using vision-language models (VLMs) to improve keypoint prediction across various garment categories.
We created a large-scale synthetic dataset using advanced simulation techniques, allowing scalable training without extensive real-world data.
Experimental results indicate that the VLM-based method significantly enhances keypoint detection accuracy and task success rates.
arXiv Detail & Related papers (2024-09-26T17:26:16Z)
- Context-Conditional Navigation with a Learning-Based Terrain- and Robot-Aware Dynamics Model [11.800678688260081]
We develop a novel probabilistic, terrain- and robot-aware forward dynamics model, termed TRADYN.
We evaluate our method in a simulated 2D navigation setting with a unicycle-like robot and different terrain layouts with spatially varying friction coefficients.
arXiv Detail & Related papers (2023-07-18T12:42:59Z)
- Visual Affordance Prediction for Guiding Robot Exploration [56.17795036091848]
We develop an approach for learning visual affordances for guiding robot exploration.
We use a Transformer-based model to learn a conditional distribution in the latent embedding space of a VQ-VAE.
We show how the trained affordance model can be used for guiding exploration by acting as a goal-sampling distribution, during visual goal-conditioned policy learning in robotic manipulation.
arXiv Detail & Related papers (2023-05-28T17:53:09Z)
- Policy Pre-training for End-to-end Autonomous Driving via Self-supervised Geometric Modeling [96.31941517446859]
We propose PPGeo (Policy Pre-training via Geometric modeling), an intuitive and straightforward fully self-supervised framework curated for the policy pretraining in visuomotor driving.
We aim at learning policy representations as a powerful abstraction by modeling 3D geometric scenes on large-scale unlabeled and uncalibrated YouTube driving videos.
In the first stage, the geometric modeling framework generates pose and depth predictions simultaneously, with two consecutive frames as input.
In the second stage, the visual encoder learns driving policy representation by predicting the future ego-motion and optimizing with the photometric error based on current visual observation only.
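The photometric supervision signal described above can be illustrated with a deliberately simplified 1-D toy: instead of full depth-and-pose inverse warping, the "warp" is just a horizontal pixel shift, and the error is the mean L1 difference. Everything here is an illustrative stand-in, not PPGeo's actual formulation.

```python
import numpy as np

def photometric_error(frame_t, frame_tm1, shift):
    """Mean L1 photometric error after warping frame_{t-1} by a horizontal
    pixel shift (a toy stand-in for full depth + ego-motion inverse warping)."""
    warped = np.roll(frame_tm1, shift, axis=1)
    return np.abs(frame_t - warped).mean()

# Toy check: the true motion (a shift of 3 px) minimises the photometric error,
# which is what lets ego-motion be learned without any labels.
rng = np.random.default_rng(0)
frame_tm1 = rng.random((8, 16))
frame_t = np.roll(frame_tm1, 3, axis=1)
errors = {s: photometric_error(frame_t, frame_tm1, s) for s in range(6)}
best = min(errors, key=errors.get)
print(best)  # 3
```

In the real method, the candidate "shifts" are replaced by the network's predicted depth and ego-motion, and minimising the photometric error trains the visual encoder.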
arXiv Detail & Related papers (2023-01-03T08:52:49Z)
- Uncertainty-aware Perception Models for Off-road Autonomous Unmanned Ground Vehicles [6.2574402913714575]
Off-road autonomous unmanned ground vehicles (UGVs) are being developed for military and commercial use to deliver crucial supplies in remote locations.
Current datasets used to train perception models for off-road autonomous navigation lack diversity in seasons, locations, semantic classes, and time of day.
We investigate how to combine multiple datasets to train a semantic segmentation-based environment perception model.
We show that training the model to capture uncertainty could improve the model performance by a significant margin.
arXiv Detail & Related papers (2022-09-22T15:59:33Z)
- Waypoint Generation in Row-based Crops with Deep Learning and Contrastive Clustering [1.2599533416395767]
We propose a learning-based approach to tackle waypoint generation for planning a navigation path for row-based crops.
We present a novel methodology for waypoint clustering based on a contrastive loss, able to project the points to a separable latent space.
The proposed deep neural network can simultaneously predict the waypoint position and cluster assignment with two specialized heads in a single forward pass.
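A contrastive loss of the kind mentioned above pulls embeddings of waypoints in the same cluster together and pushes different clusters apart. The sketch below computes a generic normalized-temperature contrastive loss over a small batch; it is an assumption-laden illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def contrastive_loss(z, labels, tau=0.5):
    """NT-Xent-style loss: points sharing a cluster label are positives.

    Lower loss means same-cluster embeddings are closer together (relative
    to all other points) in the latent space.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalise
    sim = z @ z.T / tau                                # scaled cosine similarity
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]) & ~np.eye(len(z), dtype=bool)
    return -logp[pos].mean()

labels = np.array([0, 0, 1, 1])
# Embeddings where same-cluster points are close ...
sep = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
# ... versus embeddings where the positive pairs are far apart.
mix = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
print(contrastive_loss(sep, labels) < contrastive_loss(mix, labels))  # True
```

Minimising this objective is what projects the waypoints into a separable latent space for the clustering head.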
arXiv Detail & Related papers (2022-06-23T11:21:04Z)
- Polyline Based Generative Navigable Space Segmentation for Autonomous Visual Navigation [57.3062528453841]
We propose a representation-learning-based framework to enable robots to learn the navigable space segmentation in an unsupervised manner.
We show that the proposed PSV-Nets can learn the visual navigable space with high accuracy, even without a single label.
arXiv Detail & Related papers (2021-10-29T19:50:48Z)
- Self-Improving Semantic Perception on a Construction Robot [6.823936426747797]
We propose a framework in which semantic models are continuously updated on the robot to adapt to the deployment environments.
Our system therefore tightly couples multi-sensor perception and localisation to continuously learn from self-supervised pseudo labels.
arXiv Detail & Related papers (2021-05-04T16:06:12Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for either the trajectory or the pose forecasting task.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Online Domain Adaptation for Occupancy Mapping [28.081328051535618]
We propose a theoretical framework building upon the theory of optimal transport to adapt model parameters to account for changes in the environment.
With the use of high-fidelity driving simulators and real-world datasets, we demonstrate how parameters of 2D and 3D occupancy maps can be automatically adapted to accord with local spatial changes.
arXiv Detail & Related papers (2020-07-01T00:46:51Z)
- Understanding Self-Training for Gradual Domain Adaptation [107.37869221297687]
We consider gradual domain adaptation, where the goal is to adapt an initial classifier trained on a source domain given only unlabeled data that shifts gradually in distribution towards a target domain.
We prove the first non-vacuous upper bound on the error of self-training with gradual shifts, under settings where directly adapting to the target domain can result in unbounded error.
The theoretical analysis leads to algorithmic insights, highlighting that regularization and label sharpening are essential even when we have infinite data, and suggesting that self-training works particularly well for shifts with small Wasserstein-infinity distance.
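The gradual self-training procedure analysed above can be demonstrated with a minimal 1-D toy: a nearest-centroid classifier fit on the source domain is re-fit on its own pseudo-labels as the data distribution shifts in small steps. The classifier, data, and shift schedule are all illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def self_train_step(centroids, X):
    """One self-training step: pseudo-label X with the current
    nearest-centroid classifier, then refit the centroids."""
    pseudo = np.argmin(np.abs(X[:, None] - centroids[None, :]), axis=1)
    return np.array([X[pseudo == k].mean() for k in range(len(centroids))])

rng = np.random.default_rng(1)
centroids = np.array([-1.0, 1.0])        # classifier fit on the source domain
for shift in np.linspace(0.5, 4.0, 8):   # gradually shifting unlabeled data
    X = np.concatenate([rng.normal(-1 + shift, 0.3, 100),
                        rng.normal(1 + shift, 0.3, 100)])
    centroids = self_train_step(centroids, X)
print(centroids)  # tracks the shifted class means, ending near [3, 5]
```

Adapting directly to the final shift (4.0) from the source classifier would mislabel most points; taking small steps keeps the pseudo-label error low at every stage, which is the regime the paper's error bound covers.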
arXiv Detail & Related papers (2020-02-26T08:59:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.