Test-Time Adaptation for Keypoint-Based Spacecraft Pose Estimation Based on Predicted-View Synthesis
- URL: http://arxiv.org/abs/2410.04298v1
- Date: Sat, 5 Oct 2024 22:24:19 GMT
- Title: Test-Time Adaptation for Keypoint-Based Spacecraft Pose Estimation Based on Predicted-View Synthesis
- Authors: Juan Ignacio Bravo Pérez-Villar, Álvaro García-Martín, Jesús Bescós, Juan C. SanMiguel
- Abstract summary: Supervised algorithms for spacecraft pose estimation experience a drop in performance when trained on synthetic data and applied to real operational data.
We propose a test-time adaptation approach that leverages the temporal redundancy between images acquired during close proximity operations.
- Score: 9.273012275620527
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the difficulty of replicating the real conditions during training, supervised algorithms for spacecraft pose estimation experience a drop in performance when trained on synthetic data and applied to real operational data. To address this issue, we propose a test-time adaptation approach that leverages the temporal redundancy between images acquired during close proximity operations. Our approach involves extracting features from sequential spacecraft images, estimating their poses, and then using this information to synthesise a reconstructed view. We establish a self-supervised learning objective by comparing the synthesised view with the actual one. During training, we supervise both pose estimation and image synthesis, while at test-time, we optimise the self-supervised objective. Additionally, we introduce a regularisation loss to prevent solutions that are not consistent with the keypoint structure of the spacecraft. Our code is available at: https://github.com/JotaBravo/spacecraft-tta.
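The adaptation loop the abstract describes maps naturally onto a few lines of PyTorch. The sketch below is illustrative only, not the authors' implementation (see the linked repository for that): the module and function names (`PoseSynthesisNet`, `keypoint_structure_loss`, `adapt`), the toy architecture, and the choice of 11 keypoints are all assumptions. At test time, each pair of consecutive frames yields a synthesized view whose photometric error against the observed frame, plus a keypoint-structure regularizer, drives a gradient step.

```python
# Minimal sketch of test-time adaptation via predicted-view synthesis.
# All names and architectures here are hypothetical stand-ins; the
# authors' code is at https://github.com/JotaBravo/spacecraft-tta.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoseSynthesisNet(nn.Module):
    """Toy stand-in for the paper's network: predicts keypoint heatmaps on
    the current frame and synthesizes a reconstruction of it from the pair."""
    def __init__(self, n_keypoints=11):  # 11 keypoints is an assumption
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.keypoint_head = nn.Conv2d(32, n_keypoints, 1)
        self.synthesis_head = nn.Conv2d(64, 1, 3, padding=1)

    def forward(self, frame_prev, frame_curr):
        f_prev, f_curr = self.encoder(frame_prev), self.encoder(frame_curr)
        heatmaps = self.keypoint_head(f_curr)
        synth = torch.sigmoid(self.synthesis_head(torch.cat([f_prev, f_curr], 1)))
        return heatmaps, synth

def keypoint_structure_loss(heatmaps, ref_dists):
    """Soft stand-in for the structural regularizer: keep pairwise distances
    between soft-argmax keypoint locations close to the known structure."""
    b, k, h, w = heatmaps.shape
    probs = heatmaps.flatten(2).softmax(-1).view(b, k, h, w)
    ys = probs.sum(3) @ torch.arange(h, dtype=heatmaps.dtype)  # expected row
    xs = probs.sum(2) @ torch.arange(w, dtype=heatmaps.dtype)  # expected col
    coords = torch.stack([xs, ys], -1)                         # (b, k, 2)
    return F.l1_loss(torch.cdist(coords, coords), ref_dists.expand(b, k, k))

def adapt(model, frames, ref_dists, lr=1e-5, reg_weight=0.1):
    """One pass of test-time adaptation over a sequence of frames."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for frame_prev, frame_curr in zip(frames[:-1], frames[1:]):
        heatmaps, synth = model(frame_prev, frame_curr)
        loss = F.l1_loss(synth, frame_curr)  # self-supervised photometric term
        loss = loss + reg_weight * keypoint_structure_loss(heatmaps, ref_dists)
        opt.zero_grad(); loss.backward(); opt.step()
    return model
```

In a real run, `model` would be initialized from the supervised training stage on synthetic data and fed the incoming operational image stream, with `ref_dists` encoding the known pairwise geometry of the spacecraft's keypoints.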
Related papers
- Enabling Robust, Real-Time Verification of Vision-Based Navigation through View Synthesis [0.0]
VISY-REVE is a novel pipeline to validate image processing algorithms for Vision-Based Navigation.
We propose augmenting image datasets in real-time with synthesized views at novel poses.
arXiv Detail & Related papers (2025-07-01T19:47:04Z)
- Drive-1-to-3: Enriching Diffusion Priors for Novel View Synthesis of Real Vehicles [81.29018359825872]
This paper consolidates a set of good practices to finetune large pretrained models for a real-world task.
Specifically, we develop several strategies to account for discrepancies between the synthetic data and real driving data.
Our insights lead to effective finetuning that results in a 68.8% reduction in FID for novel view synthesis over prior art.
arXiv Detail & Related papers (2024-12-19T03:39:13Z)
- Bench2Drive-R: Turning Real World Data into Reactive Closed-Loop Autonomous Driving Benchmark by Generative Model [63.336123527432136]
We introduce Bench2Drive-R, a generative framework that enables reactive closed-loop evaluation.
Unlike existing video generative models for autonomous driving, the proposed designs are tailored for interactive simulation.
We compare the generation quality of Bench2Drive-R with existing generative models and achieve state-of-the-art performance.
arXiv Detail & Related papers (2024-12-11T06:35:18Z)
- XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis [84.23233209017192]
This paper presents a novel driving view synthesis dataset and benchmark specifically designed for autonomous driving simulations.
The dataset is unique as it includes testing images captured by deviating from the training trajectory by 1-4 meters.
We establish the first realistic benchmark for evaluating existing NVS approaches under front-only and multi-camera settings.
arXiv Detail & Related papers (2024-06-26T14:00:21Z)
- Automatic UAV-based Airport Pavement Inspection Using Mixed Real and Virtual Scenarios [3.0874677990361246]
We propose a vision-based approach to automatically identify pavement distress using images captured by UAVs.
The proposed method is based on Deep Learning (DL) to segment defects in the image.
We demonstrate that a mixed dataset composed of synthetic and real training images yields better results when the trained models are tested in real application scenarios.
arXiv Detail & Related papers (2024-01-11T16:30:07Z)
- A Survey on Deep Learning-Based Monocular Spacecraft Pose Estimation: Current State, Limitations and Prospects [7.08026800833095]
Estimating the pose of an uncooperative spacecraft is an important computer vision problem for enabling vision-based systems in orbit.
Following the general trend in computer vision, more and more works have been focusing on leveraging Deep Learning (DL) methods to address this problem.
Despite promising research-stage results, major challenges preventing the use of such methods in real-life missions still stand in the way.
arXiv Detail & Related papers (2023-05-12T09:52:53Z)
- TexPose: Neural Texture Learning for Self-Supervised 6D Object Pose Estimation [55.94900327396771]
We introduce neural texture learning for 6D object pose estimation from synthetic data.
We learn to predict realistic textures of objects from real image collections.
We learn pose estimation from pixel-perfect synthetic data.
arXiv Detail & Related papers (2022-12-25T13:36:32Z)
- Learning-by-Novel-View-Synthesis for Full-Face Appearance-based 3D Gaze Estimation [8.929311633814411]
This work examines a novel approach for synthesizing gaze estimation training data based on monocular 3D face reconstruction.
Unlike prior works using multi-view reconstruction, photo-realistic CG models, or generative neural networks, our approach can manipulate and extend the head pose range of existing training data.
arXiv Detail & Related papers (2022-01-20T00:29:45Z)
- Space Non-cooperative Object Active Tracking with Deep Reinforcement Learning [1.212848031108815]
We propose an end-to-end active visual tracking method based on the DQN algorithm, named DRLAVT.
It can guide a chaser spacecraft to approach an arbitrary non-cooperative space target relying only on color or RGB-D images.
It significantly outperforms a position-based visual servoing baseline that adopts the state-of-the-art 2D monocular tracker SiamRPN.
arXiv Detail & Related papers (2021-12-18T06:12:24Z)
- Task2Sim: Towards Effective Pre-training and Transfer from Synthetic Data [74.66568380558172]
We study the transferability of pre-trained models based on synthetic data generated by graphics simulators to downstream tasks.
We introduce Task2Sim, a unified model mapping downstream task representations to optimal simulation parameters.
It learns this mapping by training to find the set of best parameters on a set of "seen" tasks.
Once trained, it can then be used to predict best simulation parameters for novel "unseen" tasks in one shot.
arXiv Detail & Related papers (2021-11-30T19:25:27Z)
- Learning Co-segmentation by Segment Swapping for Retrieval and Discovery [67.6609943904996]
The goal of this work is to efficiently identify visually similar patterns from a pair of images.
We generate synthetic training pairs by selecting object segments in an image and copy-pasting them into another image.
We show our approach provides clear improvements for artwork details retrieval on the Brueghel dataset.
arXiv Detail & Related papers (2021-10-29T16:51:16Z)
- Self-Supervision by Prediction for Object Discovery in Videos [62.87145010885044]
In this paper, we use the prediction task as self-supervision and build a novel object-centric model for image sequence representation.
Our framework can be trained without the help of any manual annotation or pretrained network.
Initial experiments confirm that the proposed pipeline is a promising step towards object-centric video prediction.
arXiv Detail & Related papers (2021-03-09T19:14:33Z)
- Domain-invariant Similarity Activation Map Contrastive Learning for Retrieval-based Long-term Visual Localization [30.203072945001136]
In this work, a general architecture is first formulated probabilistically to extract domain-invariant features through multi-domain image translation.
A novel gradient-weighted similarity activation mapping loss (Grad-SAM) is then incorporated for finer localization with high accuracy.
Extensive experiments have been conducted to validate the effectiveness of the proposed approach on the CMUSeasons dataset.
Our method is on par with or even outperforms state-of-the-art image-based localization baselines at medium and high precision.
arXiv Detail & Related papers (2020-09-16T14:43:22Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy.
arXiv Detail & Related papers (2020-04-28T12:03:14Z)