Robust Multi-Task Learning and Online Refinement for Spacecraft Pose
Estimation across Domain Gap
- URL: http://arxiv.org/abs/2203.04275v6
- Date: Thu, 17 Aug 2023 22:45:06 GMT
- Title: Robust Multi-Task Learning and Online Refinement for Spacecraft Pose
Estimation across Domain Gap
- Authors: Tae Ha Park and Simone D'Amico
- Abstract summary: Spacecraft Pose Network v2 (SPNv2) is a Convolutional Neural Network (CNN) for pose estimation of noncooperative spacecraft across domain gap.
Online Domain Refinement (ODR) refines the parameters of the normalization layers of SPNv2 on the target domain images online at deployment.
- Score: 4.8951183832371
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work presents Spacecraft Pose Network v2 (SPNv2), a Convolutional Neural
Network (CNN) for pose estimation of noncooperative spacecraft across domain
gap. SPNv2 is a multi-scale, multi-task CNN which consists of a shared
multi-scale feature encoder and multiple prediction heads that perform
different tasks on a shared feature output. These tasks are all related to
detection and pose estimation of a target spacecraft from an image, such as
prediction of pre-defined satellite keypoints, direct pose regression, and
binary segmentation of the satellite foreground. It is shown that by jointly
training on different yet related tasks with extensive data augmentations on
synthetic images only, the shared encoder learns features that are common
across image domains that have fundamentally different visual characteristics
compared to synthetic images. This work also introduces Online Domain
Refinement (ODR) which refines the parameters of the normalization layers of
SPNv2 on the target domain images online at deployment. Specifically, ODR
performs self-supervised entropy minimization of the predicted satellite
foreground, thereby improving the CNN's performance on the target domain images
without their pose labels and with minimal computational efforts. The GitHub
repository for SPNv2 is available at https://github.com/tpark94/spnv2.
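As a rough illustration of the quantity ODR is described as minimizing, the sketch below computes the mean per-pixel binary entropy of a predicted foreground-probability mask. It is not taken from the SPNv2 repository: the function names and the toy masks are hypothetical, and it uses only the Python standard library. In the method described above, a loss of this kind would be used to update only the normalization-layer parameters, which this sketch does not attempt.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a Bernoulli foreground probability p."""
    # Clamp to (0, 1) so that log(0) is never evaluated.
    p = min(max(p, eps), 1.0 - eps)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mean_foreground_entropy(prob_mask):
    """Average per-pixel entropy of a 2D mask of foreground probabilities."""
    pixels = [p for row in prob_mask for p in row]
    return sum(binary_entropy(p) for p in pixels) / len(pixels)


# Hypothetical 2x2 masks: one undecided, one confidently segmented.
uncertain_mask = [[0.5, 0.4], [0.6, 0.5]]
confident_mask = [[0.99, 0.01], [0.98, 0.02]]

print(mean_foreground_entropy(uncertain_mask))
print(mean_foreground_entropy(confident_mask))
```

A confident mask (probabilities near 0 or 1) yields a lower value than an undecided one, so driving this quantity down pushes the network toward sharper foreground predictions without needing pose labels.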
Related papers
- Bridging Domain Gap for Flight-Ready Spaceborne Vision [4.14360329494344]
This work presents Spacecraft Pose Network v3 (SPNv3), a Neural Network (NN) for monocular pose estimation of a known, non-cooperative target spacecraft.
SPNv3 is designed and trained to be computationally efficient while providing robustness to spaceborne images that have not been observed during offline training and validation on the ground.
Experiments demonstrate that the final SPNv3 can achieve state-of-the-art pose accuracy on hardware-in-the-loop images from a robotic testbed while having trained exclusively on computer-generated synthetic images.
arXiv Detail & Related papers (2024-09-18T02:56:50Z)
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a $2-3\,\%$ higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z)
- ComPtr: Towards Diverse Bi-source Dense Prediction Tasks via A Simple yet General Complementary Transformer [91.43066633305662]
We propose a novel ComPlementary transformer, ComPtr, for diverse bi-source dense prediction tasks.
ComPtr treats different inputs equally and builds an efficient dense interaction model in the form of sequence-to-sequence on top of the transformer.
arXiv Detail & Related papers (2023-07-23T15:17:45Z)
- Scale Attention for Learning Deep Face Representation: A Study Against Visual Scale Variation [69.45176408639483]
We reform the conv layer by resorting to the scale-space theory.
We build a novel style named SCale AttentioN Conv Neural Network (SCAN-CNN).
As a single-shot scheme, the inference is more efficient than multi-shot fusion.
arXiv Detail & Related papers (2022-09-19T06:35:04Z)
- Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation [133.02190910009384]
We propose a two-stream graph convolutional network (i.e., TSGCN) to handle inter-view confusion between different raw attributes.
Our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation.
arXiv Detail & Related papers (2022-04-19T10:41:09Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling [79.15521784128102]
We introduce a novel neural network for building image generators (decoders) and apply it to variational autoencoders (VAEs).
In our spatial dependency networks (SDNs), feature maps at each level of a deep neural net are computed in a spatially coherent way.
We show that augmenting the decoder of a hierarchical VAE by spatial dependency layers considerably improves density estimation.
arXiv Detail & Related papers (2021-03-16T07:01:08Z)
- MACU-Net for Semantic Segmentation of Fine-Resolution Remotely Sensed Images [11.047174552053626]
MACU-Net is a multi-scale skip connected and asymmetric-convolution-based U-Net for fine-resolution remotely sensed images.
Our design has the following advantages: (1) The multi-scale skip connections combine and realign semantic features contained in both low-level and high-level feature maps; (2) the asymmetric convolution block strengthens the feature representation and feature extraction capability of a standard convolution layer.
Experiments conducted on two remotely sensed datasets demonstrate that the proposed MACU-Net transcends the U-Net, U-NetPPL, U-Net 3+, amongst other benchmark approaches.
arXiv Detail & Related papers (2020-07-26T08:56:47Z)
- When CNNs Meet Random RNNs: Towards Multi-Level Analysis for RGB-D Object and Scene Recognition [10.796613905980609]
We propose a novel framework that extracts discriminative feature representations from multi-modal RGB-D images for object and scene recognition tasks.
To cope with the high dimensionality of CNN activations, a random weighted pooling scheme has been proposed.
Experiments verify that the fully randomized structure in the RNN stage successfully encodes CNN activations into discriminative features.
arXiv Detail & Related papers (2020-04-26T10:58:27Z)
- Multi-Scale Representation Learning for Spatial Feature Distributions using Grid Cells [11.071527762096053]
We propose a representation learning model called Space2Vec to encode the absolute positions and spatial relationships of places.
Results show that because of its multi-scale representations, Space2Vec outperforms well-established ML approaches.
arXiv Detail & Related papers (2020-02-16T04:22:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.