Self-Supervised Learning for Place Representation Generalization across Appearance Changes
- URL: http://arxiv.org/abs/2303.02370v3
- Date: Thu, 21 Dec 2023 13:03:05 GMT
- Title: Self-Supervised Learning for Place Representation Generalization across Appearance Changes
- Authors: Mohamed Adel Musallam, Vincent Gaudillière, Djamila Aouada
- Abstract summary: We investigate learning features that are robust to appearance modifications while sensitive to geometric transformations in a self-supervised manner.
Our results reveal that jointly learning appearance-robust and geometry-sensitive image descriptors leads to competitive visual place recognition results.
- Score: 11.030196234282675
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Visual place recognition is key to unlocking spatial navigation for
animals, humans and robots. State-of-the-art approaches are trained in a
supervised manner and therefore hardly capture the information needed to
generalize to unusual conditions; we argue that self-supervised learning can
help abstract the place representation so that it holds irrespective of the
conditions. More precisely, in this paper we investigate learning features
that are robust to appearance modifications while sensitive to geometric
transformations, in a self-supervised manner. This dual-purpose training is
made possible by combining the two main self-supervision paradigms, i.e.
contrastive and predictive learning. Our results on standard benchmarks reveal
that jointly learning such appearance-robust and geometry-sensitive image
descriptors leads to competitive visual place recognition across adverse
seasonal and illumination conditions, without requiring any human-annotated
labels.
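The abstract describes combining a contrastive objective (invariance to appearance changes) with a predictive objective (sensitivity to geometric transformations). Below is a minimal NumPy sketch of how such a joint loss could look. The function names, the choice of rotation prediction as the geometric pretext task, and the weighting `lam` are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import numpy as np

def info_nce(z_a, z_b, tau=0.1):
    """Contrastive (InfoNCE) loss: embeddings of two appearance-augmented
    views of the same place (row i of z_a and z_b) should match."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / tau                        # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))               # positives on the diagonal

def rotation_prediction_loss(rot_logits, rot_labels):
    """Predictive loss: cross-entropy for classifying which geometric
    transform (here, one of 4 rotations) was applied to the input."""
    logits = rot_logits - rot_logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(log_probs[np.arange(len(rot_labels)), rot_labels])

def joint_loss(z_a, z_b, rot_logits, rot_labels, lam=1.0):
    """Dual-purpose objective: appearance-robust (contrastive) plus
    geometry-sensitive (predictive), weighted by lam."""
    return info_nce(z_a, z_b) + lam * rotation_prediction_loss(rot_logits, rot_labels)
```

In a real training loop, `z_a`/`z_b` would come from an encoder applied to two appearance augmentations (e.g. color jitter) of each image, while `rot_logits` would come from a prediction head applied to geometrically transformed inputs.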
Related papers
- Emotic Masked Autoencoder with Attention Fusion for Facial Expression Recognition [1.4374467687356276]
This paper presents an innovative approach integrating the MAE-Face self-supervised learning (SSL) method and multi-view Fusion Attention mechanism for expression classification.
We suggest easy-to-implement and no-training frameworks aimed at highlighting key facial features to determine if such features can serve as guides for the model.
The efficacy of this method is validated by improvements in model performance on the Aff-wild2 dataset.
arXiv Detail & Related papers (2024-03-19T16:21:47Z)
- Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models [64.24227572048075]
We propose a Knowledge-Aware Prompt Tuning (KAPT) framework for vision-language models.
Our approach takes inspiration from human intelligence in which external knowledge is usually incorporated into recognizing novel categories of objects.
arXiv Detail & Related papers (2023-08-22T04:24:45Z)
- A Symbolic Representation of Human Posture for Interpretable Learning and Reasoning [2.678461526933908]
We introduce a qualitative spatial reasoning approach that describes the human posture in terms that are more familiar to people.
This paper explores the derivation of our symbolic representation at two levels of detail and its preliminary use as features for interpretable activity recognition.
arXiv Detail & Related papers (2022-10-17T12:22:13Z)
- CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO presents an improvement in facial expression recognition performance over six different datasets with very unique affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z)
- Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams [64.82800502603138]
This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream.
The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations.
Our experiments leverage 3D virtual environments and they show that the proposed agents can learn to distinguish objects just by observing the video stream.
arXiv Detail & Related papers (2022-04-26T09:52:31Z)
- Vision-Based Manipulators Need to Also See from Their Hands [58.398637422321976]
We study how the choice of visual perspective affects learning and generalization in the context of physical manipulation from raw sensor observations.
We find that a hand-centric (eye-in-hand) perspective affords reduced observability, but it consistently improves training efficiency and out-of-distribution generalization.
arXiv Detail & Related papers (2022-03-15T18:46:18Z)
- Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss [28.17607283348278]
Existing approaches for unsupervised metric learning focus on exploring self-supervision information within the input image itself.
We observe that, when analyzing images, human eyes often compare images against each other instead of examining images individually.
We develop a new approach to unsupervised deep metric learning where the network is learned based on self-supervision information across images.
arXiv Detail & Related papers (2020-08-10T19:33:47Z)
- Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation [88.03260155937407]
We formulate an encoder-decoder-like network to extract shape and appearance features from input images at the same time.
We train the parameters by three losses: feature adversarial loss, color consistency loss and reconstruction loss.
Experimental results on DeepFashion and Market1501 demonstrate that the proposed method achieves clean disentanglement.
arXiv Detail & Related papers (2020-07-26T10:56:37Z)
- Self-Supervised Learning Across Domains [33.86614301708017]
We propose to apply a similar approach to the problem of object recognition across domains.
Our model learns the semantic labels in a supervised fashion, and broadens its understanding of the data by learning from self-supervised signals on the same images.
This secondary task helps the network to focus on object shapes, learning concepts like spatial orientation and part correlation, while acting as a regularizer for the classification task.
arXiv Detail & Related papers (2020-07-24T06:19:53Z)
- Unsupervised Landmark Learning from Unpaired Data [117.81440795184587]
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
We propose a cross-image cycle consistency framework which applies the swapping-reconstruction strategy twice to obtain the final supervision.
Our proposed framework is shown to outperform strong baselines by a large margin.
arXiv Detail & Related papers (2020-06-29T13:57:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information shown and is not responsible for any consequences of its use.