Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks
- URL: http://arxiv.org/abs/2512.04970v1
- Date: Thu, 04 Dec 2025 16:38:26 GMT
- Title: Stable Single-Pixel Contrastive Learning for Semantic and Geometric Tasks
- Authors: Leonid Pogorelyuk, Niels Bracher, Aaron Verkleeren, Lars Kühmichel, Stefan T. Radev
- Abstract summary: Our approach maps each pixel of an image to an overcomplete descriptor that is both view-invariant and semantically meaningful. It enables precise point-correspondence across images without requiring momentum-based teacher-student training.
- Score: 2.5178202810957235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We pilot a family of stable contrastive losses for learning pixel-level representations that jointly capture semantic and geometric information. Our approach maps each pixel of an image to an overcomplete descriptor that is both view-invariant and semantically meaningful. It enables precise point-correspondence across images without requiring momentum-based teacher-student training. Two experiments in synthetic 2D and 3D environments demonstrate the properties of our loss and the resulting overcomplete representations.
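The pixel-level contrastive objective described in the abstract can be illustrated with a minimal InfoNCE-style sketch: each of N corresponding pixels in two views yields a descriptor, matched descriptors are pulled together, and all other pairings serve as negatives. This is a hedged illustration of the general technique, not the paper's actual loss; the function name and the temperature value are assumptions made for the example.

```python
import numpy as np

def pixel_info_nce(desc_a, desc_b, temperature=0.1):
    """InfoNCE-style contrastive loss over per-pixel descriptors.

    desc_a, desc_b: (N, D) arrays of descriptors for N corresponding
    pixels in two views; row i of desc_a matches row i of desc_b.
    """
    # L2-normalize so dot products are cosine similarities
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # positives lie on the diagonal: pixel i in view A matches pixel i in view B
    return -np.mean(np.diag(log_prob))
```

Correctly matched descriptors drive the loss toward zero, while a permuted correspondence (wrong positives) raises it; the paper's contribution concerns making such pixel-level losses stable without a momentum teacher, which this sketch does not attempt to reproduce.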
Related papers
- Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models [55.99654128127689]
Visual Foundation Models (VFMs) are used to generate semantic labels for weakly-supervised pixel-to-point contrastive distillation. We adapt the sampling probabilities of points to address imbalances in spatial distribution and category frequency. Our approach consistently surpasses existing image-to-LiDAR contrastive distillation methods on downstream tasks.
arXiv Detail & Related papers (2024-05-23T07:48:19Z)
- Doppelgangers: Learning to Disambiguate Images of Similar Structures [76.61267007774089]
Illusory image matches can be challenging for humans to differentiate, and can lead 3D reconstruction algorithms to produce erroneous results.
We propose a learning-based approach to visual disambiguation, formulating it as a binary classification task on image pairs.
Our evaluation shows that our method can distinguish illusory matches in difficult cases, and can be integrated into SfM pipelines to produce correct, disambiguated 3D reconstructions.
arXiv Detail & Related papers (2023-09-05T17:50:36Z)
- Generative Image Inpainting with Segmentation Confusion Adversarial Training and Contrastive Learning [14.358417509144523]
We present a new adversarial training framework for image inpainting with segmentation confusion adversarial training (SCAT) and contrastive learning.
SCAT plays an adversarial game between an inpainting generator and a segmentation network, which provides pixel-level local training signals.
We conduct extensive experiments on two benchmark datasets, demonstrating our model's effectiveness and superiority both qualitatively and quantitatively.
arXiv Detail & Related papers (2023-03-23T09:34:17Z)
- Self-Supervised Image-to-Point Distillation via Semantically Tolerant Contrastive Loss [18.485918870427327]
We propose a novel semantically tolerant image-to-point contrastive loss that takes into consideration the semantic distance between positive and negative image regions.
Our method consistently outperforms state-of-the-art 2D-to-3D representation learning frameworks across a wide range of 2D self-supervised pretrained models.
arXiv Detail & Related papers (2023-01-12T19:58:54Z)
- Self-Supervised Image Representation Learning with Geometric Set Consistency [50.12720780102395]
We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency.
Specifically, we introduce 3D geometric consistency into a contrastive learning framework to enforce feature consistency across image views.
arXiv Detail & Related papers (2022-03-29T08:57:33Z)
- Warp Consistency for Unsupervised Learning of Dense Correspondences [116.56251250853488]
A key challenge in learning dense correspondences is the lack of ground-truth matches for real image pairs.
We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression.
Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS.
arXiv Detail & Related papers (2021-04-07T17:58:22Z)
- Joint Deep Multi-Graph Matching and 3D Geometry Learning from Inhomogeneous 2D Image Collections [57.60094385551773]
We propose a trainable framework for learning a deformable 3D geometry model from inhomogeneous image collections.
In addition, we obtain the underlying 3D geometry of the objects depicted in the 2D images.
arXiv Detail & Related papers (2021-03-31T17:25:36Z)
- Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations [92.89846887298852]
We present a framework to translate between 2D image views and 3D object shapes.
We propose SIST, a Self-supervised Image to Shape Translation framework.
arXiv Detail & Related papers (2020-03-22T22:44:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.