Self-Supervised Keypoint Detection with Distilled Depth Keypoint Representation
- URL: http://arxiv.org/abs/2410.14700v1
- Date: Fri, 04 Oct 2024 22:14:08 GMT
- Title: Self-Supervised Keypoint Detection with Distilled Depth Keypoint Representation
- Authors: Aman Anand, Elyas Rashno, Amir Eskandari, Farhana Zulkernine
- Abstract summary: Distill-DKP is a novel cross-modal knowledge distillation framework for keypoint detection in a self-supervised setting.
During training, Distill-DKP extracts embedding-level knowledge from a depth-based teacher model to guide an image-based student model.
Experiments show that Distill-DKP significantly outperforms previous unsupervised methods.
- Score: 0.8136541584281987
- Abstract: Existing unsupervised keypoint detection methods apply artificial deformations to images, such as masking a significant portion of the image, and use reconstruction of the original image as the learning objective for detecting keypoints. However, this approach lacks depth information and often detects keypoints on the background. To address this, we propose Distill-DKP, a novel cross-modal knowledge distillation framework that leverages depth maps and RGB images for keypoint detection in a self-supervised setting. During training, Distill-DKP extracts embedding-level knowledge from a depth-based teacher model to guide an image-based student model, with inference restricted to the student. Experiments show that Distill-DKP significantly outperforms previous unsupervised methods, reducing mean L2 error by 47.15% on Human3.6M, reducing mean average error by 5.67% on Taichi, and improving keypoint accuracy by 1.3% on the DeepFashion dataset. Detailed ablation studies demonstrate the sensitivity of knowledge distillation across different layers of the network. Project Page: https://23wm13.github.io/distill-dkp/
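The training setup described in the abstract (a frozen depth-based teacher guiding an RGB-based student by aligning their embeddings, with only the student used at inference) can be sketched as follows. This is a minimal, hypothetical PyTorch-style sketch, not the authors' implementation: the encoder architecture, the cosine-based distillation loss, and names such as `Encoder` and `embedding_distill_loss` are illustrative assumptions.

```python
# Minimal sketch of embedding-level cross-modal distillation (assumed details,
# not the authors' implementation): a frozen depth-based teacher guides an
# RGB-based student by aligning their embeddings; only the student is used
# at inference time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy convolutional encoder; in_channels=1 for depth maps, 3 for RGB."""
    def __init__(self, in_channels, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, x):
        return self.net(x)

def embedding_distill_loss(student_emb, teacher_emb):
    # Align student embeddings with the detached teacher embeddings via
    # cosine distance; the paper's exact loss may differ.
    return 1.0 - F.cosine_similarity(student_emb, teacher_emb.detach(), dim=-1).mean()

# Hypothetical training step: the depth teacher is frozen, and the RGB student
# is trained with the distillation term (plus its own self-supervised
# keypoint objective, omitted here).
teacher = Encoder(in_channels=1)   # consumes depth maps
student = Encoder(in_channels=3)   # consumes RGB images
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
rgb = torch.randn(4, 3, 128, 128)     # dummy RGB batch
depth = torch.randn(4, 1, 128, 128)   # dummy depth batch

loss = embedding_distill_loss(student(rgb), teacher(depth))
loss.backward()
optimizer.step()
```

At inference, only `student` would be run on RGB images, matching the abstract's note that inference is restricted to the student model.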
Related papers
- Knowledge Distillation for 6D Pose Estimation by Keypoint Distribution Alignment [77.70208382044355]
We introduce the first knowledge distillation method for 6D pose estimation.
We observe that the compact student network struggles to predict precise 2D keypoint locations.
Our experiments on several benchmarks show that our distillation method yields state-of-the-art results.
arXiv Detail & Related papers (2022-05-30T10:17:17Z) - CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z) - CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z) - Learning to Recognize Patch-Wise Consistency for Deepfake Detection [39.186451993950044]
We propose a representation learning approach for this task, called patch-wise consistency learning (PCL).
PCL learns by measuring the consistency of image source features, resulting in representations with good interpretability and robustness to multiple forgery methods.
We evaluate our approach on seven popular Deepfake detection datasets.
arXiv Detail & Related papers (2020-12-16T23:06:56Z) - Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration [77.1056200937214]
We study the formation of the dual-pixel (DP) pair, which links blur and depth information.
We propose an end-to-end DDDNet (DP-based Depth and Deblur Network) to jointly estimate the depth and restore the image.
arXiv Detail & Related papers (2020-12-01T06:53:57Z) - Learning a Geometric Representation for Data-Efficient Depth Estimation via Gradient Field and Contrastive Loss [29.798579906253696]
We propose a gradient-based self-supervised learning algorithm with a momentum contrastive loss to help ConvNets extract geometric information from unlabeled images.
Our method outperforms previous state-of-the-art self-supervised learning algorithms and roughly triples the efficiency of labeled data.
arXiv Detail & Related papers (2020-11-06T06:47:19Z) - Towards Keypoint Guided Self-Supervised Depth Estimation [0.0]
We use keypoints as a self-supervision cue for learning depth map estimation from a collection of input images.
By training a deep model with and without the keypoint extraction technique, we show that using keypoints improves depth estimation learning.
arXiv Detail & Related papers (2020-11-05T20:45:03Z) - Cascade Network for Self-Supervised Monocular Depth Estimation [0.07161783472741746]
We propose a new self-supervised learning method based on cascade networks.
Compared with the previous self-supervised methods, our method has improved accuracy and reliability.
We present a cascaded neural network that divides the target scene into regions at different sight distances and trains them separately to generate a better depth map.
arXiv Detail & Related papers (2020-09-14T06:50:05Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z) - Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, instead of relying on different views, we rely on depth-from-defocus cues.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.