Unsupervised Skin Feature Tracking with Deep Neural Networks
- URL: http://arxiv.org/abs/2405.04943v1
- Date: Wed, 8 May 2024 10:27:05 GMT
- Title: Unsupervised Skin Feature Tracking with Deep Neural Networks
- Authors: Jose Chang, Torbjörn E. M. Nordling,
- Abstract summary: Deep convolutional neural networks have shown remarkable accuracy in tracking tasks.
Our pipeline employs a convolutional stacked autoencoder to match image crops with a reference crop containing the target feature.
Our unsupervised learning approach excels in tracking various skin features under significant motion conditions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial feature tracking is essential in imaging ballistocardiography for accurate heart rate estimation and enables motor degradation quantification in Parkinson's disease through skin feature tracking. While deep convolutional neural networks have shown remarkable accuracy in tracking tasks, they typically require extensive labeled data for supervised training. Our proposed pipeline employs a convolutional stacked autoencoder to match image crops with a reference crop containing the target feature, learning deep feature encodings specific to the object category in an unsupervised manner, thus reducing data requirements. To overcome edge effects making the performance dependent on crop size, we introduced a Gaussian weight on the residual errors of the pixels when calculating the loss function. Training the autoencoder on facial images and validating its performance on manually labeled face and hand videos, our Deep Feature Encodings (DFE) method demonstrated superior tracking accuracy with a mean error ranging from 0.6 to 3.3 pixels, outperforming traditional methods like SIFT, SURF, Lucas Kanade, and the latest transformers like PIPs++ and CoTracker. Overall, our unsupervised learning approach excels in tracking various skin features under significant motion conditions, providing superior feature descriptors for tracking, matching, and image registration compared to both traditional and state-of-the-art supervised learning methods.
Related papers
- Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models.
In this paper, we investigate how detection performance varies across model backbones, types, and datasets.
We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z) - Classification and regression of trajectories rendered as images via 2D Convolutional Neural Networks [0.0]
Recent advances in computer vision have facilitated the processing of trajectories rendered as images via artificial neural networks with 2d convolutional layers (CNNs)
In this study, we investigate the effectiveness of CNNs for solving classification and regression problems from synthetic trajectories rendered as images using different modalities.
Results highlight the importance of choosing an appropriate image resolution according to model depth and motion history in applications where movement direction is critical.
arXiv Detail & Related papers (2024-09-27T15:27:04Z) - Leveraging Neural Radiance Fields for Uncertainty-Aware Visual
Localization [56.95046107046027]
We propose to leverage Neural Radiance Fields (NeRF) to generate training samples for scene coordinate regression.
Despite NeRF's efficiency in rendering, many of the rendered data are polluted by artifacts or only contain minimal information gain.
arXiv Detail & Related papers (2023-10-10T20:11:13Z) - NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z) - Deep Convolutional Pooling Transformer for Deepfake Detection [54.10864860009834]
We propose a deep convolutional Transformer to incorporate decisive image features both locally and globally.
Specifically, we apply convolutional pooling and re-attention to enrich the extracted features and enhance efficacy.
The proposed solution consistently outperforms several state-of-the-art baselines on both within- and cross-dataset experiments.
arXiv Detail & Related papers (2022-09-12T15:05:41Z) - A Bayesian Detect to Track System for Robust Visual Object Tracking and
Semi-Supervised Model Learning [1.7268829007643391]
We ad-dress problems in a Bayesian tracking and detection framework parameterized by neural network outputs.
We propose a particle filter-based approximate sampling algorithm for tracking object state estimation.
Based on our particle filter inference algorithm, a semi-supervised learn-ing algorithm is utilized for learning tracking network on intermittent labeled frames.
arXiv Detail & Related papers (2022-05-05T00:18:57Z) - Skin feature point tracking using deep feature encodings [0.0]
We propose a pipeline for feature tracking, that applies a convolutional stacked autoencoder to identify the most similar crop in an image to a reference crop containing the feature of interest.
We train the autoencoder on facial images and validate its ability to track skin features in general using manually labeled face and hand videos.
We conclude that our method creates better feature descriptors for feature tracking, feature matching, and image registration than the traditional algorithms.
arXiv Detail & Related papers (2021-12-28T14:29:08Z) - Coarse-to-Fine Object Tracking Using Deep Features and Correlation
Filters [2.3526458707956643]
This paper presents a novel deep learning tracking algorithm.
We exploit the generalization ability of deep features to coarsely estimate target translation.
Then, we capitalize on the discriminative power of correlation filters to precisely localize the tracked object.
arXiv Detail & Related papers (2020-12-23T16:43:21Z) - DAQ: Distribution-Aware Quantization for Deep Image Super-Resolution
Networks [49.191062785007006]
Quantizing deep convolutional neural networks for image super-resolution substantially reduces their computational costs.
Existing works either suffer from a severe performance drop in ultra-low precision of 4 or lower bit-widths, or require a heavy fine-tuning process to recover the performance.
We propose a novel distribution-aware quantization scheme (DAQ) which facilitates accurate training-free quantization in ultra-low precision.
arXiv Detail & Related papers (2020-12-21T10:19:42Z) - Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z) - Calibrating Deep Neural Networks using Focal Loss [77.92765139898906]
Miscalibration is a mismatch between a model's confidence and its correctness.
We show that focal loss allows us to learn models that are already very well calibrated.
We show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases.
arXiv Detail & Related papers (2020-02-21T17:35:50Z) - Deep Feature Consistent Variational Autoencoder [46.25741696270528]
We present a novel method for constructing Variational Autoencoder (VAE)
Instead of using pixel-by-pixel loss, we enforce deep feature consistency between the input and the output of a VAE.
We also show that our method can produce latent vectors that can capture the semantic information of face expressions.
arXiv Detail & Related papers (2016-10-02T15:48:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.